A Survey on HTML Structure Aware and Tree Based Web Data Scraping Technique

Provided by: Creative Commons Topic: Big Data Format: PDF
A vast amount of information is available on the web. Data analysis applications, such as extracting mutual fund information from a website or extracting a stock's daily opening and closing prices from a web page, involve web data extraction. Many researchers have made considerable efforts to automate the process of web data scraping. Many techniques depend on the structure of the web page, i.e., its HTML structure or DOM tree structure, to scrape data from it. In this paper, the authors present a survey of HTML-aware web scraping techniques.
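As a rough illustration of what relying on the HTML structure can look like, the following minimal sketch walks the tags of a small table and collects opening and closing prices per stock symbol. It is not taken from the surveyed paper: the sample markup, the values, and the names `SAMPLE_HTML`, `TableRowParser`, and `extract_quotes` are all hypothetical, and only Python's standard `html.parser` module is assumed.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the markup and values are invented for illustration.
SAMPLE_HTML = """
<table id="quotes">
  <tr><th>Symbol</th><th>Open</th><th>Close</th></tr>
  <tr><td>ABC</td><td>10.50</td><td>11.25</td></tr>
  <tr><td>XYZ</td><td>99.00</td><td>97.40</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by enclosing <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only text that sits inside a <td> is kept; header cells are skipped.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows with no <td> cells (the header) are dropped
                self.rows.append(self._row)
            self._row = None


def extract_quotes(html):
    """Return {symbol: (open_price, close_price)} from a three-column table."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return {sym: (float(o), float(c)) for sym, o, c in parser.rows}


if __name__ == "__main__":
    print(extract_quotes(SAMPLE_HTML))
```

The sketch depends entirely on the table keeping its row and cell layout, which is the usual trade-off of structure-dependent scraping: extraction is simple while the markup is stable and breaks when the page layout changes.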
