Unsupervised Structured Data Extraction from Template-generated Web Pages

Provided by: Journal of Universal Computer Science
Topic: Big Data
Format: PDF
In this paper, the authors study structured data extraction from template-generated Web pages, which contain most of the structured data on the Web. Once extracted, this data can be integrated and reused in a wide range of applications, such as price comparison portals, business intelligence tools, and various mashups. This potential encourages both industry and academia to seek automatic solutions. To tackle the problem of automatic structured Web data extraction, the authors present a new approach: structured data extraction based on clustering visually similar Web page elements.