The authors present the design of a system that assembles a table from a few example rows by harnessing the huge corpus of information-rich but unstructured lists on the web. They developed a fully unsupervised, end-to-end approach that, given the sample query rows, (1) retrieves HTML lists relevant to the query from a pre-indexed crawl of web lists, (2) segments the list records and maps the segments to the query schema using a statistical model, (3) consolidates the results from multiple lists into a unified table, and (4) presents the consolidated records to the user, ranked by their estimated membership in the target relation.
- Format: PDF
- Size: 1071.9 KB
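
The four stages named in the abstract (retrieve, segment, consolidate, rank) can be illustrated with a toy sketch. This is not the authors' method: the paper uses a statistical model for segmentation and schema mapping, whereas the sketch below assumes comma-delimited records, token overlap for retrieval, and a simple count of supporting lists as the ranking score. All function names are hypothetical.

```python
from collections import Counter


def tokens(text):
    """Lowercased word tokens of a string, with commas treated as spaces."""
    return set(text.lower().replace(",", " ").split())


def retrieve(query_rows, lists, min_overlap=1):
    """Keep lists that share at least min_overlap tokens with the query rows.

    Assumption: stands in for lookup in a pre-indexed crawl of web lists.
    """
    query_tokens = set()
    for row in query_rows:
        for cell in row:
            query_tokens |= tokens(cell)
    return [lst for lst in lists
            if len(query_tokens & tokens(" ".join(lst))) >= min_overlap]


def segment(record, n_columns):
    """Toy segmenter: split on commas and reject records of the wrong arity.

    Assumption: replaces the statistical segmentation model of the paper.
    """
    parts = [p.strip() for p in record.split(",")]
    return tuple(parts) if len(parts) == n_columns else None


def build_table(query_rows, lists):
    """Return candidate rows ranked by how many retrieved lists contain them."""
    n_columns = len(query_rows[0])
    support = Counter()
    for lst in retrieve(query_rows, lists):
        seen = set()
        for record in lst:
            row = segment(record, n_columns)
            if row is not None:
                # Lowercase so the same row from different lists merges.
                seen.add(tuple(cell.lower() for cell in row))
        support.update(seen)  # each list votes at most once per row
    # Highest support first; ties broken alphabetically for determinism.
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))
    return [row for row, _ in ranked]
```

Calling `build_table([("France", "Paris")], lists)` with `lists` given as lists of record strings returns the merged rows, best supported first. Counting supporting lists is only a crude proxy for the "estimated membership in the target relation" that the abstract describes.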