Recovering Semantics of Tables on the Web
The Web offers a corpus of over 100 million tables, but the meaning of each table is rarely explicit from the table itself. Header rows are present in only a few cases, and even when they are, the attribute names are typically useless. The authors describe a system that attempts to recover the semantics of tables by enriching each table with additional annotations. These annotations facilitate operations such as searching for tables and finding related tables. To recover the semantics of tables, the authors leverage a database of class labels and relationships automatically extracted from the Web. This database has very wide coverage but is also noisy.
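
To make the idea of annotating a table from a noisy database of class labels more concrete, the following is a minimal sketch and not the authors' method. It assumes a simple support-threshold rule: a column receives a class label if at least half of its cells carry that label in the database. The tiny `ISA_DB` dictionary, the `label_column` function, and the `min_fraction` parameter are all hypothetical names introduced only for illustration.

```python
from collections import Counter

# Hypothetical, hand-made database mapping a value to its class labels.
# A database extracted from the Web would be far larger and noisier;
# "band" and "musical" stand in for that noise here.
ISA_DB = {
    "paris": {"city", "capital"},
    "london": {"city", "capital"},
    "berlin": {"city", "capital", "band"},
    "chicago": {"city", "musical"},
}


def label_column(cells, isa_db, min_fraction=0.5):
    """Return (label, support) pairs for labels carried by at least
    min_fraction of the cells, ordered by support, then alphabetically.

    This threshold rule is an illustrative assumption, not the model
    described in the paper.
    """
    counts = Counter()
    for cell in cells:
        # Cells missing from the database contribute no labels.
        counts.update(isa_db.get(cell.strip().lower(), set()))
    threshold = min_fraction * len(cells)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(label, n) for label, n in ranked if n >= threshold]


if __name__ == "__main__":
    column = ["Paris", "London", "Berlin", "Chicago", "Atlantis"]
    print(label_column(column, ISA_DB))
```

In this toy column, the noisy labels each appear for only one cell and fall below the threshold, while labels shared by most cells survive. That is the intuition behind using a wide but noisy database: agreement across many cells in a column can outweigh individual extraction errors.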