Stop Word and Related Problems in Web Interface Integration

Date Added: Aug 2009
Format: PDF

The goal of recent research projects on integrating Web databases has been to enable uniform access to the large amount of data behind query interfaces. The tasks addressed include source discovery, query interface extraction, and schema matching. A number of other tasks are commonly ignored or assumed to be solved a priori, either manually or by some oracle. These include finding the set of stop words and handling occurrences of "semantic enrichment words" within labels. Both sub-problems directly affect how synonymy and hyponymy relationships between labels are determined.