Data Management

Truth Finding on the Deep Web: Is the Problem Solved?

Date Added: Feb 2013
Format: PDF

The amount of useful information available on the Web has been growing at a dramatic pace in recent years, and people rely more and more on the Web to fulfill their information needs. In this paper, the authors study the truthfulness of Deep Web data in two domains where they believed the data to be fairly clean and where data quality matters to people's lives: Stock and Flight. To their surprise, they observed a large amount of inconsistency in data from different sources, as well as some sources with quite low accuracy. They further applied state-of-the-art data fusion methods, which aim to resolve conflicts and find the truth, to these two data sets, analyzed the methods' strengths and limitations, and suggested promising research directions.
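
To make the idea of resolving conflicts between sources concrete, here is a minimal sketch of naive majority voting, a common baseline in data fusion. It is not claimed to be any of the methods evaluated in the paper. The source names, item keys, values, and the function names `vote` and `source_agreement` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def vote(claims):
    """Pick, for each data item, the value claimed by the most sources.

    claims: iterable of (source, item, value) tuples.
    Returns a dict mapping item -> winning value.
    Ties are broken by whichever value was seen first.
    """
    counts = defaultdict(Counter)
    for _source, item, value in claims:
        counts[item][value] += 1
    return {item: c.most_common(1)[0][0] for item, c in counts.items()}


def source_agreement(claims, fused):
    """Fraction of each source's claims that match the fused values.

    This is only agreement with the vote, not accuracy against a gold standard.
    """
    total = Counter()
    agree = Counter()
    for source, item, value in claims:
        total[source] += 1
        if fused[item] == value:
            agree[source] += 1
    return {source: agree[source] / total[source] for source in total}


# Hypothetical, made-up claims (not data from the paper).
claims = [
    ("source_a", "flight_1:gate", "B12"),
    ("source_b", "flight_1:gate", "B12"),
    ("source_c", "flight_1:gate", "C3"),
    ("source_a", "stock_x:close", "10.50"),
    ("source_b", "stock_x:close", "10.50"),
    ("source_c", "stock_x:close", "10.50"),
]

fused = vote(claims)
agreement = source_agreement(claims, fused)
print(fused)
print(agreement)
```

Plain voting treats every source as equally reliable and independent, so it can pick a wrong value when several sources repeat the same error.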