Efficient Similarity Search on Distribution Data

Distribution data arise naturally in many domains, such as meteorology, biology, geology, industry, and economics. However, relatively little attention has been paid to data mining on large sets of distributions. Given n distributions over multiple categories and a query distribution Q, the authors want to find the clouds (i.e., distributions) most similar to Q and to discover patterns, rules, and outlier clouds. To address this problem, they present D-Search, an efficient algorithm for similarity search in large distribution datasets.
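To make the problem statement concrete, the following is a minimal sketch of the naive baseline that an efficient method would aim to improve on: a linear scan that compares the query Q against each of the n distributions. It is not D-Search itself. The abstract does not name a distance measure, so the symmetric Kullback-Leibler divergence used here is an assumption, and the function names (normalize, symmetric_kl, linear_scan) and the smoothing constant eps are hypothetical.

```python
import math


def normalize(hist, eps=1e-9):
    """Turn raw category counts into a smoothed probability distribution.

    eps is an assumed smoothing constant that keeps every probability
    positive so the logarithm below is always defined.
    """
    total = sum(hist) + eps * len(hist)
    return [(h + eps) / total for h in hist]


def symmetric_kl(p, q):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), assumed as the measure."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def linear_scan(dataset, query, k=1):
    """Return the indices of the k distributions closest to the query.

    This is the brute-force O(n) baseline: every distribution is scored.
    """
    q = normalize(query)
    scored = [(symmetric_kl(normalize(d), q), i) for i, d in enumerate(dataset)]
    scored.sort()
    return [i for _, i in scored[:k]]


if __name__ == "__main__":
    # Three toy distributions over three categories, given as raw counts.
    dataset = [[8, 1, 1], [3, 4, 3], [1, 1, 8]]
    query = [7, 2, 1]
    print(linear_scan(dataset, query, k=2))
```

Each query in this sketch costs one divergence computation per stored distribution, which is the cost that grows with dataset size and that motivates a faster search algorithm.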

Provided by: Association for Computing Machinery
Topic: Big Data
Date Added: Jun 2009
Format: PDF
