Exploiting Database Similarity Joins for Metric Spaces
Similarity Joins are recognized as among the most useful data processing and analysis operations and are used extensively in multiple application domains. They retrieve all data pairs whose distances are smaller than a predefined threshold ε. Multiple Similarity Join algorithms and implementation techniques have been proposed. They range from out-of-database approaches that handle only in-memory and external-memory data to techniques that use standard database operators to answer similarity joins. Recent work has shown that this operation can be efficiently implemented as a physical database operator. However, the proposed operator only supports 1D numeric data.
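
To make the definition concrete, the following is a minimal sketch of a threshold-based similarity join over a generic metric space, written as a naive nested loop in Python. It is an illustration only and not the physical database operator referred to above; the function name `epsilon_join`, its parameters, and the sample data are assumptions introduced here.

```python
from itertools import product
from math import dist  # Euclidean distance, available in Python 3.8+


def epsilon_join(r, s, distance, eps):
    """Return every pair (a, b), a from r and b from s, with distance(a, b) < eps.

    Hypothetical helper: `distance` can be any metric over the element type,
    so the same code covers vectors, strings, or 1D numbers.
    Naive O(|r| * |s|) evaluation, for illustration only.
    """
    return [(a, b) for a, b in product(r, s) if distance(a, b) < eps]


# Illustrative 2D data joined under the Euclidean metric.
left = [(0.0, 0.0), (5.0, 5.0)]
right = [(0.5, 0.0), (5.0, 5.2), (9.0, 9.0)]
pairs = epsilon_join(left, right, dist, 1.0)

# The 1D numeric case is the same call with a different metric.
pairs_1d = epsilon_join([1, 10], [2, 20], lambda a, b: abs(a - b), 2)
```

Because the metric is passed in as a parameter, the 1D numeric case is just one instance of the general metric-space join: only the distance function changes.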