Data Centers
Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search
Similarity join and similarity search are two important operations in data cleaning and have attracted much attention recently. Existing similarity-join methods usually adopt a prefix-filtering framework: they select a prefix of each object and prune object pairs whose prefixes do not overlap. The authors observe that prefix lengths have a significant effect ...
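To make the prefix-filtering idea concrete, here is a minimal sketch of a self-join over token sets. It rests on several assumptions not stated in the abstract: similarity is Jaccard, tokens are ordered globally by ascending frequency, and the prefix length is the commonly used |x| - ceil(t * |x|) + 1 for threshold t. The names `jaccard` and `prefix_filter_join` are hypothetical, and this illustrates the baseline framework only, not the adaptive method the authors propose.

```python
from collections import defaultdict
from math import ceil


def jaccard(a, b):
    """Jaccard similarity of two token collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def prefix_filter_join(records, threshold):
    """Return sorted (i, j) pairs with i < j and Jaccard >= threshold.

    Assumed baseline: fixed-length prefixes under a global token order.
    """
    # Global order: rarer tokens first, ties broken by the token itself.
    freq = defaultdict(int)
    for rec in records:
        for tok in set(rec):
            freq[tok] += 1

    index = defaultdict(list)  # token -> ids of earlier records whose prefix holds it
    results = []
    for rid, rec in enumerate(records):
        tokens = sorted(set(rec), key=lambda tok: (freq[tok], tok))
        n = len(tokens)
        # Small epsilon guards against floating-point error pushing ceil up by one.
        prefix_len = n - ceil(threshold * n - 1e-9) + 1
        prefix = tokens[:prefix_len]

        # Only records sharing at least one prefix token can reach the threshold.
        candidates = set()
        for tok in prefix:
            candidates.update(index[tok])

        # Verify each surviving candidate with the exact similarity.
        for cid in candidates:
            if jaccard(records[cid], rec) >= threshold:
                results.append((cid, rid))

        for tok in prefix:
            index[tok].append(rid)
    return sorted(results)
```

In this sketch a longer prefix yields more candidates to verify, while a shorter one prunes more but is only safe with a stronger filtering condition; this is the kind of trade-off the observation about prefix lengths points at.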