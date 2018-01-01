Provided by: VLD Digital Topic: Big Data Date Added: Aug 2012 Format: PDF

Given a collection of objects and an associated similarity measure, the all-pairs similarity search problem asks for all pairs of objects whose similarity exceeds a user-specified threshold. Locality-Sensitive Hashing (LSH) based methods have become a very popular approach to this problem. However, most such methods use LSH only for the first phase of similarity search, i.e., efficient indexing for candidate generation. In this paper, the authors present BayesLSH, a principled Bayesian algorithm for the subsequent phase of similarity search: performing candidate pruning and similarity estimation using LSH.
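
To make the second phase concrete, here is a minimal Python sketch of hash-based candidate pruning for Jaccard similarity with min-hashes. It is an illustration under stated assumptions, not the paper's algorithm: the abstract gives no details, so the uniform prior on the similarity, the resulting Beta posterior, the parameters `threshold`, `epsilon`, and `batch`, and all function names are hypothetical choices made for this example. Candidate generation (the first phase) is not shown; the sketch takes a candidate pair as given.

```python
import random
from math import comb

PRIME = 2_147_483_647  # modulus for the illustrative hash family (assumption)

def minhash_signature(items, seeds):
    """One min-hash per (a, b) seed pair over a set of integer item ids."""
    return [min((a * x + b) % PRIME for x in items) for a, b in seeds]

def prob_similarity_at_least(matches, trials, threshold):
    """P(s >= threshold | `matches` of `trials` hashes agree), uniform prior.

    Under a uniform prior the posterior on s is Beta(matches + 1,
    trials - matches + 1). For integer parameters its upper tail equals
    a binomial CDF, so the standard library is sufficient.
    """
    n = trials + 1
    return sum(comb(n, k) * threshold**k * (1 - threshold)**(n - k)
               for k in range(matches + 1))

def prune_or_estimate(sig_a, sig_b, threshold=0.7, epsilon=0.03, batch=32):
    """Compare hashes batch by batch; return None if pruned, else an estimate."""
    matches = 0
    for start in range(0, len(sig_a), batch):
        end = min(start + batch, len(sig_a))
        matches += sum(sig_a[i] == sig_b[i] for i in range(start, end))
        # Stop early once the pair is unlikely to reach the threshold.
        if prob_similarity_at_least(matches, end, threshold) < epsilon:
            return None
    # Fraction of agreeing min-hashes estimates the Jaccard similarity.
    return matches / len(sig_a)

# Hypothetical usage: 128 hash functions with fixed seeds for repeatability.
rng = random.Random(0)
seeds = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(128)]
sig_x = minhash_signature(set(range(100)), seeds)
sig_y = minhash_signature(set(range(1000, 1100)), seeds)
print(prune_or_estimate(sig_x, sig_x))  # identical sets
print(prune_or_estimate(sig_x, sig_y))  # disjoint sets
```

In this sketch a pair of disjoint sets is discarded after the first batch of 32 hashes, while a pair of identical sets survives every batch and is scored on all 128. The point of pruning inside the loop is that most low-similarity candidates cost only a few hash comparisons instead of an exact similarity computation.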