Association for Computing Machinery
In this paper, the authors study the problem of privacy-preserving record linkage, which aims to link records without revealing anything about the non-linked records. They propose a new secure embedding strategy based on frequent variable-length grams, which allows record linkage to be performed in the embedded space. The frequent grams used to construct the embedding base are mined from the original database under the framework of differential privacy. Compared with state-of-the-art secure matching schemes, their approach provides formal, provable privacy guarantees and achieves better scalability while providing comparable utility.
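
To make the idea of a gram-based embedding concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that grams are substrings of length 1 to `max_len`, that the base is chosen by adding Laplace noise to per-gram record counts and keeping those above a threshold, and that records are compared by Hamming distance between binary presence vectors. All function names, the `sensitivity` parameter, and the thresholding rule are hypothetical choices for illustration. In particular, adding noise only to grams that actually occur does not by itself yield a differential privacy guarantee; a faithful miner would need to bound how many grams one record can affect and handle non-occurring grams.

```python
import random
from collections import Counter


def grams(s, max_len=3):
    """Return the set of all substrings of s with length 1..max_len."""
    return {s[i:i + n] for n in range(1, max_len + 1) for i in range(len(s) - n + 1)}


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mine_base(records, epsilon, threshold, max_len=3, sensitivity=1.0, seed=0):
    """Pick an embedding base from noisy gram supports (illustrative only).

    `sensitivity` is an assumed bound, not derived here; the noise scale is
    sensitivity / epsilon, as in the standard Laplace mechanism.
    """
    rng = random.Random(seed)
    # Support = number of records that contain the gram at least once.
    support = Counter(g for r in records for g in grams(r, max_len))
    scale = sensitivity / epsilon
    noisy = {g: c + laplace(scale, rng) for g, c in sorted(support.items())}
    return sorted(g for g, c in noisy.items() if c >= threshold)


def embed(s, base, max_len=3):
    """Map a string to a binary vector: 1 where the base gram occurs in s."""
    present = grams(s, max_len)
    return [1 if g in present else 0 for g in base]


def hamming(u, v):
    """Count the positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))
```

Under these assumptions, each party would embed its own records against a shared base and exchange only the vectors, so that candidate matches are found by comparing `hamming(embed(a, base), embed(b, base))` against a distance cutoff rather than by comparing the original strings.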