DuDe: The Duplicate Detection Toolkit

Provided by: VLD Digital
Topic: Big Data
Format: PDF
Duplicate detection, also known as entity matching or record linkage, was first defined by Newcombe et al. and has been a research topic for several decades. The challenge is to identify, both effectively and efficiently, pairs of records that represent the same real-world entity. Researchers have developed and described a variety of methods to measure the similarity of records and to reduce the number of comparisons required. Comparing these methods with one another is essential for assessing their quality and efficiency.
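As a rough illustration of the two families of methods mentioned above (similarity measures and comparison reduction), the following Python sketch pairs a string similarity score with a simple blocking step. It is not taken from the DuDe toolkit and does not reflect its API; the function names, the `name` field, the first-letter blocking key, and the 0.85 threshold are all assumptions chosen for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def blocking_key(record):
    """Hypothetical blocking key: the lowercased first letter of the name."""
    return record["name"][:1].lower()


def find_duplicates(records, threshold=0.85):
    """Return index pairs whose names are at least `threshold` similar.

    Only records that share a blocking key are compared, which reduces
    the number of comparisons relative to checking every possible pair.
    """
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)

    pairs = []
    for ids in blocks.values():
        for i, j in combinations(ids, 2):
            if similarity(records[i]["name"], records[j]["name"]) >= threshold:
                pairs.append((i, j))
    return sorted(pairs)


# Illustrative data, invented for this sketch.
records = [
    {"name": "John Smith"},
    {"name": "Jon Smith"},
    {"name": "Mary Jones"},
    {"name": "Jane Doe"},
]

if __name__ == "__main__":
    print(find_duplicates(records))
```

With four records, an exhaustive approach would compare six pairs; here the blocking step limits the work to the three pairs that share a first letter. The trade-off is that true duplicates with different first letters would be missed, which is one reason comparing such methods on both quality and efficiency matters.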
