Data Management

A New Analysis of the False Positive Rate of a Bloom Filter

Date Added: Aug 2010
Format: PDF

A Bloom filter is a space-efficient data structure used for probabilistic set membership testing. When testing an object for set membership, a Bloom filter may give a false positive. The analysis of the false positive rate is a key to understanding the Bloom filter and applications that use it. The authors show experimentally that the classic analysis for false positive rate is wrong. They formally derive a correct formula using a balls-and-bins model and show how to numerically compute the new, correct formula in a stable manner. They also prove that the new formula always results in a predicted greater false positive rate than the classic formula.