World Academic Union
Bloom filter is a probabilistic and memory efficient data structure designed to answer rapidly whether an element is present in a set. It tells that the element is definitely not in the set but its presence is with certain probability. The trade-off to use Bloom filter is a certain configurable risk of false positives. The odds of a false positive can be made very low if the number of hash function is sufficiently large. For spam detection, weight is attached to each set of elements. The spam weight for a word is a measure used to rate the e-mail. Each word is assigned to a Bloom filter based on its weight.