If I implement a bloom filter where only one hashing algorithm is used (e.g. murmur), is this still considered a bloom filter?
For example, if a hashes to 5, then bit 5 of the filter will be set. If b hashes to 1, then bit 1 of the filter will be set, and so on...
For something to be considered a bloom filter, do at least two bits in the filter have to be set? If only one bit is set, is it called something else?
It is still a Bloom filter: one with k=1. Depending on the bits per element, it is probably not be the most space-saving one. But there are various reasons why one might pick a k that is not round(bitsPerKey * log(2)), the main ones are:
k=1 is best. See also the paper "Compressed Bloom Filters" from Michael Mitzenmacher.k is faster.By the way, you can still pick k to be the most space-saving one, even if you only use one "application hash function" (like Murmur hash with 64 bits). You just pick the "Bloom hash functions" to be a function of this "application hash function" (64-bit Murmur hash), like so (assuming int is 32 bit and long is 64 bit):
long m = murmur(x)
h(x, i) = (int) (m >> 32) + i * (int) m
And that's actually easier and faster than calculating multiple "application hash functions". In a loop, that looks like this:
long m = murmur(x)
int hash = (int) (m >> 32);
int add = (int) m;
for (int i = 0; i < k; i++) {
... test / set the bit depending on "hash" ...
hash += add;
}
Many Bloom filter libraries do it like this, for example the Bloom filter implementation in Guava.