Two Bits Are Better Than One: making bloom filters 2x more accurate
(floedb.ai)
from cm0002@lemy.lol to programming@programming.dev on 22 Feb 17:16
https://lemy.lol/post/61529207
#programming
The headline is misleading if you are familiar with bloom filters.
TL;DR: the interesting thing here isn’t the decreased false positive rate (multibit bloom filters are common), but the idea of keeping all the bits for a given key together. Basically you use one hash to pick a chunk of bits (32 bits in this case), then use more hashes to pick the bits within that chunk.
It is a tradeoff between accuracy (completely independent hashes would be less likely to have collisions leading to false positives) and performance (all relevant bits for the object you’re looking up will be together and the lookup will trigger at most one cache miss / memory access).
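To make the "pick a chunk, then pick bits inside it" idea concrete, here is a minimal sketch in Python. It is not the article's implementation: the class name `BlockedBloomFilter`, the use of BLAKE2b as the hash, and the way the digest is split into a chunk index and bit positions are all assumptions made for illustration.

```python
import hashlib


class BlockedBloomFilter:
    """Illustrative sketch: every key touches exactly one 32-bit chunk."""

    def __init__(self, num_blocks, bits_per_key=2):
        # The upper 32 bits of the digest are consumed 5 bits at a time,
        # so this sketch supports at most 6 bit positions per key.
        assert 1 <= bits_per_key <= 6
        self.num_blocks = num_blocks
        self.bits_per_key = bits_per_key
        self.blocks = [0] * num_blocks  # each entry stands in for a 32-bit word

    def _block_and_mask(self, key: bytes):
        # Assumption: one 64-bit digest supplies both the chunk index and
        # the bit positions. A real filter would use a faster hash.
        h = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")
        block = (h & 0xFFFFFFFF) % self.num_blocks  # low 32 bits pick the chunk
        h >>= 32
        mask = 0
        for _ in range(self.bits_per_key):
            mask |= 1 << (h & 31)  # next 5 bits pick a bit inside the chunk
            h >>= 5
        return block, mask

    def add(self, key: bytes):
        block, mask = self._block_and_mask(key)
        self.blocks[block] |= mask

    def might_contain(self, key: bytes):
        # A single word is read, which is the "at most one memory access" part.
        block, mask = self._block_and_mask(key)
        return (self.blocks[block] & mask) == mask


bf = BlockedBloomFilter(num_blocks=1024)
for i in range(500):
    bf.add(f"key-{i}".encode())
```

As with any bloom filter, `might_contain` never returns `False` for a key that was added, but it can return `True` for one that was not. In this sketch the two bit positions can also land on the same bit, which is part of the accuracy cost the comment describes.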