Tag: bloom filter

Blocked Signatures

The interesting thing about BitFunnel, the search architecture used by Bing, is its unorthodoxy; it revisits many ideas ignored by contemporary search technologies. The paper’s references go back to the 80s, when one million records was considered an enormous database. Something about this appeals to me, that there might be long forgotten ideas waiting to

Bit-Sliced Signatures and Bloom Filters

While the inverted index is a familiar indexing structure for many with a casual interest in information retrieval, the concept of a document signature may not be. The idea seems to have a long history, having been popular until it lost favour in the late 90s, owing to file sizes and performance empirically inferior to

Building a Bloom Filter from Scratch

The Bloom filter is an interesting data structure for modelling approximate sets of objects. It can tell you with certainty that an object is not in a collection, but may give false positives. If you write Java for a living, I would not suggest you implement your own, because there is a perfectly good implementation