The Bloom filter is an interesting data structure for modelling approximate sets of objects. It can tell you with certainty that an object is not in a collection, but may give false positives. If you write Java for a living, I would not suggest you implement your own, because there is a perfectly good implementation […]

# Category archives: Data Structures

## Confusing Sets and Lists

I have often seen the roles of lists and sets confused. An application can be brought to its knees – that is, cease to deliver commercial value – if List.contains is called frequently enough on big enough lists. And then there is the workaround… When I moved over to Java from C++ several years ago, […]

## Roaring TreeMap (English Translation)

A presentation of the Roaring TreeMap data structure translated into English from French Nouveaux modèles d’index bitmap compressés à 64 bits authored by Samy Chambi, Daniel Lemire and Robert Godin. Roaring TreeMap The RoaringTreeMap model combines a Java TreeMap with the Roaring bitmap structure to index a set of 64-bit integers represented by the positions of the set bits of a […]

## Choosing the Right Radix: Measurement or Mathematics?

I recently wrote a post about radix sorting, and found that for large arrays of unsigned integers a handwritten implementation beats Arrays.sort. However, I paid no attention to the choice of radix and used a default of eight bits. It turns out this was a lucky choice: modifying my benchmark to parametrise the radix, I […]

## Sorting Unsigned Integers Faster in Java

I discovered a curious resource for audio-visualising sort algorithms, which is exciting for two reasons. The first is that I finally feel like I understand Alexander Scriabin: he was not a composer. He discovered Tim Sort 80 years before Tim Peters and called it Black Mass. (If you aren’t familiar with the piece, fast-forward to […]