roaringbitmap

Benchmarking

Stages

The conventional wisdom is that the stages of a task should be pipelined, so you don’t need to wait for the completion of one stage before the next is started. It surprises me that it seems you can sometimes do better when performing each stage of a pipeline in a short batch. Useful optimisation opportunities […]

Read More
Data Structures

Data Driven Logic

I really don’t like reading or writing blocks of if-else statements. They make my eyes glaze over. Rumour has it that processors don’t like executing them either, though that’s less true now than it once was. There are two problems with these blocks of statements, and neither one of them is performance: They are hard […]

Read More
Data Structures

Parallel Bitmap Aggregation

A bitmap index represents predicates over records as sets consisting of the integer identities of each record satisfying each predicate. This representation is actually a few decades out of date, and systems like Pilosa use much more sophisticated data structures, and Sybase had even more on offer back in the 90s. But the chances are, […]

Read More
Benchmarking
Java

Iterating Over a Bitset in Java

How fast can you iterate over a bitset? Daniel Lemire published a benchmark recently in support of a strategy using the number of trailing zeroes to skip over empty bits. I have used the same technique in Java several times in my hobby project SplitMap and this is something I am keen to optimise. I […]

Read More
Data Structures

Building RoaringBitmaps from Streams

RoaringBitmap is a fast compressed bitset format. In the Java implementation of Roaring, it was until recently preferential to build a bitset in one go from sorted data; there were performance penalties of varying magnitude for incremental or unordered insertions. In a recent pull request, I wanted to improve incremental monotonic insertion so I could […]

Read More