The Laplacian of Gaussian filter

April 7th, 2019

The Laplacian of Gaussian filter (LoG) is quite well known, but there still exist many misunderstandings about it. In this post I will collect some of the stuff I wrote about it answering questions on Stack Overflow and Signal Processing Stack Exchange.

Read the rest of this entry »

Filling an image with grid points

January 22nd, 2019

Several algorithms require a set of uniformly distributed points across the image. For example, superpixel algorithms typically start with a regular grid. A rectangular grid of points is trivial to draw into an image. One simply generates the set of points of a grid that covers the image, rounds those points to integer coordinates, and sets those pixels. A rotated grid is a bit more challenging–one needs to compute bounds of the rotated grid such that the full image is covered. With other than rectangular grids (for example a hexagonal grid) the math gets a bit more complicated, but this can still be computed. But how to generalize such an algorithm to three dimensions? And to an arbitrary number of dimensions?

I played around with this task for a bit and came up with a very simple, general solution.

Read the rest of this entry »

Color quantization, minimizing variance, and k-d trees

May 26th, 2018

A recent question on Stack Overflow really stroke my curiosity. The question was about why MATLAB’s rgb2ind function is so much faster at finding a color table than k-means clustering. It turns out that the algorithm in rgb2ind, which they call “Minimum Variance Quantization”, is not documented anywhere. A different page in the MATLAB documentation does give away a little bit about this algorithm: it shows a partitioning of the RGB cube with what looks like a k-d tree.

So I ended up thinking quite a bit about how a k-d tree, minimizing variance, and color quantization could intersect. I ended up devising an algorithm that works quite well. It is implemented in DIPlib 3.0 as dip::MinimumVariancePartitioning (source code). I’d like to describe the algorithm here in a bit of detail.

Read the rest of this entry »

Union-Find

March 21st, 2018

The Union-Find data structure is well known in the image processing community because of its use in efficient connected component labeling algorithms. It is also an important part of Kruskal’s algorithm for the minimum spanning tree. It is used to keep track of equivalences: are these two objects equivalent/connected/joined? You can think of it as a forest of trees. The nodes in the trees are the objects. If two nodes are in the same tree, they are equivalent. It is called Union-Find because it is optimized for those two operations, Union (joining two trees) and Find (determining if two objects are in the same tree). Both operations are performed in (essentially) constant time (actually it is O(α(n)), where α is the inverse Ackermann function, which grows extremely slowly and is always less than 5 for any number you can write down).

Here I’ll describe the data structure and show how its use can significantly speed up certain types of operations.

Read the rest of this entry »

Efficient algorithms vs efficient code

October 21st, 2017

Since I have spent quite a bit of time porting 25-year old code, I have been confronted with the significant changes to CPU architecture over that time. Code in DIPlib used to be very efficient back then, but some optimizations did not age well at all. I want to show some examples here. It is nice to see that a little bit of effort into optimization can go a long way.

I also want to give a quick example of a highly optimized implementation of an inefficient algorithm, which highlights the difference between algorithms and code.

Read the rest of this entry »