May 26th, 2018

A recent question on Stack Overflow really stroke my curiosity. The question was about why MATLAB’s `rgb2ind` function is so much faster at finding a color table than k-means clustering. It turns out that the algorithm in `rgb2ind`, which they call “Minimum Variance Quantization”, is not documented anywhere. A different page in the MATLAB documentation does give away a little bit about this algorithm: it shows a partitioning of the RGB cube with what looks like a k-d tree.

So I ended up thinking quite a bit about how a k-d tree, minimizing variance, and color quantization could intersect. I ended up devising an algorithm that works quite well. It is implemented in DIPlib 3.0 as `dip::MinimumVariancePartitioning` (source code). I’d like to describe the algorithm here in a bit of detail.

Read the rest of this entry »

Tags: clustering, color, DIPlib, k-d tree, k-means, minimum variance, quantization, threshold.

Posted in algorithms, tutorials | No Comments »

March 21st, 2018

The Union-Find data structure is well known in the image processing community because of its use in efficient connected component labeling algorithms. It is also an important part of Kruskal’s algorithm for the minimum spanning tree. It is used to keep track of equivalences: are these two objects equivalent/connected/joined? You can think of it as a forest of trees. The nodes in the trees are the objects. If two nodes are in the same tree, they are equivalent. It is called Union-Find because it is optimized for those two operations, Union (joining two trees) and Find (determining if two objects are in the same tree). Both operations are performed in (essentially) constant time (actually it is O(α(n)), where α is the inverse Ackermann function, which grows extremely slowly and is always less than 5 for any number you can write down).

Here I’ll describe the data structure and show how its use can significantly speed up certain types of operations.

Read the rest of this entry »

Tags: DIPimage, DIPlib, labeling, mathematical morphology, timing, Union-Find, watershed.

Posted in algorithms, analyses, tutorials | 1 Comment »

October 21st, 2017

Since I have spent quite a bit of time porting 25-year old code, I have been confronted with the significant changes to CPU architecture over that time. Code in DIPlib used to be very efficient back then, but some optimizations did not age well at all. I want to show some examples here. It is nice to see that a little bit of effort into optimization can go a long way.

I also want to give a quick example of a highly optimized implementation of an inefficient algorithm, which highlights the difference between algorithms and code.

Read the rest of this entry »

Tags: dilation, DIPimage, DIPlib, Gaussian, mathematical morphology, timing.

Posted in analyses | 6 Comments »

August 20th, 2017

DIPimage is a MATLAB toolbox for quantitative image analysis. We’ve got quite a few users, especially in academia. However, few of those users (as far as I know) have ventured down the path of directly using DIPlib, the C library that DIPimage is built upon. I know of two people, outside of the group at Delft University of Technology where we developed DIPlib and DIPimage, that have written C code that uses DIPlib. And that is too bad, because it’s a wonderful library. There are two reasons for this lack of uptake: it has a very steep learning curve, and it is not open source. The second reason makes the first one worse, because there’s very little example source code to look at for learning to use the library.

Back in 2014 I started dreaming of porting DIPlib to C++, and making it open source. Modern C++ is a very expressive language, and writing code that uses a C++ version of DIPlib doesn’t need to be much more complicated that writing the equivalent MATLAB code. The port would allow moving some of the innovations we introduced in DIPimage into the DIPlib library, such as tensor (vector or matrix) images, color space management, etc. I did write a first version of the `dip::Image`

class to test and learn how the library could look, and write proposals trying to convince people to help me build it, but otherwise didn’t put much effort into the project until last year. Over the last year and a half or so, I have invested a lot of my free time to build a whole new library infrastructure, and port over algorithms. The work is not nearly finished, but there already is a lot there, and I have been using it at work in production code. Even though I initially set out to port algorithms unmodified, I find myself improving code quite frequently, some algorithms are significantly faster than they were before (e.g. the Watershed, which now uses a correct implementation of Union-Find, and the labelling algorithm (connected component analysis), which now uses a completely different algorithm).

Read the rest of this entry »

Tags: DIPimage, DIPlib, PyDIP.

Posted in announcements | 4 Comments »

January 4th, 2017

I’ve never been very active posting here, and since I left academia it has been even slower. All of 2016 passed without a single post! Since I now work for a company, it’s become more difficult for me to post about the fun little things that I work with. Nevertheless, I wanted to share a C++ construct that I came up with to pass flags (a collection of yes/no options) to a function.

Read the rest of this entry »

Tags: bit set, C++11, flags.

Posted in tutorials | 7 Comments »