Presenting DIPlib 3.0

DIPimage is a MATLAB toolbox for quantitative image analysis. We’ve got quite a few users, especially in academia. However, few of those users (as far as I know) have ventured down the path of directly using DIPlib, the C library that DIPimage is built upon. I know of two people, outside of the group at Delft University of Technology where we developed DIPlib and DIPimage, who have written C code that uses DIPlib. And that is too bad, because it’s a wonderful library. There are two reasons for this lack of uptake: it has a very steep learning curve, and it is not open source. The second reason makes the first one worse, because there’s very little example source code to look at when learning to use the library.

Back in 2014 I started dreaming of porting DIPlib to C++, and making it open source. Modern C++ is a very expressive language, and writing code that uses a C++ version of DIPlib doesn’t need to be much more complicated than writing the equivalent MATLAB code. The port would allow moving some of the innovations we introduced in DIPimage into the DIPlib library, such as tensor (vector or matrix) images, color space management, etc. I did write a first version of the dip::Image class to test and learn how the library could look, and wrote proposals trying to convince people to help me build it, but otherwise didn’t put much effort into the project until last year. Over the last year and a half or so, I have invested a lot of my free time in building a whole new library infrastructure and porting over algorithms. The work is not nearly finished, but there already is a lot there, and I have been using it at work in production code. Even though I initially set out to port algorithms unmodified, I find myself improving code quite frequently, and some algorithms are significantly faster than they were before (e.g. the watershed, which now uses a correct implementation of Union-Find, and the labelling algorithm (connected component analysis), which now uses a completely different algorithm).
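For readers who haven’t met Union-Find before, here is a minimal, self-contained sketch of the data structure in plain C++. It is not the DIPlib code and does not show how the watershed or the labelling algorithm in the library work. The class name UnionFind and the helper CountRegions are made up for this illustration, which simply counts 4-connected foreground regions in a tiny binary image.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Disjoint-set (Union-Find) structure with path halving and union by size.
// Illustration only; names are hypothetical and not taken from DIPlib.
class UnionFind {
   public:
      explicit UnionFind( std::size_t n ) : parent_( n ), size_( n, 1 ) {
         // Initially every element is its own representative.
         std::iota( parent_.begin(), parent_.end(), std::size_t( 0 ));
      }
      // Returns the representative of the set containing x.
      std::size_t Find( std::size_t x ) {
         assert( x < parent_.size() );
         while( parent_[ x ] != x ) {
            parent_[ x ] = parent_[ parent_[ x ]]; // path halving: skip one level
            x = parent_[ x ];
         }
         return x;
      }
      // Merges the sets containing a and b; returns the new representative.
      std::size_t Union( std::size_t a, std::size_t b ) {
         a = Find( a );
         b = Find( b );
         if( a == b ) { return a; }
         if( size_[ a ] < size_[ b ] ) { std::swap( a, b ); } // attach small to large
         parent_[ b ] = a;
         size_[ a ] += size_[ b ];
         return a;
      }
   private:
      std::vector< std::size_t > parent_;
      std::vector< std::size_t > size_;
};

// Counts 4-connected foreground regions in a row-major binary image.
std::size_t CountRegions( std::vector< int > const& img, std::size_t width, std::size_t height ) {
   assert( img.size() == width * height );
   UnionFind uf( img.size() );
   for( std::size_t y = 0; y < height; ++y ) {
      for( std::size_t x = 0; x < width; ++x ) {
         std::size_t i = y * width + x;
         if( !img[ i ] ) { continue; }
         if( x > 0 && img[ i - 1 ] ) { uf.Union( i, i - 1 ); }         // left neighbor
         if( y > 0 && img[ i - width ] ) { uf.Union( i, i - width ); } // top neighbor
      }
   }
   // Each region has exactly one foreground pixel that is its own representative.
   std::size_t count = 0;
   for( std::size_t i = 0; i < img.size(); ++i ) {
      if( img[ i ] && uf.Find( i ) == i ) { ++count; }
   }
   return count;
}
```

The two details that are easy to get wrong are shortening the paths in Find and always attaching the smaller set to the larger one in Union. Together they keep the trees shallow, which is what makes the structure fast in practice.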

The DIPlib 3.0 project consists of various parts:

  1. DIPlib, the C++ library. The library infrastructure is complete, and more than half of the image processing/analysis algorithms have been ported from the old DIPlib. Most function names are the same, but I have not tried to maintain backwards compatibility. I did keep a list of changes. The documentation can be found here:
  2. DIPviewer, an extension to DIPlib (thanks to Wouter Caarls) to display 2D and 3D images. Not yet documented.
  3. DIPimage, the MATLAB toolbox. I’m recreating all functions by directly calling the corresponding DIPlib function (i.e. there’s less overhead because we’re not parsing input arguments in M-code any more), and trying to maintain backwards compatibility as best as makes sense. At the bottom of the page with changes you can see what has changed. The DIPimage GUI hasn’t been moved over yet, but all old functions that are already available in the new DIPlib have been “ported” (maybe half of all toolbox functions, and all of the dip_image class methods).
  4. PyDIP, the Python module. Currently this is composed of a thin wrapper around most C++ functionality. The dip.Image object is identical to the C++ counterpart, but also exposes its buffer so that it can be used with e.g. NumPy functions. Conversely, it is possible to use a NumPy array (or other object that exposes its buffer) instead of a dip.Image object as input to PyDIP functions. Maybe, eventually, this module will become more “Pythonic”. Maybe we’ll create a GUI like DIPimage has. We’ll see where this goes.

Some examples

I’d like to give an example of the complexity of the old DIPlib, and the simplicity of the new one. Take the following bit of DIPimage code:

a = readim('cermet');
b = label(a<120);
m = measure(b,a,'Size');

With the old DIPlib, you would write C code like this:

#include "diplib.h"
#include "dipio.h"
#include "dip_point.h"
#include "dip_regions.h"
#include "dip_measurement.h"
int main( int argc, char *argv[] ) {
   DIP_FN_DECLARE( "main" );
   dip_Resources rg = 0;
   dip_Image a, b;
   dip_String filename;
   dip_IntegerArray featureID;
   dip_Measurement m;
   DIPXJ( dip_Initialise() );
   DIPXJ( dipio_Initialise() );
   DIPXJ( dip_ResourcesNew( &rg, 0 ));
   DIPXJ( dip_ImageNew( &a, rg ));
   DIPXJ( dip_StringNew( &filename, 6, "cermet", rg ));
   DIPXJ( dipio_ImageRead( a, filename, 0, 0, 0 ));
   DIPXJ( dip_ImageNew( &b, rg ));
   DIPXJ( dip_Threshold( a, b, 120, 0, 1, DIP_TRUE ));
   DIPXJ( dip_Label( b, b, 8, 0, 0, 0, 0, 0 ));
   DIPXJ( dip_IntegerArrayNew( &featureID, 1, dip_FeatureSizeID(), rg ));
   DIPXJ( dip_MeasurementNew( &m, rg ));
   DIPXJ( dip_Measure( m, featureID, 0, 0, b, a, 8, 0 ));
   /* Lots of code here, because there are no convenience functions to compute
      the mean of a measurement. Accessing measurement values is convoluted! */
   printf( "%f\n", mean_size );
dip_error:
   DIPXC( dip_ResourcesFree( &rg ));
   DIPXC( dipio_Exit() );
   DIPXC( dip_Exit() );
   return dip_ErrorWrite( error, 0, 0, stderr );
}

With the new DIPlib, your C++ code would look like this:

#include "diplib.h"
#include "diplib/file_io.h"
#include "diplib/regions.h"
#include "diplib/measurement.h"
int main() {
   auto a = dip::ImageReadICS( "cermet" );
   auto b = dip::Label( a < 120 );
   dip::MeasurementTool measurementTool;
   auto m = measurementTool.Measure( b, a, { "Size" } );
   std::cout << dip::Mean( m[ "Size" ] ) << '\n';
}

Note that a lot of the typing in the old DIPlib code is related to memory management and error handling. C++ takes care of these things automatically. C++ also allows one to omit the types of variables (using auto); they are deduced from the return types of the functions. C++ also supports default parameters (meaning not all parameters need to be given in each function call), implicit conversion (meaning it is not necessary to explicitly build a dip_String object from the file name string), and overloaded operators (comparison operator for thresholding, indexing operator to retrieve a measurement result). Besides needing to specify some include files, create a dip::MeasurementTool object, and specify the dip namespace, the code is identical to that in DIPimage.
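To make these language features concrete without depending on the library, here is a small toy sketch. Everything in the toy namespace (Image, Ramp, Count, Describe) is hypothetical and only mimics the style of the new API; it is not how dip::Image is implemented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

namespace toy { // hypothetical names for illustration, not the DIPlib API

class Image {
   public:
      Image() = default;
      explicit Image( std::vector< double > pixels ) : pixels_( std::move( pixels )) {}
      std::size_t Size() const { return pixels_.size(); }
      // Error handling: out-of-range access throws std::out_of_range,
      // so callers do not have to check an error code after every call.
      double At( std::size_t i ) const { return pixels_.at( i ); }
   private:
      // Memory management: the vector frees its storage when the Image is destroyed.
      std::vector< double > pixels_;
};

// Default parameter: Ramp() is the same as Ramp( 5 ).
inline Image Ramp( std::size_t n = 5 ) {
   std::vector< double > v( n );
   for( std::size_t i = 0; i < n; ++i ) { v[ i ] = static_cast< double >( i ); }
   return Image( std::move( v ));
}

// Overloaded operator: "img < t" produces a binary image (1 where pixel < t).
inline Image operator<( Image const& img, double threshold ) {
   std::vector< double > v( img.Size() );
   for( std::size_t i = 0; i < v.size(); ++i ) {
      v[ i ] = img.At( i ) < threshold ? 1.0 : 0.0;
   }
   return Image( std::move( v ));
}

// Counts the nonzero pixels.
inline std::size_t Count( Image const& img ) {
   std::size_t n = 0;
   for( std::size_t i = 0; i < img.Size(); ++i ) {
      if( img.At( i ) != 0.0 ) { ++n; }
   }
   return n;
}

// Implicit conversion: a string literal is converted to std::string at the call site.
inline std::string Describe( std::string const& name, Image const& img ) {
   return name + ": " + std::to_string( img.Size() ) + " pixels";
}

} // namespace toy
```

With these pieces, a line such as auto b = toy::Ramp() < 2.0; needs no explicit types, no cleanup code and no error-checking macros, which is the same effect the DIPlib example above relies on.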

Here's a quick example using tensor images. In DIPimage you can write:

g = gradient(img);
S = smooth(g*g',5);

This creates S, an image where each pixel is a 2x2 matrix (or 3x3 for a 3D image), known as the structure tensor. The eigenvalues and eigenvectors of each pixel in this image give information about the local structure. With the new DIPlib you can write the same thing:

auto g = dip::Gradient( img );
auto S = dip::Gauss( g * dip::Transpose( g ), 5 );

The difference here is that S is explicitly a symmetric matrix image, where only three of the four components (6 of the 9 in 3D) are computed and stored. The multiplication operator recognizes that its two input arguments are transposed versions of each other, and thus knows to produce a symmetric output.
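The saving from symmetric storage is easy to see for a single pixel. The sketch below is plain C++ and independent of DIPlib; the type SymMatrix2, the function OuterProduct and the storage order (xx, yy, xy) are assumptions made for this illustration, not necessarily what the library uses internally.

```cpp
#include <cassert>
#include <cstddef>

// A symmetric 2x2 matrix that stores only its three unique elements.
// Storage order (xx, yy, xy) is an assumption made for this sketch.
struct SymMatrix2 {
   double xx, yy, xy;
   // Element access: (0,1) and (1,0) both map to the single stored xy value.
   double operator()( std::size_t r, std::size_t c ) const {
      assert( r < 2 && c < 2 );
      if( r != c ) { return xy; }
      return r == 0 ? xx : yy;
   }
};

// Outer product g * g' of a 2D gradient vector g = (gx, gy).
// The result is always symmetric, so three multiplications suffice instead of four.
inline SymMatrix2 OuterProduct( double gx, double gy ) {
   return SymMatrix2{ gx * gx, gy * gy, gx * gy };
}
```

For a gradient of (3, 4) this yields xx = 9, yy = 16 and xy = 12, with both off-diagonal elements read from the same stored value. In 3D the same idea stores 6 values instead of 9.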

These examples were not picked because they are simple to write in the new DIPlib; I could have picked any example and obtained the same effect. The C++ code is not much more complicated than the DIPimage code. This is by design, and I'm very excited about it. This is the reason it is now possible to rewrite most of DIPimage as direct calls to DIPlib, and create PyDIP as a thin wrapper.

The future

Not all functions have been ported yet. There is also a bunch of functionality in the old DIPimage that I'd like to move into the C++ library, and a bunch of functionality that is simply missing altogether and I'd like to implement or borrow. But it is already possible to use the library and the toolbox.

If you're interested in any parts of this project, feel free to:
