Presenting DIPlib 3
DIPimage is a MATLAB toolbox for quantitative image analysis. We’ve got quite a few users, especially in academia. However, few of those users (as far as I know) have ventured down the path of directly using DIPlib, the C library that DIPimage is built upon. I know of two people, outside of the group at Delft University of Technology where we developed DIPlib and DIPimage, who have written C code that uses DIPlib. And that is too bad, because it’s a wonderful library. There are two reasons for this lack of uptake: it has a very steep learning curve, and it is not open source. The second reason makes the first one worse, because there’s very little example source code to look at for learning to use the library.
Back in 2014 I started dreaming of porting DIPlib to C++, and making it open source. Modern C++ is a very expressive language, and writing code that uses a C++ version of DIPlib doesn’t need to be much more complicated than writing the equivalent MATLAB code. The port would allow moving some of the innovations we introduced in DIPimage into the DIPlib library, such as tensor (vector or matrix) images, color space management, etc. I did write a first version of the dip::Image class to test and learn how the library could look, and wrote proposals trying to convince people to help me build it, but otherwise didn’t put much effort into the project until last year. Over the last year and a half or so, I have invested a lot of my free time in building a whole new library infrastructure and porting over algorithms. The work is not nearly finished, but there already is a lot there, and I have been using it at work in production code. Even though I initially set out to port algorithms unmodified, I find myself improving code quite frequently. Some algorithms are significantly faster than they were before, e.g. the Watershed, which now uses a correct implementation of Union-Find, and the labelling algorithm (connected component analysis), which now uses a completely different algorithm.
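For readers who have not met Union-Find before, here is a minimal, generic sketch of the data structure, with union by size and path compression. It is a textbook version written for illustration only; the class and method names are made up here, and this is not the code used in DIPlib's watershed.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Toy disjoint-set (Union-Find) structure. Hypothetical names, not DIPlib's API.
class UnionFind {
public:
   explicit UnionFind( std::size_t n ) : parent_( n ), size_( n, 1 ) {
      // Initially every element is the root of its own one-element set.
      std::iota( parent_.begin(), parent_.end(), std::size_t{ 0 } );
   }

   // Returns the root of the set containing `x`, compressing the path on the way.
   std::size_t Find( std::size_t x ) {
      assert( x < parent_.size() );
      std::size_t root = x;
      while( parent_[ root ] != root ) {
         root = parent_[ root ];
      }
      // Second pass: point every node visited directly at the root.
      while( parent_[ x ] != root ) {
         std::size_t next = parent_[ x ];
         parent_[ x ] = root;
         x = next;
      }
      return root;
   }

   // Merges the sets containing `a` and `b`; returns the root of the merged set.
   std::size_t Union( std::size_t a, std::size_t b ) {
      a = Find( a );
      b = Find( b );
      if( a == b ) {
         return a;
      }
      if( size_[ a ] < size_[ b ] ) {
         std::swap( a, b ); // attach the smaller tree under the larger one
      }
      parent_[ b ] = a;
      size_[ a ] += size_[ b ];
      return a;
   }

   // Number of elements in the set containing `x`.
   std::size_t Size( std::size_t x ) {
      return size_[ Find( x ) ];
   }

private:
   std::vector< std::size_t > parent_;
   std::vector< std::size_t > size_;
};
```

In a watershed or labelling context, each element would typically stand for a region label, and merging two regions becomes a single Union call.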
The DIPlib 3 project consists of various parts:
- DIPlib, the C++ library. The library infrastructure is complete, and more than half of the image processing/analysis algorithms have been ported from the old DIPlib. Most function names are the same, but I have not tried to maintain backwards compatibility. I did keep a list of changes. The documentation can be found here: https://diplib.github.io/diplib-docs/.
- DIPviewer, an extension to DIPlib (thanks to Wouter Caarls) to display 2D and 3D images. Not yet documented.
- DIPimage, the MATLAB toolbox. I’m recreating all functions by directly calling the corresponding DIPlib function (i.e. there’s less overhead because we’re not parsing input arguments in M-code any more), and trying to maintain backwards compatibility as best as makes sense. At the bottom of the page with changes you can see what has changed. The DIPimage GUI hasn’t been moved over yet, but all old functions that are already available in the new DIPlib have been “ported” (maybe half of all toolbox functions, and all of the dip_image class methods).
- PyDIP, the Python module. Currently this is composed of a thin wrapper around most C++ functionality. The dip.Image object is identical to the C++ counterpart, but also exposes its buffer so that it can be used with e.g. NumPy functions. Conversely, it is possible to use a NumPy array (or other object that exposes its buffer) instead of a dip.Image object as input to PyDIP functions. Maybe, eventually, this module will become more “Pythonic”. Maybe we’ll create a GUI like DIPimage has. We’ll see where this goes.
Some examples
I’d like to give an example of the complexity of the old DIPlib, and the simplicity of the new one. Take the following bit of DIPimage code:
a = readim('cermet');
b = label(a<120);
m = measure(b,a,'Size');
disp(mean(m.Size))
With the old DIPlib, you would write C code like this:
#include "diplib.h"
#include "dipio.h"
#include "dip_point.h"
#include "dip_regions.h"
#include "dip_measurement.h"
int main( int argc, char *argv[] ) {
   DIP_FN_DECLARE( "main" );
   dip_Resources rg = 0;
   dip_Image a, b;
   dip_String filename;
   dip_IntegerArray featureID;
   dip_Measurement m;
   DIPXJ( dip_Initialise() );
   DIPXJ( dipio_Initialise() );
   DIPXJ( dip_ResourcesNew( &rg, 0 ));
   DIPXJ( dip_ImageNew( &a, rg ));
   DIPXJ( dip_StringNew( &filename, 6, "cermet", rg ));
   DIPXJ( dipio_ImageRead( a, filename, 0, 0, 0 ));
   DIPXJ( dip_ImageNew( &b, rg ));
   DIPXJ( dip_Threshold( a, b, 120, 0, 1, DIP_TRUE ));
   DIPXJ( dip_Label( b, b, 8, 0, 0, 0, 0, 0 ));
   DIPXJ( dip_IntegerArrayNew( &featureID, 1, dip_FeatureSizeID(), rg ));
   DIPXJ( dip_MeasurementNew( &m, rg ));
   DIPXJ( dip_Measure( m, featureID, 0, 0, b, a, 8, 0 ));
   /* Omitted lots of code here, because there are no convenience functions
    * to compute the mean of a measurement. Accessing measurement values is
    * convoluted! */
   printf( "%f\n", mean_size );
dip_error:
   DIPXC( dip_ResourcesFree( &rg ));
   DIPXC( dipio_Exit() );
   DIPXC( dip_Exit() );
   return dip_ErrorWrite( error, 0, 0, stderr );
}
With the new DIPlib, your C++ code would look like this:
#include <iostream>
#include "diplib.h"
#include "diplib/file_io.h"
#include "diplib/regions.h"
#include "diplib/measurement.h"
int main() {
   auto a = dip::ImageReadICS( "cermet" );
   auto b = dip::Label( a < 120 );
   dip::MeasurementTool measurementTool;
   auto m = measurementTool.Measure( b, a, { "Size" } );
   std::cout << dip::Mean( m[ "Size" ] ) << '\n';
}
Note that a lot of the typing in the old DIPlib code is related to memory management and error handling. C++ takes care of these things automatically. C++ also allows one to omit the types of variables: with auto, they are deduced from the return type of the functions. C++ also supports default parameters (meaning not all parameters need to be given in each function call), implicit conversion (meaning it is not necessary to explicitly build a dip_String object from the file name string), and overloaded operators (comparison operator for thresholding, indexing operator to retrieve a measurement result). Besides needing to specify some include files, create a dip::MeasurementTool object, and specify the dip namespace, the code is identical to that in DIPimage.
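To make these language features concrete without depending on DIPlib itself, here is a small self-contained sketch. ToyImage, MakeRamp and Count are hypothetical names invented for this illustration; they only mimic the mechanisms listed above (automatic memory management, exceptions instead of error codes, a default parameter, implicit conversion from a string literal, and an overloaded comparison operator) and say nothing about how dip::Image is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy image: a named 1D array of samples.
// std::vector owns the memory and releases it automatically (no explicit free).
struct ToyImage {
   std::string name;
   std::vector< double > samples;
};

// Default parameter: `n` may be left out of the call.
// Implicit conversion: a string literal is converted to std::string for `name`.
// Errors are reported by throwing, so callers need no error-code macros.
ToyImage MakeRamp( std::string const& name, std::size_t n = 4 ) {
   if( n == 0 ) {
      throw std::invalid_argument( "MakeRamp: image must have at least one sample" );
   }
   ToyImage img{ name, std::vector< double >( n ) };
   for( std::size_t i = 0; i < n; ++i ) {
      img.samples[ i ] = static_cast< double >( i ); // 0, 1, 2, ...
   }
   return img;
}

// Overloaded operator: `img < t` thresholds the image and yields a binary mask.
std::vector< bool > operator<( ToyImage const& img, double threshold ) {
   std::vector< bool > mask;
   mask.reserve( img.samples.size() );
   for( double v : img.samples ) {
      mask.push_back( v < threshold );
   }
   return mask;
}

// Counts the samples that are set in a binary mask.
std::size_t Count( std::vector< bool > const& mask ) {
   std::size_t n = 0;
   for( bool b : mask ) {
      if( b ) {
         ++n;
      }
   }
   return n;
}
```

With these definitions, a caller can write `auto img = MakeRamp( "ramp" );` followed by `auto mask = img < 2.0;`, with no declared types, no cleanup code, and no error-checking macros.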
Here’s a quick example using tensor images. In DIPimage you can write:
g = gradient(img);
S = smooth(g*g', 5);
This creates S, an image where each pixel is a 2x2 matrix (or 3x3 for a 3D image), and is known as the structure tensor. The eigenvalues and eigenvectors of each pixel in this image give information about the local structure. With the new DIPlib you can write the same thing:
auto g = dip::Gradient( img );
auto S = dip::Gauss( g * dip::Transpose( g ), 5 );
The difference here is that S is explicitly a symmetric matrix image, where only three of the four components (6 of the 9 in 3D) are computed and stored. The multiplication operator recognizes that its two input arguments are transposed versions of each other, and thus knows to produce a symmetric output.
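The saving from symmetric storage can be sketched for a single pixel. The code below computes the outer product of a gradient vector with itself and keeps only the upper triangle, so a 2x2 matrix needs 3 values and a 3x3 matrix needs 6. The function names and the row-by-row packing order are assumptions made for this illustration; DIPlib may order the stored components differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Number of stored components of a symmetric n x n matrix: n(n+1)/2.
constexpr std::size_t SymmetricSize( std::size_t n ) {
   return n * ( n + 1 ) / 2;
}

// Computes g * g^T for one pixel's gradient vector `g`, storing only the
// upper triangle, row by row (an assumed ordering, not necessarily DIPlib's).
std::vector< double > SymmetricOuterProduct( std::vector< double > const& g ) {
   std::size_t n = g.size();
   std::vector< double > packed;
   packed.reserve( SymmetricSize( n ));
   for( std::size_t i = 0; i < n; ++i ) {
      for( std::size_t j = i; j < n; ++j ) {
         packed.push_back( g[ i ] * g[ j ] );
      }
   }
   return packed;
}

// Reads element (i,j) of the full n x n matrix from the packed storage.
double At( std::vector< double > const& packed, std::size_t n, std::size_t i, std::size_t j ) {
   assert( i < n && j < n );
   if( i > j ) {
      std::swap( i, j ); // symmetry: element (i,j) equals element (j,i)
   }
   // Rows 0..i-1 hold n, n-1, ..., n-i+1 values, which sum to i(2n-i+1)/2.
   std::size_t rowStart = i * ( 2 * n - i + 1 ) / 2;
   return packed[ rowStart + ( j - i ) ];
}
```

For a 2D gradient (2, 3) this stores the three values 4, 6 and 9, and At returns 6 for both off-diagonal positions.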
These examples were not picked because they are simple to write in the new DIPlib; I could have picked any example and obtained the same effect. The C++ code is not much more complicated than the DIPimage code. This is by design, and I’m very excited about it. This is the reason it is now possible to rewrite most of DIPimage as direct calls to DIPlib, and create PyDIP as a thin wrapper.
The future
Not all functions have been ported yet. There is also a bunch of functionality in the old DIPimage that I’d like to move into the C++ library, and a bunch of functionality that is simply missing altogether and I’d like to implement or borrow. But it is already possible to use the library and the toolbox.
If you’re interested in any parts of this project, feel free to:
- Let me know, so I can keep you updated.
- Download, compile and use the library, MATLAB toolbox or Python module, and let me know what you think.
- Contribute! Any kind of help is appreciated.