Measuring boundary length

Oftentimes we segment an image to find objects of interest, and then measure these objects: their area, their perimeter, their aspect ratio, and so on. Measuring the area is simple: just count the pixels. Measuring the perimeter is not as simple. If we merely count the number of boundary pixels, we seriously underestimate the boundary length. A method only slightly more complex produces an unbiased estimate of the boundary length, and in this post I will show how it works. There exist several much more complex methods that can further improve this estimate under certain assumptions, but these are too complex to be any fun. I’ll leave those as an exercise to the reader. 🙂

Because we will examine only the boundary of the object, the chain code representation is ideal. It encodes the boundary of the object as a sequence of steps, from pixel to pixel, all the way around the object. We thus reduce the binary image to a simple sequence of numbers. In future posts I’ll explain a simple algorithm to obtain such a chain code, and show how to use chain codes to compute other measures. In this post we’ll focus on how to use them to measure boundary length.

The chain code was first proposed by H. Freeman (IRE Transactions on Electronic Computers 10(2):260-268, 1961), and is therefore sometimes referred to as the Freeman code or Freeman chain code. A chain code simply gives the steps necessary to walk along all the pixels on an object’s boundary, starting at an arbitrary one. Instead of saying “move left”, “move diagonally up and right”, and so on, it simply says “4”, “1”, etc. The interpretation of the numbers is easily explained with a figure:

Chain codes
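In code, these eight directions are conveniently stored as a small lookup table of steps. The exact assignment of numbers to directions is defined by the figure; the table below assumes the common convention in which code 0 is a step in the +x direction and code 6 points down, with x running to the right and y downward as usual for images:

directions = [ 1, 0     % 0: right
               1,-1     % 1: up-right
               0,-1     % 2: up
              -1,-1     % 3: up-left
              -1, 0     % 4: left
              -1, 1     % 5: down-left
               0, 1     % 6: down
               1, 1 ];  % 7: down-right

Note that the even codes are the horizontal and vertical steps, and the odd codes are the diagonal ones; this distinction will be important below.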

Because we use the chain code to describe an object’s boundary, we will always end up at the same pixel we started from: the boundary is closed. The direction in which we walk is not important, though we will consistently walk clockwise around the shape. The sequence [0,1,2,7,0,6,5,7,1,0,0,1] encodes a portion of a boundary as follows:

Encoding a boundary
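To make the encoding concrete, here is a small sketch that walks this example sequence, starting from an arbitrarily chosen pixel, and lists the pixel coordinates visited. It uses the assumed directions table defined above; because the sequence encodes only a portion of a boundary, the walk does not return to its starting point:

chain = [0,1,2,7,0,6,5,7,1,0,0,1];     % the example sequence above
pos = [10,10];                         % arbitrary starting pixel
coords = zeros(length(chain)+1,2);     % pixel coordinates visited
coords(1,:) = pos;
for ii = 1:length(chain)
   pos = pos + directions(chain(ii)+1,:);   % +1 because MATLAB indexing starts at 1
   coords(ii+1,:) = pos;
end
coords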

Using DIPimage, the chain code is obtained with the function dip_imagechaincode. It takes a labelled image, a connectivity value (set this to 2!) and the label ID of the object for which to get the chain code. For example:

img = label(threshold(gaussf(a)));   % a is any grey-value image with bright objects on a dark background
cc = dip_imagechaincode(img,2,1)

So how do we use this code to compute a boundary length? The pixel counting method, which I said earlier significantly underestimates the boundary length, is equivalent to counting the number of elements in the chain:

p = length(cc.chain);

Why does this underestimate the length? Well, diagonal steps (the odd chain codes) are longer than horizontal or vertical steps (the even chain codes). In the original Freeman paper, it is suggested to add √2 for each odd code, and 1 for each even code:

even = mod(cc.chain,2)==0;
p = sum(even) + sum(~even)*sqrt(2);

This computes the exact length of the line that goes from pixel center to pixel center around the object. However, this is not at all what we are after. The binary shape we obtained through segmentation represents some real-world object, which was sampled and binarized, and it is that object’s perimeter that we want to estimate. It is highly unlikely that that object has exactly the same “jagged” outline as our binary shape. Parts of the perimeter that are straight lines at 0, 45 or 90 degrees will be measured correctly, but a straight line at, say, 10 degrees will be overestimated, because its binarized representation is jagged. Proffitt and Rosen (Computer Graphics and Image Processing 10(4):318-332, 1979) therefore suggested different weights for even and odd codes, obtained by minimizing the square error for infinitely long straight lines at random orientations:

p = sum(even)*0.948 + sum(~even)*1.340;

Proffitt and Rosen also introduced the concept of corner count, the number of times the chain code changes value. As we will see later, this can further reduce the orientation dependency of the method. Vossepoel and Smeulders (Computer Graphics and Image Processing 20(4):347-364, 1982) obtained the following optimal values using the even, odd and corner counts:

corner = cc.chain~=[cc.chain(end),cc.chain(1:end-1)];
p = sum(even)*0.980 + sum(~even)*1.406 - sum(corner)*0.091;
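For the experiments below it is convenient to collect the four estimates in a single helper function. This is simply the code from above wrapped up (the function name is my own choice, it is not part of DIPimage; the input is the plain vector of chain codes, e.g. cc.chain):

function p = perimeter_estimates(chain)
% Returns [pixel count, Freeman, Proffitt & Rosen, Vossepoel & Smeulders].
even   = mod(chain,2)==0;                        % horizontal and vertical steps
corner = chain ~= [chain(end),chain(1:end-1)];   % steps where the code changes value
p = [ length(chain), ...
      sum(even) + sum(~even)*sqrt(2), ...
      sum(even)*0.948 + sum(~even)*1.340, ...
      sum(even)*0.980 + sum(~even)*1.406 - sum(corner)*0.091 ];
end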

Let’s now examine these methods with a little experiment. First I’ll generate square objects, rotated at various angles between 0 and 90 degrees, and compute their perimeter with the four methods above:

Estimating line length using chain codes
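The exact script behind this graph (and the next one) can be fetched via the link near the end of this post. As a rough sketch, such an experiment can be set up in DIPimage as follows; the image size, square size and angular step are arbitrary choices, and the measurements use the helper function defined above:

angles = 0:1:90;                          % rotation angles in degrees
p = zeros(length(angles),4);
for ii = 1:length(angles)
   a = angles(ii)*pi/180;
   % binary square with half-side 60 pixels, rotated over angle a
   u =  cos(a)*xx([256 256]) + sin(a)*yy([256 256]);
   v = -sin(a)*xx([256 256]) + cos(a)*yy([256 256]);
   img = label( (abs(u)<60) & (abs(v)<60) );
   cc = dip_imagechaincode(img,2,1);
   p(ii,:) = perimeter_estimates(cc.chain);
end
plot(angles, p/480)                       % 480 is the true perimeter of the square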

As you can see, the simple pixel counting method works well at 0 and 90 degrees, but severely underestimates the length at other orientations. Freeman’s method is correct at 0, 45 and 90 degrees, but overestimates at other orientations. This is because, as I explained earlier, it follows the jagged outline of the discretized shape rather than the straight boundary of the original object. When we use the optimal weights, the graph simply shifts downwards: both even and odd steps are counted as slightly shorter. This increases the error at 0, 45 and 90 degrees, but makes the average error smaller and removes the bias. Finally, adding the corner count introduces an additional parameter, which gives the measured curve an additional ripple. This reduces the average error even further.

We can do another experiment to show the effect of sampling density on these measures. It is common to expect the relative error in the estimated perimeter to become smaller as we increase the number of pixels on the perimeter. However, this is only true if the measure is unbiased! Here I’ve generated binary disks and measured their perimeters. The absolute value of the relative error, in percent (100⋅|P−2πr|/(2πr)), is plotted against the disk’s radius:

Estimating circle perimeter using chain codes
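A sketch of this second experiment looks very similar; rr produces an image in which each pixel holds its distance to the image centre, so thresholding it yields a binary disk (again, the radii and image size are arbitrary choices):

radii = 5:100;                             % disk radii in pixels
err = zeros(length(radii),4);
for ii = 1:length(radii)
   r = radii(ii);
   img = label( rr([256 256]) < r );       % binary disk of radius r
   cc = dip_imagechaincode(img,2,1);
   p = perimeter_estimates(cc.chain);
   err(ii,:) = 100*abs(p - 2*pi*r)/(2*pi*r);
end
plot(radii, err)                           % relative error in percent versus radius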

This graph clearly shows that the biased measures do not benefit from a higher sampling density. They make a 10% and a 5% error, respectively, starting at a relatively small radius of about 10 pixels. The two unbiased methods, however, do benefit from the increased sampling density, and their error curves do not flatten out until a much larger radius (outside the graph!). The corner count method is only slightly better than the method that doesn’t count corners, because on a circle the errors at all orientations cancel each other out. For other shapes, the corner count method will show a larger advantage.

The code to reproduce the two graphs above can be fetched right here. Remember, it requires MATLAB with DIPimage.

In future posts I’ll explain the algorithm that extracts the chain code from a binarized object, as well as methods to compute the minimum bounding rectangle and maximum object length, for example. You can even compute an object’s area using only its chain code!
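To give away a little of that last point: one way to obtain the area is to apply the shoelace formula to the polygon through the pixel centers while walking the chain, and then use Pick’s theorem to convert the polygon area into a pixel count. Here is a minimal sketch, using the assumed directions table from earlier; it is valid for simple boundaries that do not revisit pixels (no one-pixel-wide protrusions):

function n = area_from_chain(chain, directions)
% Pixel count of an object, computed from its chain code alone.
pos = [0,0];                       % the starting position does not affect the area
A = 0;                             % twice the signed area (shoelace formula)
for ii = 1:length(chain)
   step = directions(chain(ii)+1,:);
   A = A + pos(1)*step(2) - pos(2)*step(1);
   pos = pos + step;
end
A = abs(A)/2;                      % area of the polygon through the pixel centers
% Pick's theorem: pixel count = interior points + boundary points = A + length(chain)/2 + 1
n = A + length(chain)/2 + 1;
end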

17 Responses to “Measuring boundary length”

  1. On October 14th, 2010, at 13:48, Computer Vision For Dummies » Lengths of curves, revisited said:

    […] a recent post Cris Luengo presents his view of the issues with some experimental data. The diagram below shows […]

  2. On May 14th, 2012, at 17:18, Aaron Angel said:

    The article is very resourceful. Thanks.

  3. On April 2nd, 2014, at 19:48, cdh said:

    Can i get the full source code for the above..

  4. On April 3rd, 2014, at 8:06, Cris Luengo said:

    cdh,

    You can find MATLAB source code to get the chain code on the post I wrote after this one. The link is at the top of the page.

  5. On April 10th, 2014, at 17:56, Turi said:

    Hi Sir,

    I find the chain code with original Matlab code.

    But I don’t know how I can get the “corner count” with the original Matlab code.

    Can you describe in a little more detail what the “corner count” is and how I can get the number of “corners” with original Matlab code?

    Thanks a lot

  6. On April 10th, 2014, at 23:03, Cris Luengo said:

    Turi,

    I’m not sure what you mean with “the original Matlab code”. The line that computes corner count is quite simple standard Matlab syntax. You just count how many times two consecutive chain codes are different.

  7. On April 14th, 2014, at 16:01, Turi said:

    Hi Sir,

    I meant with “original Matlab code” that I didn’t use the DIPimage Toolbox. I compute the corner count with an “if” inside a loop. For example, c is the chain code and k is the number of objects:

    corner = 0;
    for k = 1:length(c)
        boundary = c{k};
        for n = 2:length(boundary)
            if boundary(n) ~= boundary(n-1)
                corner = corner + 1;
            end
        end
    end
    

    If I compute the corners of a square with this loop (I mean, if I count how many times two consecutive chain codes are different), the answer is 3.

    chain code = 0000666644442222 → two consecutive chain codes are different 3 times

    Should I write 3 or 4 in the formula?
    p = 16*0.980 − 3*0.091? Is that right? Or p = 16*0.980 − 4*0.091?

    Thanks a lot

  8. On April 14th, 2014, at 16:58, Cris Luengo said:

    Turi,

    The line

    corner = cc.chain~=[cc.chain(end),cc.chain(1:end-1)];

    doesn’t use DIPimage. Just replace cc.chain with c{k}!

    You will also see that this line also checks whether the last and the first codes are different. Remember that the chain code goes all the way around the shape, forming a closed loop. So the corner count in your example is 4.
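    In code, with your variables, that is:

    corner = c{k} ~= [c{k}(end),c{k}(1:end-1)];   % true where a code differs from the previous one, including first vs. last
    sum(corner)                                   % the corner count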

  9. On April 21st, 2014, at 10:48, Turi said:

    Hi Sir,

    First of all I want to say that I’m very thankful for your answers.

    I have read the paper of Proffitt and Rosen but I couldn’t be sure whether the different weights (0.980 and 1.340) are also the same for 8-connectivity.

    I find the chain code with the command bwboundaries(~,connectivity,*holes or noholes*) and use 8 for connectivity.

    Can I also use these coefficients for 8-way coding?

    Thanks a lot

  10. On April 22nd, 2014, at 9:42, Cris Luengo said:

    Turi,

    Those weights are meant for 8-connected chain codes. If you use 4-connected chain codes, you don’t have diagonal steps (the odd steps in the text above), and thus there would be no need for the 1.340 weight. There is no point in using 4-connected chain codes to measure boundary length; it will be very inaccurate.

  11. On April 27th, 2014, at 19:28, Michael said:

    hello sir,

    I am working in Visual C++… Can you tell me how to find the area of an object? The histogram will give the number of pixels belonging to each gray level value, but how do I obtain the total count of all the pixels? Can you tell me the code?

    thank you

  12. On April 28th, 2014, at 10:26, Cris Luengo said:

    Michael,

    I don’t have time to write code for you. But I can tell you that the number of pixels belonging to an object corresponds to its area. Thus, if you have a labelled image (where all pixels belonging to one object have the same grey value, distinct from the grey value of pixels in other objects), then the histogram count for that grey value will correspond to the area of that object.

  13. On June 4th, 2014, at 0:42, Turi said:

    Hi Sir;

    I read some papers and Zenon Kulpa mentioned that area and perimeter measurements are related to the digitization scheme. I am trying to find the length of the interfacial area in a polymer mixing. I used a thresholding method for object extraction. Is the pixel counting method suitable for area measurement?

    Is there any conclusion from research works about the relations between the properties of discrete objects and their nondiscrete originals? Or are Zenon Kulpa’s conclusions still valid?

    Best Regards

  14. On June 4th, 2014, at 9:04, Cris Luengo said:

    Turi,

    Of course the way that the continuous world is discretised influences the way measurements should be performed. In this post I assume point sampling. Most images you come across nowadays approach ideal point sampling. If you use thresholding to determine which pixels are part of your object, then you can use pixel counting as an unbiased estimate of area.

  15. On September 5th, 2014, at 5:48, Sequentially/parallel algorithm for extracting blob outer perimeter/contour length | Question and Answer said:

    […] The Chain Code algorithm (used by OpenCV’s findContours function): Seems pretty good as well, and parallel solutions do exist, but I worry it might fail if the stopping criterion is not good enough (see here, at the bottom near Jacob’s stopping criterion). However, it should be able to extract only the outer contour and give good approximations. […]

  16. On July 15th, 2016, at 0:30, Connor said:

    Hey Cris,

    Isn’t the coefficient for ‘odd’ orientations 0.426 instead of 1.406? Referencing the equation on page 362 of Vossepoel and Smeulders. Let me know your thoughts if you get a chance, I’m trying to analyze microparticle images in which the particle radius is approximately 6 pixels.

    Cheers! Connor 🙂

  17. On July 15th, 2016, at 5:47, Cris said:

    Connor,

    That paper [Vossepoel and Smeulders, 1982] uses n as the total number of chain codes, and m as the number of odd codes. They write a·n + b·m + c·nc, where nc is the corner count. So you add a for each even code, and a + b for each odd code. If you take the constants from the bottom row of Table 3, even codes get 0.980, and odd codes get 0.980 + 0.426 = 1.406.

    Compare to the Euclidean case, where even codes get 1 and odd codes get √2. An optimized method should have similar values to those. If you were to give odd codes only 0.426, you’d be undervaluing their contribution significantly! Why would a diagonal step be shorter than a horizontal step?
