## Measuring boundary length

Oftentimes we segment an image to find objects of interest, and then measure those objects: their area, their perimeter, their aspect ratio, and so on. Measuring the area is accomplished simply by counting pixels. Measuring the perimeter is not as simple: if we just count the number of boundary pixels, we seriously underestimate the boundary length. A method only slightly more complex can produce an unbiased estimate of boundary length, and I will show how this method works in this post. There exist several much more complex methods that can further improve this estimate under certain assumptions. However, these are too complex to be any fun. I’ll leave those as an exercise to the reader. 🙂

Because we will examine only the boundary of the object, the chain code representation is the ideal one: it encodes the boundary of the object as a sequence of steps, from pixel to pixel, all the way around the object. We thus reduce the binary image to a simple sequence of numbers. In future posts I’ll explain a simple algorithm to obtain such a chain code, and show how to use chain codes to obtain other measures. In this post we’ll focus on how to use them to measure boundary length.

The *chain code* was first proposed by H. Freeman (IRE Transactions on Electronic Computers 10(2):260-268, 1961), and is therefore sometimes referred to as the *Freeman code* or *Freeman chain code*. A chain code simply gives the steps necessary to walk along all the pixels on an object’s boundary, starting at an arbitrary one. Instead of saying “move left” or “move diagonally up and right”, it simply says “4” or “1”. The interpretation of the numbers is trivially explained with a figure:

Because we use the chain code to describe an object’s boundary, we always end up at the same pixel we started from: the boundary is closed. The direction in which we walk is not important, though we will consistently walk clockwise around the shape. The sequence `[0,1,2,7,0,6,5,7,1,0,0,1]` encodes a portion of a boundary as follows:
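Walking a chain code can be sketched in a few lines. This is a plain-Python sketch, not DIPimage; note that the mapping from code to step direction depends on the convention in the figure, so the table below is an assumption (code 0 pointing right, codes increasing counter-clockwise, with the y axis pointing down as usual in images):

```python
# Hypothetical direction table: code -> (dx, dy) step, image y axis down.
STEPS = {
    0: (1, 0),    # right
    1: (1, -1),   # up-right
    2: (0, -1),   # up
    3: (-1, -1),  # up-left
    4: (-1, 0),   # left
    5: (-1, 1),   # down-left
    6: (0, 1),    # down
    7: (1, 1),    # down-right
}

def walk(chain, start=(0, 0)):
    """Return the sequence of pixel coordinates visited by the chain code."""
    x, y = start
    path = [(x, y)]
    for code in chain:
        dx, dy = STEPS[code]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

chain = [0, 1, 2, 7, 0, 6, 5, 7, 1, 0, 0, 1]  # the example fragment above
path = walk(chain)
```

Each code adds exactly one pixel to the walk, which is why the chain has as many elements as there are boundary pixels.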

Using DIPimage, the chain code is obtained with the function `dip_imagechaincode`. It takes a labeled image, a connectivity value (set this to 2!) and the label ID of the object to get the chain code for. For example:

img = label(threshold(gaussf(readim)));
cc = dip_imagechaincode(img,2,1)

So how do we use this code to compute a boundary length? The pixel counting method, which I said earlier significantly underestimates the boundary length, is equivalent to counting the number of elements in the chain:

p = length(cc.chain);

Why does this underestimate the length? Well, diagonal steps (the odd chain codes) are longer than horizontal or vertical steps (the even chain codes). In the original Freeman paper, it is suggested to add √2 for each odd code, and 1 for each even code:

even = mod(cc.chain,2)==0; p = sum(even) + sum(~even)*sqrt(2);

This computes the exact length of the line that goes from pixel center to pixel center, around the object. However, this is not at all what we are after. The binary shape we obtained through segmentation represents some real-world object, which was sampled and binarized, and it is that object’s perimeter that we want to estimate. It is highly unlikely that that object has the exact same “jagged” outline as our binary shape. Parts of the perimeter that are straight lines at 0, 45 or 90 degrees will be measured correctly, but a straight line at, say, 10 degrees will be overestimated because its binarized representation is jagged. Proffitt and Rosen (Computer Graphics and Image Processing 10(4):318-332, 1979) suggested different weights for even and odd codes, based on a minimum square error for straight lines of infinite length at random orientations:

p = sum(even)*0.948 + sum(~even)*1.340;

Proffitt and Rosen also introduced the concept of *corner count*, the number of times the chain code changes value. As we will see later, this can further reduce the orientation dependency of the method. Vossepoel and Smeulders (Computer Graphics and Image Processing 20(4):347-364, 1982) obtained the following optimal values using the even, odd and corner counts:

corner = cc.chain~=[cc.chain(end),cc.chain(1:end-1)]; p = sum(even)*0.980 + sum(~even)*1.406 - sum(corner)*0.091;
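The four estimators above translate directly from the MATLAB one-liners into a single function. Here is a plain-Python sketch (no DIPimage; `chain` is simply a list of codes):

```python
import math

def perimeter_estimates(chain):
    """The four boundary-length estimators discussed in the post."""
    n = len(chain)
    odd = sum(1 for c in chain if c % 2 == 1)  # diagonal steps
    even = n - odd                             # horizontal/vertical steps
    # Corner count: how often two consecutive codes differ, cyclically,
    # so the last code is also compared against the first.
    corners = sum(1 for a, b in zip(chain, chain[1:] + chain[:1]) if a != b)
    return {
        'pixel count': n,
        'Freeman': even + odd * math.sqrt(2),
        'Proffitt-Rosen': even * 0.948 + odd * 1.340,
        'Vossepoel-Smeulders': even * 0.980 + odd * 1.406 - corners * 0.091,
    }
```

For a small axis-aligned square (chain `0000666644442222`), all 16 codes are even and there are 4 corners, so pixel counting and Freeman’s weighting both give 16, while Vossepoel–Smeulders gives 16·0.980 − 4·0.091 = 15.316.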

Let’s now examine these methods with a little experiment. First I’ll generate square objects, rotated at various angles between 0 and 90 degrees, and compute their perimeter with the four methods above:

As you can see, the simple pixel counting method works well at 0 and 90 degrees, but severely underestimates the length at other orientations. Freeman’s method is correct at 0, 45 and 90 degrees, but overestimates at other orientations. This is because, as I explained earlier, it follows the jagged line of the discretized shape rather than the straight boundary of the original object. When we use the optimal weights, the graph simply shifts downwards: we count both even and odd steps as slightly shorter. This increases the error at 0, 45 and 90 degrees, but makes the average error smaller and removes the bias. Finally, adding the corner count introduces an additional parameter, which gives the measured curve an additional ripple. This reduces the average error even further.

We can do another experiment to show the effect of sampling density on the measures. It is common to expect the relative error in the estimated perimeter to get smaller as we increase the number of pixels on the perimeter. However, this is only true if the measure is unbiased! Here I’ve generated binary disks and measured their perimeters. The absolute value of the relative error, in percent (100⋅|*P*-2π*r*|/(2π*r*)), is plotted against the disk’s radius *r*:

This graph clearly shows that the biased measures do not benefit from a higher sampling density. They make a 10% and 5% error, respectively, starting at a relatively small radius of about 10 pixels. The two unbiased methods, however, do benefit from the increased sampling density, and the error plots do not flatten out until a much larger radius (outside the graph!). The corner count method is only slightly better than the method that doesn’t count corners, because on a circle the errors at all orientations cancel each other out. For other shapes, the corner count method will show a larger advantage.

The code to reproduce the two graphs above can be fetched right here. Remember, it requires MATLAB with DIPimage.

In future posts I’ll explain the algorithm that extracts the chain code from a binarized object, as well as methods to compute the minimum bounding rectangle and maximum object length, for example. You can even compute an object’s area using only its chain code!

[…] a recent post Cris Luengo presents his view of the issues with some experimental data. The diagram below shows […]

The article is very resourceful. Thanks.

Can I get the full source code for the above?

cdh,

You can find MATLAB source code to get the chain code on the post I wrote after this one. The link is at the top of the page.

Hi Sir,

I find the chain code with original Matlab code.

But I don’t know how I can get the “corner count” with the original Matlab code.

Can you describe in a little more detail what the “corner count” is, and how I can get the number of corners with the original Matlab code?

Thanks a lot

Turi,

I’m not sure what you mean with “the original Matlab code”. The line that computes corner count is quite simple standard Matlab syntax. You just count how many times two consecutive chain codes are different.

Hi Sir ,

I meant with “original Matlab code” that I didn’t use the DIPimage Toolbox. I compute the corner count with an “if” loop. For example, c is the chain code and k is the number of objects.

If I compute the corners of a square with this loop (I mean, if I count how many times two consecutive chain codes are different), the answer is 3.

chain code = `0000666644442222` → two consecutive chain codes are different 3 times

Should I write 3 or 4 in the formula?

Is it `p = 16*0.980 - 3*0.091`? Or `p = 16*0.980 - 4*0.091`?

Thanks a lot

Turi,

The line

`corner = cc.chain~=[cc.chain(end),cc.chain(1:end-1)];`

doesn’t use DIPimage. Just replace

`cc.chain`

with `c{k}`.

You will also see that this code checks whether the last and the first codes are different. Remember that the chain code goes all the way around the shape, forming a closed loop. So the corner count in your example is 4.
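To make this concrete, here is that cyclic count in plain Python, using the example square chain from the question above; the wrap-around comparison (last code 2 against first code 0) is what contributes the fourth corner:

```python
# Corner count: compare each code with the previous one, wrapping around
# so that the last code is also compared with the first (closed boundary).
chain = [int(c) for c in "0000666644442222"]
prev = chain[-1:] + chain[:-1]  # chain rotated by one position
corners = sum(1 for a, b in zip(chain, prev) if a != b)
# corners == 4 for this square
```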

Hi Sir,

First of all, I want to say that I’m very thankful for your answers.

I have read the paper by Proffitt and Rosen but I couldn’t be sure whether the different weights (0.980 and 1.340) are also the same for 8-connectivity.

I find chain code with the command bwboundaries (~,connectivity,*holes or noholes*) and use 8 for connectivity

Can I also use these coefficients for 8-way coding?

Thanks a lot

Turi,

Those weights are meant for 8-connected chain codes. If you use 4-connected chain codes, you don’t have diagonal steps (the odd steps in the text above), and thus there would be no need for the 1.340 weight. There is no point in using 4-connected chain codes to measure boundary length; it would be very inaccurate.

hello sir,

I am working in Visual C++. Can you tell me how to find the area of an object? The histogram gives the number of pixels at each grey-level value, but how do I obtain the total count of all the pixels? Can you tell me the code?

thank you

Michael,

I don’t have time to write code for you. But I can tell you that the number of pixels belonging to an object corresponds to its area. Thus, if you have a labelled image (where all pixels belonging to one object have the same grey value, distinct from the grey value of pixels in other objects), then the histogram count for that grey value will correspond to the area of that object.
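As a plain-Python illustration of that idea (the tiny label image below is made up for the example; in practice it would come from your labelling step):

```python
from collections import Counter

# A hypothetical 3x5 labelled image: 0 is background, 1 and 2 are objects.
labels = [
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
]

# The histogram of the label values directly gives each object's area:
areas = Counter(v for row in labels for v in row if v != 0)
# areas[1] == 4 pixels, areas[2] == 3 pixels
```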

Hi Sir;

I read some papers, and Zenon Kulpa mentioned that area and perimeter measurements are related to the digitization scheme. I am trying to find the length of the interfacial area in a polymer mixture. I used a thresholding method for object extraction. Is the pixel counting method suitable for area measurement?

Is there any conclusion from research works about the relations between properties of discrete objects and their non-discrete originals? Or are Zenon Kulpa’s conclusions still valid?

Best Regards

Turi,

Of course the way that the continuous world is discretised influences the way measurements should be performed. In this post I assume point sampling. Most images you come across nowadays approach ideal point sampling. If you use thresholding to determine which pixels are part of your object, then you can use pixel counting as an unbiased estimate of area.

[…] The Chain Code algorithm (used by OpenCV’s findContours function): Seems pretty good as well, and parallel solutions do exist, but I worry it might fail if the stopping criterion is not good enough (see here, at the bottom near Jacob’s stopping criterion). However, it should be able to extract only the outer contour and give good approximations. […]

Hey Cris,

Isn’t the coefficient for ‘odd’ orientations 0.426 instead of 1.406? I’m referencing the equation on page 362 of Vossepoel and Smeulders. Let me know your thoughts if you get a chance; I’m trying to analyze microparticle images in which the particle radius is approximately 6 pixels.

Cheers! Connor 🙂

Connor,

That paper [Vossepoel and Smeulders, 1982] uses

*n* as the total number of chain codes, and *m* as the number of odd codes. They write a·n + b·m + c·n_c, where n_c is the corner count. So you add a for each even code, and a + b for each odd code. If you take the constants from the bottom row of Table 3, even codes get 0.980, and odd codes get 0.980 + 0.426 = 1.406.

Compare to the Euclidean case, where even codes get 1 and odd codes get √2. An optimized method should have similar values to those. If you were to give odd codes only 0.426, you’d be undervaluing their contribution significantly! Why would a diagonal step be shorter than a horizontal step?
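A quick numeric sanity check of that reading, in plain Python (the even/odd counts are arbitrary example values):

```python
# With n the total number of codes and m the number of odd codes,
# a*n + b*m contributes a per even code and a+b per odd code.
a, b = 0.980, 0.426
even, odd = 10, 6          # arbitrary example counts
n, m = even + odd, odd

lhs = a * n + b * m                    # the paper's formulation
rhs = even * a + odd * (a + b)         # per-code weights used in the post
assert abs(lhs - rhs) < 1e-12
# and a + b == 1.406, the odd-code weight quoted earlier
```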

Dear Cris

Thanks for the information provided. I have one question: can I modify the equation provided by Vossepoel and Smeulders so that I also consider the number of times the same chain code occurs consecutively?

Radha,

That’s an interesting thought, I hadn’t considered that before.

If you have a sequence of one repeated code, you have a straight line along one of the axes or at 45 degrees from an axis (e.g. 0,0,0,0,0 or 1,1,1,1,1). Taking these into account could improve quantification for shapes with straight edges that are aligned with the sampling grid (such as rectangles, diamond and octagon shapes). But for lines at any other orientation you would get more complicated sequences. These are difficult to identify.

For generic shapes with long straight edges at arbitrary angles, an approach could be to record the outline as a polygon (i.e. keep the pixel coordinates rather than encoding them as a chain code), and simplify this polygon using the Douglas–Peucker algorithm. The allowed error should then be half a pixel (so that sequences of chain codes like 0,1,0,1,0,1,0,1 or 3,4,3,4,3,4,3,4 become a single polygon edge). It is not clear to me that this would result in a correct object outline, but it *might* produce a more precise result. You’d have to test to make sure this is actually the case. Let us know if you do!

Cris, I modified the Vossepoel and Smeulders equation by considering the number of times the non-corner points occur.

perimeter = count(even)*0.980 + count(odd)*1.406 - count(corner)*0.091 + count(noncorner)*0.000132

where noncorner is the number of times the non-corner points occur. I have a graph which shows that the absolute error decreases with this modified equation. Is there any way I can attach the graph here?

Radhika,

Feel free to upload your graph to one of the free image hosting services (e.g. imgur.com) and link it here. Two things:

– Did you optimize this value based on the square shapes I use in this example here, or did you consider other shapes as well? I recommend you look through the Vossepoel and Smeulders paper to see how they did their optimization, and replicate that.

– Did you optimize only this new value, keeping the others constant? You should consider optimizing all 4 constants in your equation simultaneously.

One thing that just occurred to me is that count(noncorner) should be equal to total − count(corner), shouldn’t it? If so, this count cannot add any new information. I also notice that the new constant has zeros in the first three decimal places, which is the precision to which the other constants have been rounded. You might simply have recovered some of the precision lost by rounding the other constants.
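That dependence is easy to verify; a small plain-Python sketch, reusing the 16-code square chain from the comments above:

```python
# Every chain element is either a corner or a non-corner, so the two counts
# always sum to the chain length: count(noncorner) = total - count(corner),
# which means the noncorner term adds no independent information.
chain = [int(c) for c in "0000666644442222"]
prev = chain[-1:] + chain[:-1]  # chain rotated by one (cyclic comparison)
corner = sum(1 for a, b in zip(chain, prev) if a != b)
noncorner = sum(1 for a, b in zip(chain, prev) if a == b)
assert corner + noncorner == len(chain)
```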