An Algorithm for Binary Image Segmentation Using Polygonal Markov Fields

Rafał Kluszczyński 1, Marie-Colette van Lieshout 2, and Tomasz Schreiber 1


1 Nicolaus Copernicus University, ul. Chopina 12-18, 87-100 Toruń, Poland
[email protected], [email protected]
2 CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
[email protected]

Abstract. We present a novel algorithm for binary image segmentation based on polygonal Markov fields. We recall and adapt the dynamic representation of these fields, and formulate image segmentation as a statistical estimation problem for a Gibbsian modification of an underlying polygonal Markov field. We discuss briefly the choice of Hamiltonian, and develop Monte Carlo techniques for finding the optimal partition of the image. The approach is illustrated by a range of examples.

1

Introduction

One of the fundamental image analysis tasks is that of segmentation, i.e. to partition the image into relatively homogeneous regions [10]. Indeed, segmenting the data is often the first step in image interpretation problems. The partition may be achieved at several conceptual levels. At the lowest level, that of individual pixels, segmentation amounts to classification of pixel values. At the other extreme, the focus of attention is on the objects that constitute a given image, and the goal is to extract them from the image. A myriad of segmentation methods has been proposed, from elementary thresholding through level set approaches and contour extraction methods to scene modelling.

In this paper, we propose to use polygonal field models. Thus, we place ourselves at the intermediate conceptual level that regards a segmentation as a coloured tessellation [8]. The advantage of such an approach is that, in contrast to pixel-based ones, global aspects of the image are captured. At the same time there is no need to model all objects in the image, which is feasible in restricted application domains only. Furthermore, a coloured polygonal tessellation is a reasonable and widely applicable mathematical formalisation of the intuitive concept of a segmentation, especially when compared to the rather artificial level set model or to the notion of a collection of pixel labels that do not necessarily have any spatial coherence. The idea can be traced back to Clifford and Middleton [6]; from a computational point of view, a Metropolis–Hastings style sampler was developed by Clifford and Nicholls [7] and applied to an image reconstruction problem within a Bayesian framework. We shall use a modification of the algorithm in Schreiber [12], which is conceptually and computationally easier.

F. Roli and S. Vitulano (Eds.): ICIAP 2005, LNCS 3617, pp. 383–390, 2005. © Springer-Verlag Berlin Heidelberg 2005

2

Preliminaries and Notation

We aim to describe the contents of an image domain D, assumed to be an open, convex and bounded subset of the plane (typically a rectangle), by means of a family of non-intersecting polygonal contours in D, possibly nested or chopped off by the boundary. We restrict ourselves to foreground/background segmentation, so that a contour may be interpreted as a polygonal boundary between black (foreground) and white (background) regions. We write ȳ for a discretised image and S ⊂ D for the pixel set. The value y_s at pixel s belongs to some finite set L. A collection of contours is denoted by γ. For each collection, there are exactly two admissible black-and-white colourings. We use a hat notation, γ̂, to describe a family of non-intersecting polygonal contours with its associated colouring. Our approach, as developed in [12] and [9], involves the following building blocks.

1. The first ingredient is, intuitively speaking, the ability to generate a 'completely random' polygonal field, both as a benchmark or reference field, and as a tool for exploring the space of admissible polygonal configurations without favouring any particular one. A reasonable choice is the so-called Arak process [1,2,3,4], denoted by A_D.
2. Secondly, we need a goodness-of-fit measure H(ȳ | γ̂) to quantify how well a coloured polygonal configuration γ̂ matches the data image ȳ. Moreover, we would like to be able to influence the geometry of γ̂ by assigning a higher probability to large polygons with smooth boundaries that do not have spurious edges. This is captured by a so-called regularisation term H(γ̂), cf. Section 3 in [9].
3. The third and last ingredient is an updating mechanism which keeps the distribution of the reference Arak field invariant while exhibiting good exploratory properties in the configuration space. To this end, we develop the so-called disagreement-loop birth and death algorithm, originally introduced in [12] and further extended in [9]. This mechanism can then be combined with standard Metropolis and simulated annealing techniques to find an optimal segmentation γ̂.

The next three sections will elaborate a little further on the above points. A full account can be found in [12,9].
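The interplay of the three ingredients is standard Metropolis simulated annealing. The following Python sketch is our own illustration of that generic loop only; all names and the toy energy are hypothetical, and the actual moves and Hamiltonian are those developed in the later sections.

```python
import math
import random

def anneal(energy, propose, init, n_steps, inv_temp):
    """Generic Metropolis simulated annealing skeleton.

    energy(state)  -> real-valued Hamiltonian H
    propose(state) -> candidate state (a symmetric proposal here)
    inv_temp(n)    -> inverse temperature 1/T_n, increasing in n
    """
    state, h = init, energy(init)
    for n in range(n_steps):
        cand = propose(state)
        h_cand = energy(cand)
        # Metropolis acceptance: accept with prob min(1, exp(-(h_cand - h)/T_n))
        if math.log(random.random()) < (h - h_cand) * inv_temp(n):
            state, h = cand, h_cand
    return state

# Toy usage: minimise (x - 3)^2 over the integers by +/-1 moves.
random.seed(0)
best = anneal(lambda x: (x - 3) ** 2,
              lambda x: x + random.choice((-1, 1)),
              init=0, n_steps=5000,
              inv_temp=lambda n: 0.01 * (n + 1))
print(best)
```

As the inverse temperature grows, uphill moves are rejected almost surely and the chain freezes near a minimiser; this is exactly the role the cooling schedule plays for the polygonal field sampler.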

3

The Arak Process: Dynamic Representation

The crucial idea underlying Arak's construction [1] is to interpret the polygonal boundaries of the field as the trace left by a particle travelling in two-dimensional time-space. Thus, the two-dimensional image domain D is seen as a set of time-space points (t, y) ∈ D, with t referred to as the time coordinate and with y standing for the (1D) spatial coordinate. In this language, the basic Arak process is constructed as follows.


Birth events. New particles are born in pairs at birth sites chosen in the interior of the domain D according to a homogeneous Poisson point process of intensity π. There are also boundary birth sites emitting single particles, generated by a Poisson point process with an appropriate intensity measure κ(·) concentrated on ∂D, whose analytic details we omit here for simplicity of presentation, referring the reader to [2] for an exhaustive description.

* Each interior birth site emits two particles, moving with initial velocities v' and v'' chosen according to the joint distribution θ(dv', dv'') := π^{-1} |v' - v''| (1 + v'^2)^{-3/2} (1 + v''^2)^{-3/2} dv' dv'' on v' < v''. This is equivalent to choosing the angle φ ∈ (0, π) between the straight lines representing the space-time trajectories of the emitted particles according to the density sin(φ)/2.
* Each boundary birth site x ∈ ∂D yields one particle with initial speed v determined according to an appropriate distribution θ_x(dv), see [2] for its explicit form.

The colour in the interior of the newly created angle is chosen so as not to clash with the one to the left of the trajectory (the past in time-space terms), with a minor modification for left-extreme points.

Evolution rules. All particles evolve independently in time according to the following rules.

* Between the critical moments listed below, each particle moves freely with constant velocity, hence dy = v dt.
* When a particle touches the boundary ∂D, it dies.
* In case of a collision of two particles (equal spatial coordinates y at some moment t with (t, y) ∈ D), both of them die.
* The time evolution of the velocity v_t of an individual particle is given by a pure-jump Markov process, so that P(v_{t+dt} ∈ du | v_t = v) = q(v, du) dt for the transition kernel q(v, du) := |u - v| (1 + u^2)^{-3/2} du.

The random polygonal configuration obtained in the above procedure is precisely the basic Arak process in D. As shown in [2], A_D enjoys a number of striking properties.
One of them is the two-dimensional germ Markov property, stating that the conditional distribution of the field inside an open bounded region with piecewise smooth boundary, given the outside configuration, depends only on the trace of this configuration on the boundary (colouring, intersection points and intersection directions). The next important property is consistency: for bounded open and convex D1 and D2 with D1 ⊆ D2, the restriction of A_{D2} to D1 coincides in distribution with A_{D1}, see Theorem 4.1 of [2]. A crucial property is the isometry invariance of the Arak process: while the translational invariance can be easily deduced from the construction above, the rotational invariance is a deep and non-trivial result that follows from the particular choice of the kernels θ·(·) and q(·, ·) above. Moreover, one-dimensional sections of A_D happen to be homogeneous Poisson point processes. Finally, a number of explicit formulae are available for various numeric characteristics of A_D, such as the expected total edge length, mean number of vertices, edges etc., see [2,4,7]. These properties suggest that the Arak process is a suitable reference field.
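The angle density sin(φ)/2 on (0, π) appearing in the birth events has the closed-form CDF F(φ) = (1 − cos φ)/2, so it can be sampled by inversion. The sketch below is our own illustration (the function name is hypothetical); it checks the sampler empirically against the exact mean π/2.

```python
import math
import random

def sample_birth_angle(rng):
    """Draw the angle phi in (0, pi) between the two trajectories
    emitted at an interior birth site, with density sin(phi)/2.
    Inverse-CDF method: F(phi) = (1 - cos phi)/2, hence
    phi = arccos(1 - 2U) for U uniform on (0, 1)."""
    return math.acos(1.0 - 2.0 * rng.random())

rng = random.Random(42)
draws = [sample_birth_angle(rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
# By the symmetry of sin about pi/2, the exact mean is pi/2 ~ 1.5708.
print(mean)
```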

4

Model-Based Image Segmentation

The model to be used for inference will be a Gibbsian modification of the polygonal random field A_D by means of a Hamiltonian (sometimes referred to as an energy function)

H(ȳ; γ̂) = H(ȳ | γ̂) + H(γ̂)    (1)

that is the sum of two terms, cf. Section 2. In other words, upon observation of ȳ, the likelihood of γ̂ with respect to the reference field is weighted by a factor exp[-H(ȳ; γ̂)], then normalised to have total probability mass 1. We take H(ȳ | γ̂) = L1(ȳ, γ̂), the sum of absolute differences between the actual pixel values and those assigned by γ̂. For binary images, the L1 distance reduces to |ȳ Δ γ̂|, the cardinality of the set of sites at which the observed colour does not match that of the polygon. This can be interpreted probabilistically as a random noise model in which each pixel value is flipped to the wrong colour independently of other pixels with some probability p < 1/2 (see [5,9] for details). Thus, finding an optimal γ̂* in the absence of further regularisation would amount to minimising the misclassification rate |ȳ Δ γ̂|/|S|. In general, γ̂* is not unique. Moreover, it tends to result in an over-segmentation. To overcome these problems, we added a regularisation term H(γ̂) = β ℓ(γ) proportional to the total edge length ℓ(γ). We are now ready to rephrase image segmentation as the task of finding a coloured polygonal configuration γ̂ and a real β so that the Hamiltonian value in (1) is a sufficiently good approximation of the global minimum.
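For a binary image stored as a 0/1 grid and a candidate colouring rasterised on the same grid, the Hamiltonian (1) can be evaluated directly. The sketch below is our own illustration, not the paper's implementation (which instead evaluates |ȳ Δ γ̂| via the divergence theorem, cf. Section 6); the edge length ℓ(γ) is supplied here as a precomputed number.

```python
def hamiltonian(y, gamma, edge_length, beta):
    """H(y; gamma) = |y delta gamma| + beta * edge_length, cf. (1):
    misclassified-pixel count plus length regularisation."""
    mismatches = sum(ys != gs
                     for row_y, row_g in zip(y, gamma)
                     for ys, gs in zip(row_y, row_g))
    return mismatches + beta * edge_length

# Toy 4x4 data image and a candidate colouring disagreeing in 2 pixels.
y     = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
gamma = [[0, 0, 1, 1],
         [0, 1, 1, 1],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
print(hamiltonian(y, gamma, edge_length=4.0, beta=0.5))  # 2 + 0.5 * 4 = 4.0
```

Increasing β penalises long, ragged contours more heavily, which is precisely how the regularisation term discourages over-segmentation.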

5

Disagreement Loop Birth and Death Dynamics

To minimise (1), we use simulated annealing. Briefly, given a sequence of temperatures T_n decreasing to zero as n → ∞, we sample from a probability density proportional to exp[-H(ȳ; γ̂)/T_n] with respect to the reference field. Clearly it suffices to describe how to sample for T_n = 1.

A crucial concept in our algorithm is that of a disagreement loop [11, Section 2.2]. Consider adding a new birth site x0 to configuration γ, and denote the resulting polygonal configuration by γ ⊕ x0. Then, for x0 ∈ Int D, the symmetric difference Δ⊕[x0; γ] := γ Δ [γ ⊕ x0] is just a single loop (a closed polygonal curve), possibly self-intersecting and possibly chopped off by the boundary. Likewise, a disagreement loop Δ⊖[x0; γ] arises by removing a birth site x0 in polygonal configuration γ; we write γ ⊖ x0 for the resulting configuration. This is discussed in detail in [11,12]; here we only give an illustration in Fig. 1. Recall that κ(·) is the boundary birth intensity measure as in Section 3. Then,

(DL:birth) with intensity [π dx + κ(dx)] ds set δ := γ_s Δ Δ⊕[x; γ_s] = γ_s ⊕ x. Choose either of the two admissible colourings with probability 1/2 to obtain δ̂. Then, with probability min(1, exp[H(ȳ; γ̂_s) − H(ȳ; δ̂)]) put γ̂_{s+ds} := δ̂, otherwise keep γ̂_{s+ds} := γ̂_s;


Fig. 1. Disagreement loop

(DL:death) for each birth site x in γ_s, with intensity 1 · ds set δ := γ_s Δ Δ⊖[x; γ_s] = γ_s ⊖ x. Choose either of the two admissible colourings with probability 1/2 to obtain δ̂. Then, with probability min(1, exp[H(ȳ; γ̂_s) − H(ȳ; δ̂)]) put γ̂_{s+ds} := δ̂, otherwise keep γ̂_{s+ds} := γ̂_s.

By Theorem 1 in [12] and Theorems 1–2 in [9], under these dynamics the current polygonal configuration converges in distribution to the Gibbsian modification of A_D with the Hamiltonian H(ȳ; γ̂). This algorithm exploits the fact that disagreement loops Δ⊕,⊖[x; γ] are very well suited for simulation. Since both the dynamic representation of the Arak process in Section 3 and the simulation algorithm presented here involve a notion of time, in the sequel we refer to the former as representation time (r-time) and to the latter as simulation time (s-time).

Extensions of the algorithm. In order to increase the efficiency of the basic algorithm we endow it with a number of extra Monte Carlo moves. These include

1. Random rotations of the spatial and time axes in the dynamic representation of the Arak process. While enlarging the set of possible moves, this does not alter the stationary distribution of the process due to the isometry invariance of A_D.
2. Repetitive velocity updates: a particle undergoing a velocity update in r-time is allowed to 'change its mind' in the course of s-time and to randomly modify the previously performed update. This can also be done so that the stationary distribution is unaltered.
3. Rescaling, to guarantee better resolution at later stages of annealing.
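The bookkeeping behind a disagreement loop is plain set algebra: if a configuration is stored as a set of edges, the loop is the symmetric difference between the configurations before and after a birth, and applying the same symmetric difference again undoes the move. The toy sketch below uses abstract edge labels rather than real geometry and is our own illustration, not the paper's data structures.

```python
def disagreement_loop(gamma_old, gamma_new):
    """Delta[x; gamma] := gamma_old  symmetric-difference  gamma_new,
    both stored as sets of edges."""
    return gamma_old ^ gamma_new

# Toy configurations: a birth replaces edge 'e2' by the three
# edges 't1', 't2', 't3' of a new small triangle.
gamma       = {"e1", "e2", "e3"}
gamma_birth = {"e1", "e3", "t1", "t2", "t3"}

loop = disagreement_loop(gamma, gamma_birth)
print(sorted(loop))  # ['e2', 't1', 't2', 't3']

# Applying the loop again reverses the move (death of the same site):
assert gamma_birth ^ loop == gamma
```

This involutive property is what makes the loops "very well suited for simulation": the Hamiltonian change of a proposal only has to be evaluated over the loop, and a rejected or reversed move is a single symmetric difference away.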

6

Implementation and Examples

The algorithm was implemented in C++. General features are discussed briefly below.

Representation of polygonal configurations. A configuration of a polygonal Markov field is represented as a list of labelled vertices. The full description of a vertex is provided by

– the Cartesian coordinates of the vertex;
– two pointers to the neighbouring vertices;


– the virtual lengths of the segments that emanate from the given vertex; these are the lengths the segments would have if the corresponding particles were the only ones present in the system and evolved in an empty environment. The actual lengths of these segments are usually smaller due to collisions.

The list of vertices is sorted by increasing x-coordinate (r-time coordinate).

Generation of the initial configuration. The initial configuration for our MCMC procedure is generated according to the dynamic representation of the basic Arak process as discussed in Section 3. This is done in a single left-to-right sweep through the image domain, by successively updating in r-time two priority queues that store respectively

– the birth sites with r-time coordinate exceeding the current r-time, and the virtual end points (with the distance from the respective initial point given by the corresponding virtual length) of segments generated so far, for which the r-time coordinate exceeds the current r-time;
– the virtual collision points, which are all possible pairwise intersection points of currently existing virtual segments (i.e. segments joining an initial point to its corresponding virtual end point) lying forward in r-time.

At each step of the algorithm, the next vertex to arise in the course of the r-time evolution is determined by choosing the vertex that minimises the r-time coordinate over both queues. Consequently, the contents of these queues can be regarded as a current collection of 'candidates' for the next vertex.

Updates. Our main configuration-modifying operations are adding a new birth site (disagreement loop birth (DL:birth)) and removing an existing one (disagreement loop death (DL:death)). To add a new birth site, we first choose its position uniformly at random within the image domain, and then we let the newborn particles evolve and interact with the existing ones according to the usual evolution rules.
Likewise, when removing a given birth site, we let the remaining particles obey the usual evolution rules. Both these updates are implemented using the same data structures as when generating the initial configuration above.

Evaluation of the Hamiltonian. For binary images ȳ, H(ȳ | γ̂) requires the evaluation of the number of misclassified pixels upon each update proposal. To this end we apply the divergence theorem, constructing a real-valued vector field such that the input image data ȳ coincides with its divergence, and then computing appropriate flux integrals along the suitably oriented contours of the polygonal field. For grey level images, we resort to Monte Carlo sampling to calculate the L1 distance between ȳ and γ̂.

Examples. Here we present a few typical segmentations obtained by the approach. The data in the first example consist of a spray-traced image of a happy face. For the cooling schedule we used 1/T_n := 20.0 + 0.009 · n. The result after approximately 1 150 000 steps is given in Fig. 2a. The misclassification rate we achieved was 3 percent, as visualised in the corresponding graph (Fig. 2b). Average execution time for a single iterate was 0.0061 seconds on a 2-CPU architecture with Intel Xeon 2.4 GHz processors and 2 GB RAM. The data of the second and third image are given by the first letters of the alphabet, 'A' and 'B'. The results under the same cooling regime are given in Figures 3a and 3b, and were achieved after 800 000 and 2 million iterations respectively. The misclassification rate is 3 percent in both cases. Finally, Fig. 3c presents a sample segmentation of a grey level eagle image from the Berkeley segmentation dataset and benchmark site. The misclassification rate reached after 860 000 iterations is again 3 percent.

Fig. 2. (a) segmented 'happy face'; (b) misclassification rate graph

Fig. 3. (a) segmented letter A; (b) segmented letter B; (c) segmented eagle
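The left-to-right sweep can be organised with Python's heapq module; for brevity the two priority queues are merged into one here, and the event stream is a toy illustration of the scheduling logic only (no actual geometry), not the paper's C++ code.

```python
import heapq

def sweep(events):
    """Process time-space events in increasing r-time (x-coordinate).
    Each event is (r_time, kind); real code would also carry geometry
    and would push new candidate events (virtual end points, virtual
    collision points) onto the queue while popping."""
    queue = []
    for ev in events:
        heapq.heappush(queue, ev)
    order = []
    while queue:
        r_time, kind = heapq.heappop(queue)  # minimal r-time over all candidates
        order.append(kind)
    return order

# Toy event stream: two births, one virtual collision, one boundary death.
events = [(0.7, "collision"), (0.2, "birth"), (0.9, "boundary-death"),
          (0.4, "birth")]
print(sweep(events))  # ['birth', 'birth', 'collision', 'boundary-death']
```

Because every queued entry is a 'candidate' rather than a confirmed vertex, events invalidated by an earlier collision would simply be discarded when popped; the heap guarantees that the next realised vertex is always the one with minimal r-time.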

7

Discussion and Future Work

Here, for brevity, we restricted ourselves to foreground/background segmentation. However, the general framework can easily be extended to allow for segmentation into k > 2 classes of grey level, colour, or textured images [9]. Suitable consistent polygonal field models are available with more flexibility in the type of intersections [4], but care is needed with respect to the Hamiltonian. Indeed, it is the object of our current work to develop software for these more complicated cases, and to evaluate the performance of our algorithm on benchmark data available at http://www.cs.berkeley.edu/projects/vision/grouping/segbench/ or http://mosaic.utia.cas.cz.


A preliminary conclusion is that, in contrast to model-based probabilistic image segmentation at the pixel level, the topology of the foreground object is better preserved. Our approach is also relatively robust with respect to noise. The price to pay is that fine details are not recovered, especially those whose sizes fall below the characteristic scale of the polygonal field. This problem could be addressed in a post-processing smoothing phase. An alternative would be to gradually decrease the characteristic scale of the field (multi-resolution approach), or to build local updates in the spirit of [7] into the algorithm.

Acknowledgements This research was supported by the EC 6th Framework Programme Priority 2 Information Society Technology Network of Excellence MUSCLE (Multimedia Understanding through Semantics, Computation and Learning; FP6-507752) and by the Foundation for Polish Science (FNP) and the Polish Minister of Scientiﬁc Research and Information Technology grant 1 P03A 018 28 (2005-2007).

References

1. Arak, T. (1982) On Markovian random fields with finite number of values. In 4th USSR–Japan Symposium on Probability Theory and Mathematical Statistics, Abstracts of Communications, Tbilisi.
2. Arak, T. and Surgailis, D. (1989) Markov fields with polygonal realisations. Probability Theory and Related Fields 80, 543–579.
3. Arak, T. and Surgailis, D. (1991) Consistent polygonal fields. Probability Theory and Related Fields 89, 319–346.
4. Arak, T., Clifford, P. and Surgailis, D. (1993) Point-based polygonal models for random graphs. Advances in Applied Probability 25, 348–372.
5. Baddeley, A.J. and van Lieshout, M.N.M. (1992) ICM for object recognition. In Computational Statistics, Y. Dodge and J. Whittaker (Eds.), volume 2, pp. 271–286. Physica/Springer, Heidelberg.
6. Clifford, P. and Middleton, R.D. (1989) Reconstruction of polygonal images. Journal of Applied Statistics 16, 409–422.
7. Clifford, P. and Nicholls, G.K. (1994) A Metropolis sampler for polygonal image reconstruction. Electronic version available at: http://www.stats.ox.ac.uk/~clifford/papers/met poly.html.
8. Hurn, M.A., Husby, O. and Rue, H. (2003) Advances in Bayesian image analysis. In Highly Structured Stochastic Systems, P.J. Green, S. Richardson and N.L. Hjort (Eds.), Oxford Statistical Science Series 27, 323–325. Oxford University Press, Oxford.
9. Kluszczyński, R., van Lieshout, M.N.M. and Schreiber, T. (2004) Image segmentation by polygonal Markov fields. Electronic version available as CWI Research Report PNA-R0409 at: http://www.cwi.nl/publications.
10. Rosenfeld, A. and Kak, A.C. (1982) Digital Picture Processing, second edition, volume 2. Academic Press, Orlando.
11. Schreiber, T. (2003) Mixing properties of polygonal Markov fields in the plane. Electronic version available at: http://www.mat.uni.torun.pl/preprints, 18-2003.
12. Schreiber, T. (2004) Random dynamics and thermodynamic limits for polygonal Markov fields in the plane. Electronic version available at: http://www.mat.uni.torun.pl/preprints, 17-2004.
