Lecture 5: Clustering and Segmentation – Part 1
Professor Fei-Fei Li, Stanford Vision Lab
10-Oct-11
What we will learn today
• Segmentation and grouping
  – Gestalt principles
• Segmentation as clustering
  – K-means
  – Feature space
• Probabilistic clustering (Problem Set 1, Q3)
  – Mixture of Gaussians, EM
Image Segmentation
• Goal: identify groups of pixels that go together
Slide credit: Steve Seitz, Kristen Grauman
The Goals of Segmentation
• Separate image into coherent “objects”
[Figure: an image and several human segmentations of it]
Slide credit: Svetlana Lazebnik
The Goals of Segmentation
• Separate image into coherent “objects”
• Group together similar-looking pixels for efficiency of further processing (“superpixels”)
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
Slide credit: Svetlana Lazebnik
Segmentation
• Compact representation for image data in terms of a set of components
• Components share “common” visual properties
• Properties can be defined at different levels of abstraction
General ideas (this lecture, #5)
• Tokens
  – whatever we need to group (pixels, points, surface elements, etc.)
• Bottom-up segmentation
  – tokens belong together because they are locally coherent
• Top-down segmentation
  – tokens belong together because they lie on the same visual entity (object, scene, ...)
• These two approaches are not mutually exclusive
What is Segmentation?
• Clustering image elements that “belong together”
  – Partitioning
     • Divide into regions/sequences with coherent internal properties
  – Grouping
     • Identify sets of coherent tokens in the image
Slide credit: Christopher Rasmussen
What is Segmentation?
Why do these tokens belong together?
Basic ideas of grouping in human vision
• Gestalt properties
• Figure-ground discrimination
Examples of Grouping in Vision
• Grouping video frames into shots
• Determining image regions
• Object-level grouping
• Figure-ground
What things should be grouped? What cues indicate groups?
Slide credit: Kristen Grauman
Similarity
Slide credit: Kristen Grauman
Symmetry
Slide credit: Kristen Grauman
Common Fate
Image credit: Arthus‐Bertrand (via F. Durand)
Slide credit: Kristen Grauman
Proximity
Slide credit: Kristen Grauman
Müller-Lyer Illusion
• Gestalt principle: grouping is key to visual perception.
The Gestalt School
• Grouping is key to visual perception
• Elements in a collection can have properties that result from relationships
  – “The whole is greater than the sum of its parts”
[Figure examples: illusory/subjective contours, occlusion, familiar configuration]
http://en.wikipedia.org/wiki/Gestalt_psychology
Slide credit: Svetlana Lazebnik
Gestalt Theory
• Gestalt: whole or group
  – Whole is greater than sum of its parts
  – Relationships among parts can yield new properties/features
• Psychologists identified a series of factors that predispose a set of elements to be grouped (by the human visual system)

“I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have ‘327’? No. I have sky, house, and trees.”
Max Wertheimer (1880-1943)
Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923. http://psy.ed.asu.edu/~classics/Wertheimer/Forms/forms.htm
Gestalt Factors
• These factors make intuitive sense, but are very difficult to translate into algorithms.
Image source: Forsyth & Ponce
Continuity through Occlusion Cues
Continuity through Occlusion Cues
Continuity, explanation by occlusion
Image source: Forsyth & Ponce
Continuity through Occlusion Cues
Image source: Forsyth & Ponce
Continuity through Occlusion Cues
Figure‐Ground Discrimination
The Ultimate Gestalt?
What we will learn today
• Segmentation and grouping
  – Gestalt principles
• Segmentation as clustering
  – K-means
  – Feature space
• Probabilistic clustering
  – Mixture of Gaussians, EM
• Model-free clustering
  – Mean-shift
Image Segmentation: Toy Example
[Figure: input image and its intensity histogram with three peaks: (1) black pixels, (2) gray pixels, (3) white pixels]
• These intensities define the three groups.
• We could label every pixel in the image according to which of these primary intensities it is closest to.
  – i.e., segment the image based on the intensity feature.
• What if the image isn’t quite so simple?
Slide credit: Kristen Grauman
[Figure: input images and their intensity histograms (pixel count vs. intensity)]
Slide credit: Kristen Grauman
[Figure: input image and its intensity histogram (pixel count vs. intensity)]
• Now how do we determine the three main intensities that define our groups?
• We need to cluster.
Slide credit: Kristen Grauman
[Figure: intensity histogram on a 0-255 axis with three clusters labeled 1, 2, 3; e.g., one center near intensity 190]
• Goal: choose three “centers” as the representative intensities, and label every pixel according to which of these centers it is nearest to.
• The best cluster centers are those that minimize the sum of squared distances (SSD) between all points and their nearest cluster center c_i:

$$\text{SSD} = \sum_{\text{clusters } i} \;\; \sum_{x \in \text{cluster } i} (x - c_i)^2$$

Slide credit: Kristen Grauman
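Given data X (an N-by-d matrix), centers (K-by-d), and per-point assignments labels, this objective can be evaluated directly in MATLAB (a sketch; the variable names are illustrative, not from the slides):

ssd = 0;
for i = 1:size(centers, 1)
    diffs = bsxfun(@minus, X(labels == i, :), centers(i, :));
    ssd = ssd + sum(diffs(:).^2);   % add squared distances for cluster i
end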
Clustering
• With this objective, it is a “chicken and egg” problem:
  – If we knew the group memberships, we could get the centers by computing the mean per group.
  – If we knew the cluster centers, we could allocate points to groups by assigning each to its closest center.
Slide credit: Kristen Grauman
K-Means Clustering
• Basic idea: randomly initialize the k cluster centers, and iterate between the two steps we just saw.
  1. Randomly initialize the cluster centers, c_1, ..., c_K
  2. Given cluster centers, determine points in each cluster
     • For each point p, find the closest c_i. Put p into cluster i
  3. Given points in each cluster, solve for c_i
     • Set c_i to be the mean of points in cluster i
  4. If the c_i have changed, repeat Step 2
• Properties
  – Will always converge to some solution
  – Can be a “local minimum”
     • Does not always find the global minimum of the objective function:

$$\text{SSD} = \sum_{\text{clusters } i} \;\; \sum_{x \in \text{cluster } i} (x - c_i)^2$$

Slide credit: Steve Seitz
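To make the alternation concrete, here is a minimal MATLAB sketch of the loop above (not from the original slides; the function name kmeans_simple and the empty-cluster guard are illustrative choices):

function [labels, centers] = kmeans_simple(X, K)
% Minimal K-means: X is an N-by-d data matrix, K the number of clusters.
N = size(X, 1);
idx = randperm(N);
centers = X(idx(1:K), :);                  % Step 1: random initial centers
labels = zeros(N, 1);
while true
    % Step 2: assign each point to its nearest center
    D = zeros(N, K);
    for i = 1:K
        D(:, i) = sum(bsxfun(@minus, X, centers(i, :)).^2, 2);
    end
    [~, newLabels] = min(D, [], 2);
    % Step 3: recompute each center as the mean of its assigned points
    for i = 1:K
        if any(newLabels == i)             % guard against empty clusters
            centers(i, :) = mean(X(newLabels == i, :), 1);
        end
    end
    % Step 4: stop when the assignments no longer change
    if isequal(newLabels, labels), break; end
    labels = newLabels;
end
end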
Segmentation as Clustering
[Figure: original image and its K-means segmentations for K=2 and K=3]

img_as_col = double(im(:));              % one intensity feature per pixel
cluster_membs = kmeans(img_as_col, K);   % cluster membership, 1..K

labelim = zeros(size(im));
for i = 1:K
    inds = find(cluster_membs == i);
    meanval = mean(img_as_col(inds));    % mean intensity of cluster i
    labelim(inds) = meanval;             % paint each pixel with its cluster mean
end

Slide credit: Kristen Grauman
K‐Means Clustering
• Java demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
K-Means++
• Can we prevent arbitrarily bad local minima?
  1. Randomly choose the first center.
  2. Pick each new center with probability proportional to D(p)^2, the squared distance of point p to its nearest existing center (its contribution to the total error).
  3. Repeat until k centers have been chosen.
• Expected error = O(log k) × optimal (Arthur & Vassilvitskii, 2007)
Slide credit: Steve Seitz
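A minimal MATLAB sketch of this D^2 seeding rule (the function name kmeanspp_seed is an illustrative choice, not from the slides):

function centers = kmeanspp_seed(X, K)
% K-means++ seeding: each new center is a data point drawn with
% probability proportional to its squared distance to the nearest
% already-chosen center.
N = size(X, 1);
centers = X(randi(N), :);                 % 1. first center uniformly at random
for j = 2:K
    D2 = inf(N, 1);
    for i = 1:size(centers, 1)
        D2 = min(D2, sum(bsxfun(@minus, X, centers(i, :)).^2, 2));
    end
    w = D2 / sum(D2);                     % 2. weights proportional to D(p)^2
    pick = find(rand <= cumsum(w), 1);    % inverse-CDF draw from the weights
    centers = [centers; X(pick, :)];      % 3. repeat until k centers
end
end

The seeded centers can then be handed to the usual K-means loop in place of the purely random initialization.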
Feature Space
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on intensity similarity
• Feature space: intensity value (1D)
Slide credit: Kristen Grauman
Feature Space
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on color similarity
[Figure: pixels plotted in 3D (R, G, B) space, e.g., (R=255, G=200, B=250), (R=245, G=220, B=248), (R=15, G=189, B=2), (R=3, G=12, B=2)]
• Feature space: color value (3D)
Slide credit: Kristen Grauman
Feature Space
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on texture similarity
[Figure: filter bank of 24 filters, F1, F2, ..., F24]
• Feature space: filter-bank responses (e.g., 24D)
Slide credit: Kristen Grauman
Smoothing Out Cluster Assignments
• Assigning a cluster label per pixel may yield outliers:
[Figure: original image vs. image labeled by each cluster center’s intensity, with three clusters (1, 2, 3) and scattered outlier pixels]
• How can we ensure the labels are spatially smooth?
Slide credit: Kristen Grauman
Segmentation as Clustering
• Depending on what we choose as the feature space, we can group pixels in different ways.
• Grouping pixels based on intensity+position similarity
[Figure: pixels plotted in (X, Y, Intensity) space]
  – A way to encode both similarity and proximity.
Slide credit: Kristen Grauman
K-Means Clustering Results
• K-means clustering based on intensity or color is essentially vector quantization of the image attributes
  – Clusters don’t have to be spatially coherent
[Figure: image, intensity-based clusters, color-based clusters]
Image source: Forsyth & Ponce
K-Means Clustering Results
• K-means clustering based on intensity or color is essentially vector quantization of the image attributes
  – Clusters don’t have to be spatially coherent
• Clustering based on (r,g,b,x,y) values enforces more spatial coherence (see the sketch below)
Image source: Forsyth & Ponce
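As a sketch of the (r,g,b,x,y) idea in MATLAB (the spatial weight lambda and the demo image are illustrative assumptions, not values from the slides):

im = im2double(imread('peppers.png'));    % H-by-W-by-3 RGB image
[H, W, ~] = size(im);
[Xp, Yp] = meshgrid(1:W, 1:H);            % pixel coordinates
lambda = 0.5;                             % weight of position vs. color
feats = [reshape(im, H*W, 3), ...         % r, g, b in [0, 1]
         lambda * Xp(:) / W, ...          % normalized x position
         lambda * Yp(:) / H];             % normalized y position
K = 5;
labels = kmeans(feats, K);                % one cluster label per pixel
labelim = reshape(labels, H, W);          % back to image layout
imagesc(labelim); axis image;             % visualize the segments

Increasing lambda makes position dominate, yielding more compact, spatially coherent regions; lambda = 0 reduces to pure color clustering.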
Summary: K-Means
• Pros
  – Simple, fast to compute
  – Converges to local minimum of within-cluster squared error
• Cons/issues
  – Setting k?
  – Sensitive to initial centers
  – Sensitive to outliers
  – Detects spherical clusters only
  – Assumes means can be computed
Slide credit: Kristen Grauman
What we will learn today
• Segmentation and grouping
  – Gestalt principles
• Segmentation as clustering
  – K-means
  – Feature space
• Probabilistic clustering (Problem Set 1, Q3)
  – Mixture of Gaussians, EM
Probabilistic Clustering
• Basic questions
  – What’s the probability that a point x is in cluster m?
  – What’s the shape of each cluster?
• K-means doesn’t answer these questions.
• Basic idea
  – Instead of treating the data as a bunch of points, assume that they are all generated by sampling a continuous function.
  – This function is called a generative model.
  – Defined by a vector of parameters θ
Slide credit: Steve Seitz
Mixture of Gaussians
• One generative model is a mixture of Gaussians (MoG)
  – K Gaussian blobs with means μ_b, covariance matrices V_b, dimension d
  – Blob b is defined by the parameters (μ_b, V_b):

$$p(x \mid \mu_b, V_b) = \frac{1}{\sqrt{(2\pi)^d \,|V_b|}} \exp\!\Big(-\tfrac{1}{2}(x - \mu_b)^\top V_b^{-1} (x - \mu_b)\Big)$$

  – Blob b is selected with probability (mixing weight) α_b
  – The likelihood of observing x is a weighted mixture of Gaussians:

$$p(x \mid \theta) = \sum_{b=1}^{K} \alpha_b \, p(x \mid \mu_b, V_b), \qquad \theta = (\mu_1, \ldots, \mu_K, V_1, \ldots, V_K, \alpha_1, \ldots, \alpha_K)$$

Slide credit: Steve Seitz
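A small MATLAB sketch evaluating this mixture density (the function name mog_pdf is an illustrative choice, not from the slides):

function p = mog_pdf(X, mu, V, alpha)
% X: N-by-d points; mu: K-by-d means; V: d-by-d-by-K covariances;
% alpha: K-by-1 mixing weights summing to 1. Returns N-by-1 densities.
[N, d] = size(X);
p = zeros(N, 1);
for b = 1:numel(alpha)
    diffs = bsxfun(@minus, X, mu(b, :));          % x - mu_b
    M = sum((diffs / V(:, :, b)) .* diffs, 2);    % (x-mu)' V^-1 (x-mu)
    coef = 1 / sqrt((2*pi)^d * det(V(:, :, b)));  % Gaussian normalizer
    p = p + alpha(b) * coef * exp(-0.5 * M);      % weighted mixture sum
end
end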
Expectation Maximization (EM)
• Goal
  – Find blob parameters θ that maximize the likelihood function:

$$L(\theta) = \prod_{i=1}^{N} p(x_i \mid \theta)$$

• Approach:
  1. E-step: given the current guess of the blobs, compute the ownership of each point
  2. M-step: given the ownership probabilities, update the blobs to maximize the likelihood function
  3. Repeat until convergence
Slide credit: Steve Seitz
EM Details
• E-step
  – Compute the probability that point x_i is in blob b, given the current guess of θ:

$$q_{ib} = \frac{\alpha_b \, p(x_i \mid \mu_b, V_b)}{\sum_{c=1}^{K} \alpha_c \, p(x_i \mid \mu_c, V_c)}$$

• M-step (over N data points)
  – Probability that blob b is selected: $\alpha_b = \frac{1}{N} \sum_i q_{ib}$
  – Mean of blob b: $\mu_b = \frac{\sum_i q_{ib}\, x_i}{\sum_i q_{ib}}$
  – Covariance of blob b: $V_b = \frac{\sum_i q_{ib}\,(x_i - \mu_b)(x_i - \mu_b)^\top}{\sum_i q_{ib}}$
Slide credit: Steve Seitz
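The two steps translate almost line for line into MATLAB. A minimal sketch for 1D data (illustrative, not the original slide code; normpdf is from the Statistics Toolbox):

function [mu, v, alpha] = em_mog_1d(X, K, iters)
% X: N-by-1 data column; K: number of blobs; iters: EM iterations.
N = numel(X);
idx = randperm(N);
mu = X(idx(1:K));                 % init means at K random data points
v = var(X) * ones(K, 1);          % init variances to the overall variance
alpha = ones(K, 1) / K;           % init with uniform mixing weights
for t = 1:iters
    % E-step: soft ownership q(i,b) of point i by blob b
    q = zeros(N, K);
    for b = 1:K
        q(:, b) = alpha(b) * normpdf(X, mu(b), sqrt(v(b)));
    end
    q = bsxfun(@rdivide, q, sum(q, 2));
    % M-step: re-estimate weights, means, variances from ownerships
    Nb = sum(q, 1)';              % effective number of points per blob
    alpha = Nb / N;
    for b = 1:K
        mu(b) = sum(q(:, b) .* X) / Nb(b);
        v(b) = sum(q(:, b) .* (X - mu(b)).^2) / Nb(b);
    end
end
end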
Applications of EM
• Turns out this is useful for all sorts of problems
  – Any clustering problem
  – Any model estimation problem
  – Missing data problems
  – Finding outliers
  – Segmentation problems
     • Segmentation based on color
     • Segmentation based on motion
     • Foreground/background separation
  – ...
• EM demo
  – http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html
Slide credit: Steve Seitz
Segmentation with EM
[Figure: original image and EM segmentation results for k = 2, 3, 4, 5]
Image source: Serge Belongie
Summary: Mixtures of Gaussians, EM
• Pros
  – Probabilistic interpretation
  – Soft assignments between data points and clusters
  – Generative model, can predict novel data points
  – Relatively compact storage
• Cons
  – Local minima
  – Initialization
     • Often a good idea to start with some k-means iterations.
  – Need to know number of components
     • Solutions: model selection (AIC, BIC), Dirichlet process mixture
  – Need to choose a generative model
  – Numerical problems are often a nuisance
What we have learned today
• Segmentation and grouping
  – Gestalt principles
• Segmentation as clustering
  – K-means
  – Feature space
• Probabilistic clustering (Problem Set 1, Q3)
  – Mixture of Gaussians, EM