Introduction (1/4)
In the last 12-15 years the availability of digital visual
information has grown very quickly. Content-Based Image Retrieval
(CBIR) is a research area whose aim is the development of tools for
retrieving visual information by its perceptual content.
Slide 5
Introduction (2/4)
In image retrieval by sketch, the query is a stylized sketch drawn
by the user to specify the shape features she wants to find in the
images of the system's database. A sketch-based image retrieval
system has to deal with two main problems: inexact matching between
the sketch and the images, and segmentation.
Slide 6
Introduction (3/4)
Most of the methods and techniques for shape-based image retrieval
can be classified into three main categories: statistical
techniques, deformable template matching, and multiscale
representations.
Slide 7
Introduction (4/4)
We modified the Generalized Hough Transform (GHT) in two ways.
First, we spread the voting result in order to deal with small
local deformations without increasing the overall asymptotic space
and time complexity. Second, once the most likely position of the
sketch in the image has been localized using the votes in the
accumulator, the shape segmentation is further verified.
Slide 8
Methods
Slide 9
Canny edge detection
After edge detection, two filters are applied to the edge map. The
first filter deletes edge pixels surrounded by a disordered, thick
texture. The second filter deals with ordered textures (e.g., a
sheaf of parallel lines).
Slide 10
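The first filter
C(p): a square mask of $n_1 \times n_1$ pixels centered at pixel p.
N: the number of edge pixels in C(p).
$\varphi(p)$: the gradient direction of a generic edge pixel p.
We cancel the edge pixel p from the edge map if
$N > \theta_1 \wedge \sigma^2 > \theta_2$, where $\theta_1$ and
$\theta_2$ are two prefixed thresholds ($n_1 = 40$,
$\theta_1 = 260$, $\theta_2 = 0.165$). The definition of $\sigma^2$
is missing from the slide; it is taken here to be the variance of
the gradient directions within C(p).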
Slide 11
The second filter
Let N be the number of edge pixels p' belonging to the
$n_2 \times n_2$ mask D(p) and such that
$\varphi(p') = \varphi(p)$. We cancel p if $N > \theta_3$
($n_2 = 20$, $\theta_3 = 120$).
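The second filter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: directions are taken to be already quantized so that they can be compared for equality, the pixel p itself is not counted, the even-sized mask is placed with its top-left corner at p minus n2 // 2, and counts are computed on the unfiltered edge map. The function name `ordered_texture_filter` and its parameters are hypothetical.

```python
def ordered_texture_filter(edges, directions, n2=20, theta3=120):
    """Return a copy of `edges` without pixels that lie in ordered textures.

    edges:      2D list of booleans (True = edge pixel).
    directions: 2D list of quantized gradient directions (assumption).
    """
    height, width = len(edges), len(edges[0])
    half = n2 // 2
    kept = [row[:] for row in edges]
    for y in range(height):
        for x in range(width):
            if not edges[y][x]:
                continue
            count = 0
            # Scan the n2 x n2 mask D(p), clipped to the image borders.
            for yy in range(max(0, y - half), min(height, y - half + n2)):
                for xx in range(max(0, x - half), min(width, x - half + n2)):
                    if (yy, xx) == (y, x) or not edges[yy][xx]:
                        continue
                    if directions[yy][xx] == directions[y][x]:
                        count += 1
            # Cancel p when too many neighbors share its direction.
            if count > theta3:
                kept[y][x] = False
    return kept
```

With the values from the slide (n2 = 20, theta3 = 120), a pixel is removed only when more than 120 of the other edge pixels in its mask share its direction, which is typical of a sheaf of parallel lines.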
Slide 12
From now on, I denotes the edge map of the currently analyzed image
of the system's database after the salience filters have been
applied.
Slide 16
Deformation tolerant GHT (DTGHT)
Notation: I is the edge map; S is the user-drawn sketch; Seg is the
set of line segments of I; T is the R-Table; m is the cardinality
of T (m = #T = #S).
R-Table: if $p_k$ is a point of S, then $T[k] = p_r - p_k$, $p_r$
being the centroid of S.
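The R-Table definition can be illustrated with a minimal Python sketch. It assumes the sketch S is given as a list of (x, y) tuples; the function name `build_r_table` is hypothetical.

```python
def build_r_table(sketch_points):
    """Return T with T[k] = p_r - p_k, where p_r is the centroid of S."""
    m = len(sketch_points)
    # Centroid p_r of the sketch points.
    cx = sum(x for x, _ in sketch_points) / m
    cy = sum(y for _, y in sketch_points) / m
    # One displacement vector per sketch point, so #T = #S = m.
    return [(cx - x, cy - y) for x, y in sketch_points]
```

Adding T[k] to the kth sketch point always gives back the centroid, which is what lets each image point vote for a candidate centroid position.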
Slide 17
Deformation tolerant GHT (DTGHT)
$\varphi_I(p)$ and $\varphi_S[k]$ denote, respectively, the
direction of the point p in I and of $p_k$ in S. In order to
improve the accuracy, $\varphi_I(p)$ and $\varphi_S[k]$ are
computed using adjacent points in the same segment (the formula is
not recoverable from the slide). The formula uses a constant set to
10, and $p_j$ is the jth point in a given segment s (and
analogously for S).
Slide 18
Deformation tolerant GHT (DTGHT)
Nevertheless, we do not use $\varphi_S[k]$ to index T as in the
original GHT. We are looking for a shape S' contained in I which is
similar, but not necessarily identical, to S. Hence, we usually
expect that a point p in S and its corresponding point p' in S' may
be oriented quite differently.
Slide 19
Voting Procedure
We perform a vote operation analogous to the original GHT voting
phase, with an angular parameter set to $\pi/8$. Now we have a
voting result in space A.
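The voting formula itself did not survive on the slide, so the following Python sketch is only one plausible reading. It assumes that an image point p casts a vote at p + T[k] whenever its direction differs from the direction of the kth sketch point by at most the angular parameter (taken here to be pi/8), and that directions are undirected (period pi). The names `angle_diff`, `vote` and `delta` are hypothetical.

```python
import math
from collections import defaultdict


def angle_diff(a, b):
    """Smallest difference between two undirected orientations (period pi)."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def vote(image_points, image_dirs, r_table, sketch_dirs, delta=math.pi / 8):
    """Accumulate votes in a sparse accumulator A keyed by (x, y) cells."""
    accumulator = defaultdict(int)
    for (x, y), phi_i in zip(image_points, image_dirs):
        for (dx, dy), phi_s in zip(r_table, sketch_dirs):
            # Assumed rule: vote only if the orientations are compatible.
            if angle_diff(phi_i, phi_s) <= delta:
                accumulator[(round(x + dx), round(y + dy))] += 1
    return accumulator
```

Unlike the original GHT, nothing is looked up by direction in this sketch: every entry of T is tried for every image point, and the angular test only filters out clearly incompatible pairs.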
Slide 20
Cluster the Votes in A
We use a fixed vote dispersion window W. Let W be a square mask of
$(2l+1) \times (2l+1)$ cells (l is defined below). W(p) is the set
of all the nonzero cells of A contained in the mask W when its
center is positioned at p. The mass M(p) of W(p) is defined as the
sum of the values of the elements of W(p). The maximum of M(p)
corresponds to the mass of the region with the highest
concentration of votes.
Slide 21
Compute M(p)
M(p) is built incrementally using a technique similar to the
integral image. $W_i(p)$ represents the nonzero elements of the ith
column of the mask W(p).
Slide 22
Compute M(p)
Let now C(x, y) be the cumulative row sum computed with respect to
the yth column of A.
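The slide's own recurrence is not recoverable, so the sketch below swaps in a standard summed-area table (integral image) to show how every M(p) can be obtained in constant time per cell. It assumes A is a dense 2D list of vote counts and clips the window at the borders; summing all cells of the window gives the same result as summing only the nonzero ones. The function name `window_mass` is hypothetical.

```python
def window_mass(A, l):
    """Return M, where M[y][x] sums A over the (2l+1) x (2l+1) window at (x, y)."""
    h, w = len(A), len(A[0])
    # sat[y][x] holds the sum of A over rows 0..y-1 and columns 0..x-1.
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += A[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    M = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - l), min(h, y + l + 1)
        for x in range(w):
            x0, x1 = max(0, x - l), min(w, x + l + 1)
            # Four table lookups replace the sum over the whole window.
            M[y][x] = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return M
```

The cost is linear in the size of A and independent of l, which is why enlarging the dispersion window does not change the asymptotic cost.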
Slide 23
Compute M(p)
If $P = \arg\max_{p \in I} M(p)$, then P is, with high probability,
the point in I corresponding to the centroid of the shape most
similar to S. The deformation tolerance area delimits the region
within which points may vary with respect to $S_P$, so the
parameter l determines the size of the shape details that the
system ignores in the matching process. We set $l = \alpha d$,
where d is the diagonal of I and $\alpha = 0.03$ (in our trials
l = 12, which leads to a window side of 25 pixels).
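The choice of l and the selection of P can be worked through with a short Python sketch. Rounding to the nearest integer is an assumption (the slide does not say how l is discretized), the 320 x 240 image size used below is a hypothetical example chosen because its diagonal is exactly 400, and the function names are hypothetical.

```python
import math


def dispersion_half_side(width, height, alpha=0.03):
    """Return l = alpha * d, with d the image diagonal (rounded: assumption)."""
    d = math.hypot(width, height)
    return round(alpha * d)


def arg_max_mass(M):
    """Return the (row, column) of the largest value in the 2D list M."""
    cells = ((y, x) for y in range(len(M)) for x in range(len(M[0])))
    return max(cells, key=lambda p: M[p[0]][p[1]])


l = dispersion_half_side(320, 240)  # diagonal 400, so l = 12
window_side = 2 * l + 1             # 25 pixels, as on the slide
```

A larger alpha tolerates larger deformations but also lets unrelated votes from clutter merge into the same window, which is one reason the result is verified afterwards.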
Slide 24
Example of the system's output
Slide 25
Line segment matching
$S_P$ is the projection of S on I with P as its center of mass.
Thickly textured regions and cluttered backgrounds can randomly
concentrate their votes in a single point that does not actually
correspond to a shape S' similar to S.
Slide 26
Line segment matching: Extraneous vs. Valid Segments
A point p belonging to I is a valid point if P is a valid
hypothesis for p. We call a segment $s_i$ a valid segment if
$\#V_i \ge k_1 \times \#s_i$, where $k_1 = 0.7$ and $V_i$ is the
set of all the valid points of the segment $s_i$.
Slide 27
Line segment matching
A point p belonging to I is a nearby point if it satisfies a
proximity condition (the formula is not recoverable from the
slide). We call a segment $s_i$ an extraneous segment if $s_i$ is
not a valid segment and $\#N_i \ge k_2 \times \#s_i$, where
$k_2 = 0.2$ and $N_i$ is the set of all the nearby points of the
segment $s_i$. Let V be the subset of Seg composed of all the valid
segments. Let E be the subset of Seg composed of all the extraneous
segments.
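The segment classification can be sketched in Python as follows. Because the conditions defining valid and nearby points are not recoverable here, the sketch takes them as precomputed boolean flags per point; the use of "greater than or equal" in both tests is also an assumption. The function name `classify_segment` and the label "other" are hypothetical.

```python
def classify_segment(valid_flags, nearby_flags, k1=0.7, k2=0.2):
    """Label one segment as 'valid', 'extraneous' or 'other'.

    valid_flags[j] / nearby_flags[j] say whether the jth point of the
    segment is a valid / nearby point (computed elsewhere: assumption).
    """
    size = len(valid_flags)
    if sum(valid_flags) >= k1 * size:
        return "valid"
    # Extraneous segments must not be valid, hence the order of the tests.
    if sum(nearby_flags) >= k2 * size:
        return "extraneous"
    return "other"


def split_segments(segments):
    """Return (V, E): indices of the valid and of the extraneous segments."""
    labels = [classify_segment(v, n) for v, n in segments]
    V = [i for i, lab in enumerate(labels) if lab == "valid"]
    E = [i for i, lab in enumerate(labels) if lab == "extraneous"]
    return V, E
```

Segments that are neither valid nor extraneous end up in neither set.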
Slide 28
Matching Test
(The test formula is garbled in the source; only the terms "valid"
and "m true" survive.)
Slide 29
Similarity
Slide 30
Similarity rank
The DTGHT, like the original GHT, is neither rotation nor scale
invariant. In the offline preprocessing of each database image we
produce a pyramidal representation of I composed of 5 different
resolution levels.
Slide 31
Similarity rank
The final scale-invariant similarity estimation (SISim) between I
and S is given by a formula that is not recoverable from the slide.
We can assume that the user usually draws a sketch in its expected
orientation (e.g., a horizontal car or horse, a vertical tree), so
rotation invariance can often be ignored in order to speed up the
system.
Slide 32
Similarity rank
Slide 33
Results
Slide 34
Computational complexity
n is the number of edge pixels of I; $N_1 = w \times h$; m = #S;
N = #Seg; k is the number of scale iterations (N is bounded by n
and $N_1$).
The steps analyzed are: R-table construction, the voting phase,
finding the maximum of M, and the construction of the sets V and E
(Extraneous vs. Valid Segments) together with the Matching Test.
Slide 35
Computational complexity
The worst-case computational cost of the original GHT is
$O(h(nm + N_1))$, with h iterations for the different discrete
values of scale. From this comparison we can state that the DTGHT
and the GHT have the same asymptotic worst-case behavior. Moreover,
the DTGHT needs fewer iterations than the GHT to deal with the same
range of scale changes (i.e., k < h).
Slide 36
Experimental results
We implemented our method in non-optimized Java code and tested it
on a 1.7 GHz Pentium IV. Processing an image takes less than 2 s
(one second on average) for images from 200 x 200 up to 380 x 350
pixels. These times include the 5 iterations per image for the 5
corresponding image scale values; they do not include the
preprocessing.
Slide 37
Experimental results
The system's database is composed of 283 images randomly taken from
the Web. No manual segmentation has been performed on the images in
order to separate the interesting objects from their background or
from other adjacent or occluding objects. Lighting conditions and
noise level are also not fixed.
Slide 38
Experimental results
Slide 39
Experimental results: Comparison to other approaches
24 DTGHT, 15 do not apply scale iterations, using the object's
minimum enclosing rectangle to set the scale parameters. Dataset:
Kimia.
Slide 40
Experimental results: Comparison to other approaches
We obtained the second-best result. Our system is the only one
among those mentioned in Table 2 which can be reliably applied to
images containing occlusions and non-uniform backgrounds.
Slide 41
Experimental results: Comparison to other approaches
Caltech 101 dataset, composed of real images with significant
texture and clutter. Processing 160 images for a given query took
about 140 seconds, including 5 different scale iterations per
image.
Slide 42
Conclusion
Slide 43
Conclusion
The DTGHT is an effective technique to deal with the two main
problems in sketch-based image retrieval: image segmentation and
inexact matching. We have shown that inexact matching can be
realized using a large vote dispersion window and that a dynamic
programming approach makes this process efficient.
Slide 44
Conclusion
Segmentation is further obtained by comparing the sketch with the
candidate image lines. We have also shown that, unlike most
existing sketch-based image retrieval approaches, the DTGHT can
efficiently deal with images with cluttered backgrounds.