logo

SCIENTIA SINICA Informationis, Volume 47, Issue 12: 1674-1693(2017) https://doi.org/10.1360/N112016-00156

Hierarchical complexity measures for effective shape-based image retrieval

Bin WANG1,2,*
More info
  • ReceivedDec 31, 2016
  • AcceptedMar 13, 2017
  • PublishedAug 30, 2017

Abstract

A novel descriptor, called HCMD, which is based on hierarchical complexity measures, is proposed for generic shape based image retrieval. HCMD belongs to region-based methods and iteratively partitions a shape into smaller blocks along various directions. The geometrical properties of these smaller blocks, which are derived from each iterative cut, are measured to form a hierarchical description for the shape. The descriptor has the ability to characterize a shape from coarse to fine, and can effectively capture its complex inner structural features. Shape matching based on HCMD is independent of the rotation, scaling, translation, and mirror transform of the shape. It has low computational complexity and can effectively handle both the contour and region shapes. Three standard test sets, namely, the MPEG-7 CE-1 contour shape database, MPEG-7 CE-2 region shape database, and COIL-20 database, are used to evaluate the performance of the proposed HCMD, and extensive comparisons with state-of-the-art approaches, including five region-based descriptors, four point-set based descriptors, and two curve-based descriptors, are conducted. All experimental results indicate that the proposed HCMD outperforms these approaches in terms of their comprehensive performance based on the retrieval rates, retrieval efficiency, and general applications.


Funded by

国家自然科学基金(61372158)

江苏省自然科学基金(BK20141487)

江苏省“333工程高层次人才工程(BRA2015351)

江苏省科技计划(产学研合作前瞻性联合研究)(BY2016009-03)

江苏高校优势学科建设工程(PA- PD)


References

[1] Loncaric S. A survey of shape analysis techniques. Pattern Recog, 1998, 31: 983--1001. Google Scholar

[2] Torres R D S, Falc$\tilde{\rm~~a}$o A X, Costa L D F. A graph-based approach for multiscale shape analysis. Pattern Recog, 2003, 37: 1163--1174. Google Scholar

[3] Costa L D F, Cesar R M. Shape Analysis and Classification: Theory and Practice. Boca Raton: CRC Press LLC, 2001. Google Scholar

[4] Wallace T P, Wintz P A. An efficient three-dimensional aircraft recognition algorithm using Fourier descriptors. Comput Graph Image Process, 1980, 13: 99--126. Google Scholar

[5] Persoon E, Fu K. Shape discrimination using Fourier descriptors. IEEE Trans Pattern Anal Machine Intell, 1986, 3: 388--397. Google Scholar

[6] Mokhtarian F, Abbasi S, Kittler J. Efficient and robust retrieval by shape content through curvature scale space. In: Image Databases and Multimedia Search. Singapore: World Scientific Publication, 1997. 51--58. Google Scholar

[7] Mokhtarian F, Bober M. Curvature Scale Space Representation: Theory, Application, and MPEG-7 Standardization. Dordrecht, Boston, and London: Kluwer Academic Publishers, 2003. Google Scholar

[8] Felzenszwalb P F, Schwartz J. Hierarchical matching of deformable shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, 2007. 1--8. Google Scholar

[9] Adamek T, ${\rm~~O}^{}$Connor N E. A multiscale representation method for nonrigid shapes with a single closed contour. IEEE Trans Circuits Syst Video Tech, 2004, 14: 742--753. Google Scholar

[10] Ling H, Jacobs D W. Shape classification using the inner-distance. IEEE Trans Pattern Anal Machine Intell, 2007, 29: 286--299. Google Scholar

[11] Alajlan N, Rube I E, Kamel M S, et al. Shape retrieval using triangle-area representation and dynamic space warping. Pattern Recog, 2007, 19: 1911--1920. Google Scholar

[12] Xu C, Liu J, Tang X. 2D shape matching by contour flexibility. IEEE Trans Pattern Anal Machine Intell, 2009, 31: 180--186. Google Scholar

[13] Xiong G Q, Qi D X, Guo F H. A class of orthonormal complete piecewise polygonal systems and applications thereof. Sci Sin Inform, 2012, 42: 70--82. Google Scholar

[14] Wang Z, Liang M. Locally affine invariant descriptors for shape matching and retrieval. IEEE Signal Process Lett, 2010, 17: 803--806. Google Scholar

[15] Biswas S, Aggarwal G, Chellappa R. An efficient and robust algorithm for shape indexing and retrieval. IEEE Trans Multimedia, 2010, 12: 372--384. Google Scholar

[16] Bai X, Yang X, Latecki L J, et al. Learning context-sensitive shape similarity by graph transduction. IEEE Trans Pattern Anal Machine Intell, 2010, 32: 861--874. Google Scholar

[17] Wang J, Bai X, You X, et al. Shape matching and classification using height functions. Pattern Recog Lett, 2012, 33: 134--143. Google Scholar

[18] Hu R, Jia W, Ling H, et al. Angular pattern and binary angular pattern for shape retrieval. IEEE Trans Image Process, 2014, 23: 1118--1127. Google Scholar

[19] Wang B, Gao Y. Hierarchical string cuts: a translation, rotation, scale, and mirror invariant descriptor for fast shape retrieval. IEEE Trans Image Process, 2014, 23: 4101--4111. Google Scholar

[20] Daliri M, Torre V. Robust symbolic representation for shape recognition and retrieval. Pattern Recog, 2008, 41: 1799--1815. Google Scholar

[21] Attalla E, Siy P. Robust shape similarity retrieval based on contour segmentation polygonal multiresolution and elastic matching. Pattern Recog, 2005, 38: 2229--2241. Google Scholar

[22] Super B. Retrieval from shape databases using chance probability functions and fixed correspondence. Int J Pattern Recogn Artif Intell, 2006, 20: 1117--1137. Google Scholar

[23] Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts. IEEE Trans Pattern Anal Machine Intell, 2002, 24: 509--522. Google Scholar

[24] Grigorescu C, Petkov N. Distance sets for shape filters and shape recognition. IEEE Trans Image Process, 2003, 12: 729--739. Google Scholar

[25] Wang B. Shape recognition using unordered point-set description and matching of object contour. J Softw, 2016, 27: 3131--3142. Google Scholar

[26] Peter A, Rangarajan A, Ho J. Shape L. Google Scholar

[27] Zhang D, Lu G. Review of shape representation and description techniques. Pattern Recog, 2004, 37: 1--19. Google Scholar

[28] Hu M K. Visual pattern recognition by moment invariants. IRE Trans Inf Theory, 1962, IT-8: 179--187. Google Scholar

[29] Khotanzad A, Hong Y H. Invariant image recognition in Zernike moments. IEEE Trans Pattern Anal Machine Intell, 1990, 12: 250--261. Google Scholar

[30] Raveendran P, Omatu S. Performance of an optimal subset of Zernike features for pattern classification. Inf Sci, 1993, 1: 133--147. Google Scholar

[31] Yap P T, Jiang X, Kot A C. Two-dimensional polar harmonic transforms for invariant image representation. IEEE Trans Pattern Anal Machine Intell, 2010, 32: 1259--1270. Google Scholar

[32] Persoon E, Fu K. Shape discrimination using Fourier descriptors. IEEE Trans SMC, 1977, 7: 170--179. Google Scholar

[33] Zhang D S, Lu G J. Shape based image retrieval using generic Fourier descriptor. Signal Process Image Commun, 2002, 17: 825--848. Google Scholar

[34] Deans S R. The Radon Transform and Some of its Applications. New York: John Wiley and Sons, 1983. Google Scholar

[35] Tabbone S, Wendling L, Salmon J P. A new shape descriptor defined on the Radon transform. Comput Vis Image Und, 2006, 102: 42--51. Google Scholar

[36] Hoang T V, Tabbone S. The generalization of the R-transform for invariant pattern representation. Pattern Recog, 2012, 45: 2145--2163. Google Scholar

[37] Chen Y W, Chen Y Q. Invariant description and retrieval of planar shapes using Radon composite features. IEEE Trans Signal Process, 2008, 56: 4762--4771. Google Scholar

[38] Yang M, Qiu G, Huang Y, et al. Near-duplicate image recognition and content-based image retrieval using adaptive geometric centroids. In: Proceedings of the 18th International Conference on Pattern Recognition, Hong Kong, 2006. 958--961. Google Scholar

[39] Sidiropoulos P, Vrochidis S, Kompatsiaris I. Content-based binary image retrieval using the adaptive hierarchical density histogram. Pattern Recog, 2011, 44: 739--750. Google Scholar

[40] Nene S A, Nayar S K, Murase H. Columbia Object Image Library (COIL-20). Department of Computer Science, Technical Report, Columbia University CUCS-005-96. 1996. Google Scholar

  • Figure 1

    (a) Five samples from the MPEG-7 CE-1 contour shape database and (b) the edge points extracted from them

  • Figure 2

    (a) Five samples from the MPEG-7 CE-2 region shape database and (b) the edge points extracted from them

  • Figure 3

    Classification of shape descriptors

  • Figure 4

    (a) A shape image distorted by salt pepper noise and (b) the contour extracted from it

  • Figure 5

    An example of hierarchical description framework. In the figure, a shape being rotated by angle $\theta$ counterclockwise is partitioned into five levels of blocks ($k=0,1,2,3,4$) by four iterative cuts, where the cut line is marked by grey color and the centroid of the current region block is marked by white dot

  • Figure 6

    An example of the complexity measures for a shape region block. Left figure: the measured shape region block (the grey line is the cut line crossing the centre of the shape region block and the rectangle shown in dash line is the bounding box of the shape region block; Right figure (upper): measuring the rectangle degree (Eq. (11)); Right figure (bottom): measuring the balance degree (Eq. (12)), the numbers (right-most) are the values of the complexities of the region blocks

  • Figure 7

    MPEG-7 CE-1 image dataset. (a) Samples from the dataset; (b) 20 samples from a class in the dataset

  • Figure 8

    MPEG-7 CE-2 image dataset. (a) Samples from the dataset; (b) 21 samples from a class in the dataset, where the shape images shown from row 2 to row 5 are the rotated, scaled and perspective transformed versions of the sample shown in the first row, respectively

  • Figure 9

    (a) 20 object samples from the COIL-20 database; (b) the binary images derived from the images shown in (a)

  • Figure 10

    (a) Example views of one object from the COIL-20 database; (b) the binary images derived from the images shown in (a)

  • Figure 11

    The curve of the recogniton error rate $E$ vs. the number $M$ of the uniformly selected views per object for training using the COIL-20 database

  • Figure 12

    An original shape sample (the left figure) and its versions distorted by salt and pepper noise (the right four figures with noise intensity of 0.2, 0.4, 0.6 and 0.8, respectively)

  • Table 1   Retrieval rates on the MPEG-7 CE-1 shape dataset $^{\rm~a)}$
    Algorithm Bulls-eye test score (%)
    Curve based Shape Tree [8] 87.70$^{\dag}$
    Locally affine invariant descriptors [14] 89.62$^{\dag}$
    Shape Contexts [23] 76.51$^{\dag}$
    Point-set based Distance Set [24] 78.38$^{\dag}$
    Shape L$$${\rm~\hat{A}}$ne Rouge [26] 85.25$^{\dag}$
    Unordered point-set description(UPSD) [25] 78.18
    Zernike Moments [29] 68.90
    Generic Fourier descriptors (GFD) [33] 67.60
    Region based Radon composite features (RCF) [37] 67.30
    Adaptive hierarchical density histogram (AHDH) [39] 63.95$^{\dag}$
    Polar harmonic transforms (PHTs) [31] 70.92
    The proposed HCMD 86.36
    a) The scores marked by $\dag$ are directly cited from the published results.
  •   

    Algorithm 1 Calculating hierarchical complexity measures descriptor (HCMD)

    Require: $(f_{x,y})_{G\times~H}$: the matrix of binary shape image; $K$: the number of hierarchical levels; $T$: the number of sampling angles from range $[0,2\pi]$;

    Output: $(\xi_{k,t})_{K\times~T}$, $(\zeta_{k,t})_{K\times~T}$, $(\eta_{k,t})_{K\times~T}$, $(\mu_{k,t})_{K\times~T}$: HCMD of shape $f$;

    Initialize $(\xi_{k,t})_{K\times~T}\leftarrow~0$ and $(\zeta_{k,t})_{K\times~T}\leftarrow~0$;

    $B\leftarrow~\{(x,y)|f_{x,y}=1\}$;

    $S=|B|$;

    $\overline{x}\leftarrow\frac{1}{S}\sum\nolimits_{(x,y)\in~B}x$; $\overline{y}\leftarrow\frac{1}{S}\sum\nolimits_{(x,y)\in~B}y$;

    $\overline{B}\leftarrow~\{(x-\overline{x},y-\overline{y})|(x,y)\in~B\}$;

    $\theta\leftarrow2\pi/T$;

    for $t=1$ to $T$

    $D\leftarrow\overline{B}$;

    for $k=1$ to $K$

    if $S>0$ then

    $D_x\leftarrow\{x\mid~(x,y)\in~D\}$;$D_y\leftarrow\{y\mid~(x,y)\in~D\}$;

    $x_r\leftarrow\max(D_x)$; $x_l\leftarrow\min(D_x)$; $y_r\leftarrow\max(D_y)$; $y_l\leftarrow\min(D_y)$;

    $\xi_{k,t}\leftarrow~S/((x_r-x_l+1)(y_r-y_l+1))$;

    $x_c\leftarrow\frac{1}{S}\sum\nolimits_{(x,y)\in~D}x$;

    $D_c\leftarrow\{(x,y)\mid~x<=x_c\bigwedge~(x,y)\in~D\}$;

    $S_L=|D_c|$;

    $\zeta_{k,t}\leftarrow~S_L/S$;

    $\eta_{k,t}=(x_r-x_l)/(y_r-y_l+1)$;

    $\mu_{k,t}=(x_r-x_c)/(x_c-x_l+1)$;

    $D\leftarrow~D_c$; $S\leftarrow~S_L$;

    else

    break;

    end if

    end for

    $\overline{B}\leftarrow~\{(x\cos\theta-y\sin\theta,y\cos\theta+x\sin\theta)|(x,y)\in~\overline{B}\}$;

    end for

    for $k=1$ to $K$

    $\eta_{k,1:T}\leftarrow\eta_{k,1:T}/{\rm~max}(\eta_{k,1:T})$;

    $\mu_{k,1:T}\leftarrow~\mu_{k,1:T}/{\rm~max}(\mu_{k,1:T})$;

    end for

    Return $(\xi_{k,t})_{K\times~T}$, $(\zeta_{k,t})_{K\times~T}$, $(\eta_{k,t})_{K\times~T}$ and $(\mu_{k,t})_{K\times~T}$;

  • Table 2   Retrieval rates on MPEG-7 CE-2 shape dataset
    Algorithm Bulls-eye test score (%)
    Point-set based Shape Contexts [23] 70.98
    Unordered point-set description (UPSD) [25] 84.13
    Zernike Moments [29] 80.20
    Generic Fourier descriptors (GFD) [33] 81.20
    Region based Radon composite features (RCF) [37] 67.40
    Adaptive hierarchical density histogram (AHDH) [39] 49.94
    Polar harmonic transforms (PHTs) [31] 64.13
    The proposed HCMD 94.52

Copyright 2019 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1