SCIENCE CHINA Information Sciences, Volume 59, Issue 1: 012105(2016) https://doi.org/10.1007/s11432-015-5289-7

Nonnegative correlation coding for image classification

  • Received: Dec 9, 2014
  • Accepted: Jan 4, 2015
  • Published: May 18, 2015

Abstract

Feature coding is one of the most important procedures in the bag-of-features model for image classification. In this paper, we propose a novel feature coding method called nonnegative correlation coding. To obtain a discriminative image representation, our method employs two correlations: the correlation between features and visual words, and the correlation between the obtained codes. The first correlation reflects the locality of codes, i.e., visual words close to a local feature are activated more easily than distant ones. The second correlation characterizes the similarity of codes, meaning that similar local features are likely to have similar codes. Both correlations are modeled under a nonnegative constraint. Building on Nesterov's gradient projection algorithm, we develop an effective numerical solver to optimize the nonnegative correlation coding problem with guaranteed quadratic convergence. Comprehensive experimental results on publicly available datasets demonstrate the effectiveness of our method.
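The abstract does not state the objective function, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a hypothetical single-feature objective, min over c >= 0 of 0.5 * ||x - B c||^2 + lam * d^T c, where B is the codebook of visual words, d holds the squared distances from the feature x to each word (a stand-in for the locality correlation), and lam is an illustrative weight. The code-similarity correlation is omitted. The solver is a standard Nesterov-accelerated projected gradient loop, whose objective gap is known to decay at a rate of O(1/k^2) in the iteration count k.

```python
import numpy as np

def nonneg_locality_coding(x, B, lam=0.1, n_iter=200):
    """Accelerated projected gradient for an ASSUMED objective:
        min_{c >= 0} 0.5 * ||x - B c||^2 + lam * d^T c,
    with d[j] = ||x - B[:, j]||^2. The objective, `lam`, and the function
    name are illustrative assumptions, not taken from the paper.
    """
    # Squared distance from the feature to every visual word (locality weights).
    d = np.sum((B - x[:, None]) ** 2, axis=0)
    # Lipschitz constant of the smooth part's gradient: squared spectral norm of B.
    L = np.linalg.norm(B, 2) ** 2
    c = np.zeros(B.shape[1])
    y = c.copy()   # extrapolated point
    t = 1.0        # momentum parameter
    for _ in range(n_iter):
        grad = B.T @ (B @ y - x) + lam * d
        # Gradient step followed by projection onto the nonnegative orthant.
        c_next = np.maximum(y - grad / L, 0.0)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        # Nesterov extrapolation using the two most recent iterates.
        y = c_next + ((t - 1.0) / t_next) * (c_next - c)
        c, t = c_next, t_next
    return c
```

In this sketch the nonnegative constraint is enforced by the element-wise `np.maximum` projection, and the linear penalty `lam * d` makes distant visual words more expensive to activate than nearby ones.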


Acknowledgments

This work was supported in part by the National Basic Research Program of China (973) (Grant No. 2012CB720000), the National Natural Science Foundation of China (NSFC) (Grant No. 61203291), and the Specialized Research Fund for the Doctoral Program of Chinese Higher Education (Grant No. 20121101110035). The authors are grateful to Min YANG and Yang HE for helpful discussions.


