Sparse representation based classification methods have achieved strong performance in recent years. To exploit both the holistic and the local information in face samples, a series of sparse representation based methods built on a spatial pyramid structure have been proposed. However, these methods still have two limitations. First, all spatial patches are aggregated with equal weights, which ignores differences in patch reliability. Second, they are not robust to variations in pose, expression, and alignment, especially in under-sampled cases. In this paper, a novel method named robust sparse representation based classification in an adaptive weighted spatial pyramid structure (RSRC-ASP) is proposed. RSRC-ASP builds a spatial pyramid structure for sparse representation based classification and uses a self-adaptive weighting strategy to aggregate residuals. In addition, three strategies, namely local-neighbourhood representation, a local intra-class Bayesian residual criterion, and a local auxiliary dictionary, are exploited to enhance the robustness of RSRC-ASP. Experiments on various data sets show that RSRC-ASP outperforms the classical sparse representation based classification methods, especially for under-sampled face recognition problems.
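As a rough illustration of the general idea summarized above (patch-wise sparse coding over a spatial pyramid, followed by a weighted aggregation of class residuals), the following minimal Python sketch may help. It is not the authors' implementation. The function names `split_patches`, `ista`, and `classify`, the regularization value `lam`, the grid levels, and in particular the margin-based patch weight are all assumptions made for illustration; the abstract does not state the actual weighting rule. A plain ISTA loop stands in for whatever $\ell_1$ solver the paper uses, and the three robustness strategies are not modelled.

```python
import numpy as np

def split_patches(img, grid):
    """Split a 2-D image into grid x grid non-overlapping patches (each flattened)."""
    rows = np.array_split(np.arange(img.shape[0]), grid)
    cols = np.array_split(np.arange(img.shape[1]), grid)
    return [img[np.ix_(r, c)].ravel() for r in rows for c in cols]

def ista(D, y, lam=0.01, n_iter=200):
    """Approximately minimise 0.5*||y - D x||^2 + lam*||x||_1 with plain ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))                         # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x

def classify(train_imgs, labels, query, levels=(1, 2, 3)):
    """Weighted aggregation of per-patch class residuals over pyramid levels."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    total = np.zeros(len(classes))
    for g in levels:
        train_patches = [split_patches(im, g) for im in train_imgs]
        for p, y in enumerate(split_patches(query, g)):
            # Dictionary for this patch position: one unit-norm column per training image.
            D = np.stack([tp[p] for tp in train_patches], axis=1).astype(float)
            D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
            y = y / (np.linalg.norm(y) + 1e-12)
            x = ista(D, y)
            # Class-wise reconstruction residuals, as in standard SRC.
            r = np.array([np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                          for c in classes])
            # ASSUMPTION: a hypothetical reliability weight, the relative gap between
            # the two smallest residuals. The paper's actual rule is not given here.
            s = np.sort(r)
            w = (s[1] - s[0]) / (s[1] + 1e-12)
            total += w * r
    return classes[np.argmin(total)]
```

Calling `classify(train_imgs, labels, query)` with a list of equally sized 2-D arrays, one label per training image, and at least two classes returns the predicted label. Patches whose best and second-best residuals are nearly equal receive a weight close to zero, which is one simple way to down-weight unreliable patches.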
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61333015, 61302127, 11326198), the China Postdoctoral Science Foundation (Grant No. 2015M570228), the Opening Foundation of Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, the Tianjin Key Projects in the National Science and Technology Pillar Program (Grant No. 14ZCZDGX00033), and the International Fostering Plan of Selected Excellent Postdoctors Subsidized by Tianjin City (2015). The authors also thank the China Scholarship Council for supporting the academic visit.
Figure 1
(Color online) Illustration of the ScSPM framework.
Figure 2
(Color online) Illustration of the RSRC-ASP framework.
Figure 3
(Color online) The representation coefficients over different patches.
Figure 4
(Color online) Geometric analysis of RSRC-ASP.
Figure 5
(Color online) Illustration of the LNR strategy.
Figure 6
(Color online) Different local residual criteria over an unreliable patch (each striped bar represents the residual criterion of the subject corresponding to the query image).
Figure 7
(Color online) Recognition results on Extended Yale B.
Figure 8
(Color online) Recognition results on AR.
Figure 9
(Color online) Recognition results on CMU PIE.
Gallery | 1 | 2 | 4 | 6 | 8 |
ScSPM | 21.73 $\pm$ 2.78 | 41.31 $\pm$ 1.43 | 61.98 $\pm$ 1.21 | 74.78 $\pm$ 0.83 | 80.69 $\pm$ 0.72 |
SRC | 24.00 $\pm$ 1.98 | 46.50 $\pm$ 1.21 | 73.30 $\pm$ 0.93 | 87.00 $\pm$ 0.84 | 92.40 $\pm$ 0.43 |
LCRC | 43.30 $\pm$ 1.42 | 72.10 $\pm$ 1.14 | 95.00 $\pm$ 0.55 | 99.30 $\pm$ 0.16 | 99.80 $\pm$ 0.03 |
SLF+RKR | 32.83 $\pm$ 1.93 | 48.60 $\pm$ 1.62 | 73.95 $\pm$ 0.93 | 90.56 $\pm$ 0.81 | 92.40 $\pm$ 0.44 |
RLR | 43.61 $\pm$ 1.87 | 66.32 $\pm$ 1.74 | 88.52 $\pm$ 1.77 | 93.24 $\pm$ 1.66 | 94.60 $\pm$ 1.39 |
RSRC-ASP(Non) | 45.30 $\pm$ 0.97 | 74.10 $\pm$ 0.89 | 96.50 $\pm$ 0.32 | 99.40 $\pm$ 0.05 | 99.90 $\pm$ 0.02 |
RSRC-ASP(A) | 44.20 $\pm$ 1.59 | 75.35 $\pm$ 1.16 | 97.20 $\pm$ 0.43 | | |
RSRC-ASP(B) | 56.50 $\pm$ 0.83 | 81.50 $\pm$ 0.81 | 97.60 $\pm$ 0.38 | 99.30 $\pm$ 0.07 | 99.50 $\pm$ 0.04 |
RSRC-ASP(C) | | | | 99.30 $\pm$ 0.18 | 99.90 $\pm$ 0.02 |
Gallery | 1 | 2 | 4 | 6 | 8 |
ScSPM | 17.40 $\pm$ 2.83 | 31.56 $\pm$ 1.42 | 52.13 $\pm$ 0.94 | 64.97 $\pm$ 0.97 | 73.37 $\pm$ 0.82 |
SRC | 25.01 $\pm$ 2.31 | 39.64 $\pm$ 1.38 | 56.59 $\pm$ 1.21 | 66.08 $\pm$ 0.84 | 72.56 $\pm$ 0.75 |
LCRC | 34.00 $\pm$ 1.46 | 55.00 $\pm$ 1.55 | 75.93 $\pm$ 1.09 | 85.53 $\pm$ 0.76 | 88.07 $\pm$ 0.32 |
SLF+RKR | 34.67 $\pm$ 1.35 | 59.76 $\pm$ 1.22 | 74.37 $\pm$ 0.96 | 82.70 $\pm$ 0.87 | 90.39 $\pm$ 0.44 |
RLR | 41.25 $\pm$ 2.03 | 63.89 $\pm$ 1.99 | 81.83 $\pm$ 1.41 | 87.04 $\pm$ 1.23 | 92.53 $\pm$ 0.83 |
RSRC-ASP(Non) | 36.13 $\pm$ 1.96 | 57.73 $\pm$ 1.41 | 78.20 $\pm$ 1.38 | 85.60 $\pm$ 0.86 | 89.13 $\pm$ 0.79 |
RSRC-ASP(A) | 38.23 $\pm$ 2.58 | 59.87 $\pm$ 2.12 | 80.50 $\pm$ 1.93 | 86.13 $\pm$ 1.48 | 89.90 $\pm$ 1.12 |
RSRC-ASP(B) | 42.47 $\pm$ 1.35 | 64.73 $\pm$ 1.21 | 85.07 $\pm$ 1.08 | 88.73 $\pm$ 0.83 | 91.73 $\pm$ 0.49 |
RSRC-ASP(C) |
Gallery | 1 | 2 | 4 | 6 | 8 |
ScSPM | 36.00 $\pm$ 2.01 | 60.10 $\pm$ 2.92 | 68.82 $\pm$ 2.66 | 74.15 $\pm$ 1.89 | 76.61 $\pm$ 2.03 |
SRC | 38.35 $\pm$ 2.64 | 59.80 $\pm$ 2.03 | 72.25 $\pm$ 1.71 | 78.20 $\pm$ 1.96 | 81.05 $\pm$ 1.80 |
LCRC | 48.50 $\pm$ 2.15 | 67.70 $\pm$ 3.03 | 77.65 $\pm$ 1.92 | 82.60 $\pm$ 1.77 | 84.25 $\pm$ 1.55 |
SLF+RKR | 42.31 $\pm$ 3.22 | 61.49 $\pm$ 2.56 | 71.27 $\pm$ 1.95 | 80.54 $\pm$ 1.96 | 83.95 $\pm$ 1.41 |
RLR | 60.87 $\pm$ 1.46 | 73.93 $\pm$ 1.25 | 83.37 $\pm$ 1.10 | 86.20 $\pm$ 0.85 | 88.25 $\pm$ 0.66 |
RSRC-ASP(Non) | 50.45 $\pm$ 2.59 | 69.30 $\pm$ 2.11 | 79.90 $\pm$ 1.90 | 83.70 $\pm$ 1.49 | 86.50 $\pm$ 1.52 |
RSRC-ASP(A) | 58.14 $\pm$ 1.85 | | | | |
RSRC-ASP(B) | 56.50 $\pm$ 1.41 | 74.40 $\pm$ 1.34 | 83.10 $\pm$ 0.90 | 86.55 $\pm$ 0.65 | 88.45 $\pm$ 0.49 |
RSRC-ASP(C) | 75.85 $\pm$ 0.96 | 81.50 $\pm$ 0.85 | 85.13 $\pm$ 0.54 | 87.48 $\pm$ 0.48 |
Method | Recognition result | Method | Recognition result |
SRC | 72.7 $\pm$ 2.25 | ScSPM | 56.3 $\pm$ 3.62 |
LCRC | 74.2 $\pm$ 1.46 | SLF+RKR | 71.9 $\pm$ 1.35 |
RLR | 74.3 $\pm$ 0.63 | RSRC-ASP(Non) | 74.3 $\pm$ 0.41 |
RSRC-ASP(A) | 74.1 $\pm$ 0.22 | RSRC-ASP(B) | 75.6 $\pm$ 0.33 |
RSRC-ASP(C) |
Method | Efficiency (s) | Method | Efficiency (s) |
SRC | 1.094 | ScSPM | 1.191 |
SLF-RKR(L1) | 0.672 | RLR | 0.665 |
RSRC-ASP(Non) | | RSRC-ASP(A) | 1.342 |
RSRC-ASP(B) | 0.581 | RSRC-ASP(C) | 0.695 |
Auxiliary dictionary's size
Method | 50 | 100 | 200 | 300 | 500 |
RSRC-ASP(C) | 85.44 $\pm$ 0.96 | 86.53 $\pm$ 0.73 | 86.92 $\pm$ 0.61 | 87.10 $\pm$ 0.38 |
Pyramid structure
Method | $\{1\}$ | $\{1,2,3\}$ | $\{1,2,3,6\}$ | $\{6\}$ | $\{1,2,3,6,36\}$ |
RSRC-ASP(Non) | 72.69 $\pm$ 1.89 | 86.90 $\pm$ 0.62 | 96.50 $\pm$ 0.32 | 94.86 $\pm$ 0.21 | 96.13 $\pm$ 0.11 |
RSRC-ASP(A) | 72.69 $\pm$ 1.89 | 87.04 $\pm$ 0.88 | 97.20 $\pm$ 0.43 | 94.44 $\pm$ 0.18 | 96.27 $\pm$ 0.21 |
RSRC-ASP(B) | 78.58 $\pm$ 0.65 | 89.00 $\pm$ 0.71 | 97.60 $\pm$ 0.38 | 95.81 $\pm$ 0.06 | 97.69 $\pm$ 0.18 |
RSRC-ASP(C) | 82.85 $\pm$ 0.72 | 93.21 $\pm$ 0.21 | 98.34 $\pm$ 0.55 | 97.52 $\pm$ 0.13 | 98.43 $\pm$ 0.08 |