SCIENCE CHINA Information Sciences, Volume 61, Issue 1: 012101(2018) https://doi.org/10.1007/s11432-016-9009-6

## Robust sparse representation based face recognition in an adaptive weighted spatial pyramid structure

• Accepted Jan 10, 2017
• Published May 22, 2017

### Abstract

Sparse representation based classification methods have achieved strong performance in recent years. To fully exploit both the holistic and the local information of face samples, a series of sparse representation based methods in a spatial pyramid structure have been proposed. However, these methods still have some limitations. First, all the spatial patches are aggregated directly with the same weights, ignoring differences in patch reliability. Second, these methods are not robust to pose, expression, and misalignment variations, especially in under-sampled cases. In this paper, a novel method named robust sparse representation based classification in an adaptive weighted spatial pyramid structure (RSRC-ASP) is proposed. RSRC-ASP builds a spatial pyramid structure for sparse representation based classification with a self-adaptive weighting strategy for residual aggregation. In addition, three strategies, namely local-neighbourhood representation, a local intra-class Bayesian residual criterion, and a local auxiliary dictionary, are exploited to enhance the robustness of RSRC-ASP. Experiments on various data sets show that RSRC-ASP outperforms the classical sparse representation based classification methods, especially for under-sampled face recognition problems.
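The general idea of aggregating patch-wise residuals with per-patch weights can be illustrated with a minimal sketch. It is not the RSRC-ASP implementation: the abstract does not give the weighting formula, so the weight below (the relative gap between the two smallest class residuals) is a hypothetical stand-in, ridge regression replaces the $\ell_1$ sparse solver to keep the example dependency-free, and the names `pyramid_patches`, `class_residuals`, `classify`, `grids`, and `lam` are assumptions introduced for illustration.

```python
import numpy as np


def pyramid_patches(image, grids=(1, 2, 3)):
    """Return (row_slice, col_slice) regions for every pyramid level.

    Assumption: level g cuts the image into a g x g grid of
    non-overlapping cells; the paper's exact layout may differ.
    """
    h, w = image.shape
    regions = []
    for g in grids:
        rows = np.linspace(0, h, g + 1).astype(int)
        cols = np.linspace(0, w, g + 1).astype(int)
        for i in range(g):
            for j in range(g):
                regions.append((slice(rows[i], rows[i + 1]),
                                slice(cols[j], cols[j + 1])))
    return regions


def class_residuals(D, labels, y, lam=1e-2):
    """Code y over the columns of D and return per-class residuals.

    Ridge regression is used here in place of an l1 solver.
    """
    n = D.shape[1]
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    classes = np.unique(labels)
    res = np.array([
        np.linalg.norm(y - D[:, labels == c] @ alpha[labels == c])
        for c in classes
    ])
    return classes, res


def classify(train_images, labels, query, grids=(1, 2, 3)):
    """Classify `query` by a weighted sum of patch-wise class residuals."""
    labels = np.asarray(labels)
    total = None
    for rs, cs in pyramid_patches(query, grids):
        # Build a unit-norm dictionary from the same region of each training image.
        D = np.stack([img[rs, cs].ravel() for img in train_images], axis=1)
        D = D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)
        y = query[rs, cs].ravel()
        y = y / (np.linalg.norm(y) + 1e-12)
        classes, res = class_residuals(D, labels, y)
        # Hypothetical reliability weight: a patch whose best class clearly
        # beats the runner-up is trusted more (needs at least two classes).
        s = np.sort(res)
        weight = (s[1] - s[0]) / (s[1] + 1e-12)
        total = weight * res if total is None else total + weight * res
    return classes[np.argmin(total)]
```

Calling `classify(train_images, labels, query)` with equally sized 2-D arrays returns the label whose weighted aggregate residual is smallest. Replacing the hypothetical weight with a constant recovers the equal-weight aggregation that the abstract identifies as a limitation of earlier spatial pyramid methods.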

### Acknowledgment

This work was supported by National Natural Science Foundation of China (Grant Nos. 61333015, 61302127, 11326198), China Postdoctoral Science Foundation (Grant No. 2015M570228), Opening Foundation of Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin Key Projects in the National Science and Technology Pillar Program (Grant No. 14ZCZDGX00033), and International Fostering Plan of Selected Excellent Postdoctors Subsidized by Tianjin City (2015). Thanks for the academic visiting support of the China Scholarship Council.


• Figure 1

(Color online) Illustration of the ScSPM framework.

• Figure 2

(Color online) Illustration of the RSRC-ASP framework.

• Figure 3

(Color online) The representation coefficients over different patches.

• Figure 4

(Color online) The geometric analysis of RSRC-ASP.

• Figure 5

(Color online) The illustration of LNR strategy.

• Figure 6

(Color online) Different local residual criteria over an unreliable patch (each striped bar represents the residual criterion of the query image's corresponding subject).

• Figure 7

(Color online) The recognition results in Extended Yale B.

• Figure 8

(Color online) The recognition results in AR.

• Figure 9

(Color online) The recognition results in CMU PIE.

• Table 1   The recognition results (%) in Extended Yale B

| Gallery | 1 | 2 | 4 | 6 | 8 |
|---|---|---|---|---|---|
| ScSPM | 21.73 $\pm$ 2.78 | 41.31 $\pm$ 1.43 | 61.98 $\pm$ 1.21 | 74.78 $\pm$ 0.83 | 80.69 $\pm$ 0.72 |
| SRC | 24.00 $\pm$ 1.98 | 46.50 $\pm$ 1.21 | 73.30 $\pm$ 0.93 | 87.00 $\pm$ 0.84 | 92.40 $\pm$ 0.43 |
| LCRC | 43.30 $\pm$ 1.42 | 72.10 $\pm$ 1.14 | 95.00 $\pm$ 0.55 | 99.30 $\pm$ 0.16 | 99.80 $\pm$ 0.03 |
| SLF+RKR | 32.83 $\pm$ 1.93 | 48.60 $\pm$ 1.62 | 73.95 $\pm$ 0.93 | 90.56 $\pm$ 0.81 | 92.40 $\pm$ 0.44 |
| RLR | 43.61 $\pm$ 1.87 | 66.32 $\pm$ 1.74 | 88.52 $\pm$ 1.77 | 93.24 $\pm$ 1.66 | 94.60 $\pm$ 1.39 |
| RSRC-ASP(Non) | 45.30 $\pm$ 0.97 | 74.10 $\pm$ 0.89 | 96.50 $\pm$ 0.32 | 99.40 $\pm$ 0.05 | 99.90 $\pm$ 0.02 |
| RSRC-ASP(A) | 44.20 $\pm$ 1.59 | 75.35 $\pm$ 1.16 | 97.20 $\pm$ 0.43 | 99.40 $\pm$ 0.06 | 99.90 $\pm$ 0.05 |
| RSRC-ASP(B) | 56.50 $\pm$ 0.83 | 81.50 $\pm$ 0.81 | 97.60 $\pm$ 0.38 | 99.30 $\pm$ 0.07 | 99.50 $\pm$ 0.04 |
| RSRC-ASP(C) | 60.24 $\pm$ 1.23 | 85.67 $\pm$ 1.35 | 98.34 $\pm$ 0.55 | 99.30 $\pm$ 0.18 | 99.90 $\pm$ 0.02 |

• Table 2   The recognition results (%) in AR

| Gallery | 1 | 2 | 4 | 6 | 8 |
|---|---|---|---|---|---|
| ScSPM | 17.40 $\pm$ 2.83 | 31.56 $\pm$ 1.42 | 52.13 $\pm$ 0.94 | 64.97 $\pm$ 0.97 | 73.37 $\pm$ 0.82 |
| SRC | 25.01 $\pm$ 2.31 | 39.64 $\pm$ 1.38 | 56.59 $\pm$ 1.21 | 66.08 $\pm$ 0.84 | 72.56 $\pm$ 0.75 |
| LCRC | 34.00 $\pm$ 1.46 | 55.00 $\pm$ 1.55 | 75.93 $\pm$ 1.09 | 85.53 $\pm$ 0.76 | 88.07 $\pm$ 0.32 |
| SLF+RKR | 34.67 $\pm$ 1.35 | 59.76 $\pm$ 1.22 | 74.37 $\pm$ 0.96 | 82.70 $\pm$ 0.87 | 90.39 $\pm$ 0.44 |
| RLR | 41.25 $\pm$ 2.03 | 63.89 $\pm$ 1.99 | 81.83 $\pm$ 1.41 | 87.04 $\pm$ 1.23 | 92.53 $\pm$ 0.83 |
| RSRC-ASP(Non) | 36.13 $\pm$ 1.96 | 57.73 $\pm$ 1.41 | 78.20 $\pm$ 1.38 | 85.60 $\pm$ 0.86 | 89.13 $\pm$ 0.79 |
| RSRC-ASP(A) | 38.23 $\pm$ 2.58 | 59.87 $\pm$ 2.12 | 80.50 $\pm$ 1.93 | 86.13 $\pm$ 1.48 | 89.90 $\pm$ 1.12 |
| RSRC-ASP(B) | 42.47 $\pm$ 1.35 | 64.73 $\pm$ 1.21 | 85.07 $\pm$ 1.08 | 88.73 $\pm$ 0.83 | 91.73 $\pm$ 0.49 |
| RSRC-ASP(C) | 51.27 $\pm$ 1.82 | 69.07 $\pm$ 1.91 | 87.13 $\pm$ 1.50 | 90.73 $\pm$ 0.94 | 93.53 $\pm$ 0.66 |

• Table 3   The recognition results (%) in CMU PIE

| Gallery | 1 | 2 | 4 | 6 | 8 |
|---|---|---|---|---|---|
| ScSPM | 36.00 $\pm$ 2.01 | 60.10 $\pm$ 2.92 | 68.82 $\pm$ 2.66 | 74.15 $\pm$ 1.89 | 76.61 $\pm$ 2.03 |
| SRC | 38.35 $\pm$ 2.64 | 59.80 $\pm$ 2.03 | 72.25 $\pm$ 1.71 | 78.20 $\pm$ 1.96 | 81.05 $\pm$ 1.80 |
| LCRC | 48.50 $\pm$ 2.15 | 67.70 $\pm$ 3.03 | 77.65 $\pm$ 1.92 | 82.60 $\pm$ 1.77 | 84.25 $\pm$ 1.55 |
| SLF+RKR | 42.31 $\pm$ 3.22 | 61.49 $\pm$ 2.56 | 71.27 $\pm$ 1.95 | 80.54 $\pm$ 1.96 | 83.95 $\pm$ 1.41 |
| RLR | 60.87 $\pm$ 1.46 | 73.93 $\pm$ 1.25 | 83.37 $\pm$ 1.10 | 86.20 $\pm$ 0.85 | 88.25 $\pm$ 0.66 |
| RSRC-ASP(Non) | 50.45 $\pm$ 2.59 | 69.30 $\pm$ 2.11 | 79.90 $\pm$ 1.90 | 83.70 $\pm$ 1.49 | 86.50 $\pm$ 1.52 |
| RSRC-ASP(A) | 58.14 $\pm$ 1.85 | 76.33 $\pm$ 1.71 | 84.60 $\pm$ 1.41 | 88.26 $\pm$ 0.96 | 89.90 $\pm$ 0.77 |
| RSRC-ASP(B) | 56.50 $\pm$ 1.41 | 74.40 $\pm$ 1.34 | 83.10 $\pm$ 0.90 | 86.55 $\pm$ 0.65 | 88.45 $\pm$ 0.49 |
| RSRC-ASP(C) | 65.48 $\pm$ 1.31 | 75.85 $\pm$ 0.96 | 81.50 $\pm$ 0.85 | 85.13 $\pm$ 0.54 | 87.48 $\pm$ 0.48 |

• Table 4   Recognition results (%) in LFW

| Method | Recognition result | Method | Recognition result |
|---|---|---|---|
| SRC | 72.7 $\pm$ 2.25 | ScSPM | 56.3 $\pm$ 3.62 |
| LCRC | 74.2 $\pm$ 1.46 | SLF+RKR | 71.9 $\pm$ 1.35 |
| RLR | 74.3 $\pm$ 0.63 | RSRC-ASP(Non) | 74.3 $\pm$ 0.41 |
| RSRC-ASP(A) | 74.1 $\pm$ 0.22 | RSRC-ASP(B) | 75.6 $\pm$ 0.33 |
| RSRC-ASP(C) | 76.2 $\pm$ 0.18 | | |

• Table 5   Efficiency results (averaged running time per sample)

| Method | Efficiency (s) | Method | Efficiency (s) |
|---|---|---|---|
| SRC | 1.094 | ScSPM | 1.191 |
| SLF-RKR(L1) | 0.672 | RLR | 0.665 |
| RSRC-ASP(Non) | 0.423 | RSRC-ASP(A) | 1.342 |
| RSRC-ASP(B) | 0.581 | RSRC-ASP(C) | 0.695 |

• Table 6   The recognition results (%) for different local auxiliary dictionary sizes

| Method \ Auxiliary dictionary size | 50 | 100 | 200 | 300 | 500 |
|---|---|---|---|---|---|
| RSRC-ASP(C) | 85.44 $\pm$ 0.96 | 86.53 $\pm$ 0.73 | 86.92 $\pm$ 0.61 | 87.18 $\pm$ 0.55 | 87.10 $\pm$ 0.38 |

• Table 7   The recognition results (%) for different spatial pyramid structures

| Method \ Pyramid structure | $\{1\}$ | $\{1,2,3\}$ | $\{1,2,3,6\}$ | $\{6\}$ | $\{1,2,3,6,36\}$ |
|---|---|---|---|---|---|
| RSRC-ASP(Non) | 72.69 $\pm$ 1.89 | 86.90 $\pm$ 0.62 | 96.50 $\pm$ 0.32 | 94.86 $\pm$ 0.21 | 96.13 $\pm$ 0.11 |
| RSRC-ASP(A) | 72.69 $\pm$ 1.89 | 87.04 $\pm$ 0.88 | 97.20 $\pm$ 0.43 | 94.44 $\pm$ 0.18 | 96.27 $\pm$ 0.21 |
| RSRC-ASP(B) | 78.58 $\pm$ 0.65 | 89.00 $\pm$ 0.71 | 97.60 $\pm$ 0.38 | 95.81 $\pm$ 0.06 | 97.69 $\pm$ 0.18 |
| RSRC-ASP(C) | 82.85 $\pm$ 0.72 | 93.21 $\pm$ 0.21 | 98.34 $\pm$ 0.55 | 97.52 $\pm$ 0.13 | 98.43 $\pm$ 0.08 |

Copyright 2020 Science China Press Co., Ltd. All rights reserved.