The classification of hyperspectral images with a paucity of labeled samples is a challenging task. This paper superposes the class probability matrix and the weight matrix of an L1 graph to form a more discriminative DL1 graph. By further superposing a KNN graph onto the DL1 graph, the local information of the spatial domain is combined with the global information of the spectral domain, yielding a graph-based framework that fuses spatial and spectral information. The resulting DL1KNN graph reflects the more sophisticated structure of hyperspectral image data. Label propagation on this graph is then used for semi-supervised classification, improving the automatic classification accuracy of hyperspectral data when only a small number of samples is labeled. Experimental results show that the improvement in classification accuracy is significant when the percentage of labeled samples is 5%.
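The semi-supervised step is label propagation over the constructed graph. As a rough illustration only, the sketch below implements a standard local-and-global-consistency style propagation in Python; the weight matrix `W`, the parameter `alpha`, and the closed-form solution are generic choices for this family of methods, not details taken from this paper.

```python
import numpy as np

def propagate_labels(W, Y_l, n_labeled, alpha=0.99):
    """Graph-based label propagation (local/global consistency style).

    W         : (n, n) symmetric non-negative weight matrix of the graph
    Y_l       : (n_labeled, c) one-hot label matrix of the labeled samples
    n_labeled : number of labeled samples (assumed to come first)
    alpha     : propagation strength in (0, 1) -- an assumed default
    """
    n = W.shape[0]
    c = Y_l.shape[1]

    # Initial label matrix: labeled rows one-hot, unlabeled rows zero.
    Y = np.zeros((n, c))
    Y[:n_labeled] = Y_l

    # Symmetrically normalize the weights: S = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d[d == 0] = 1e-12
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ W @ D_inv_sqrt

    # Closed-form propagation: F = (I - alpha * S)^{-1} Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)

    # Predicted class = column with the largest score per sample.
    return F.argmax(axis=1)
```

With the superposition graph built by the algorithm described later, a call such as `propagate_labels(W3, Y_l, l)` would assign a class to every unlabeled pixel.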
National Natural Science Foundation of China (61461002)
Natural Science Foundation of Ningxia (NZ15105)
School-level Scientific Research Project of North Minzu University (JSKY06)
Graduate Innovation Project of North Minzu University (YCX1657)
Figure 1. (a) Pseudo-color image and (b) the four vegetation classes of the AVIRIS Indiana Pines subscene.
Figure 2. Comparison of overall classification accuracy.
Figure 3. Classification accuracy on the Indiana Pines subscene with 5% labeled samples and $k=5$: (a) L1 graph; (b) DL1 graph; (c) L1KNN graph; (d) DL1KNN graph.
Figure 4. Classification accuracy on the Indiana Pines subscene with 25% labeled samples and $k=5$: (a) L1 graph; (b) DL1 graph; (c) L1KNN graph; (d) DL1KNN graph.
Figure 5. Confusion matrices for the Indiana Pines subscene with 25% labeled samples and $k=5$. Class 1: Corn-notill; class 2: Grass-trees; class 3: Soybean-notill; class 4: Soybean-mintill. (a) L1 graph; (b) DL1 graph; (c) L1KNN graph; (d) DL1KNN graph.
Figure 6. (Color online) Overall accuracy and Kappa coefficient on the Indiana Pines subscene with 25% labeled samples and $k=5$.
Figure 7. (Color online) Per-class accuracy on the Indiana Pines subscene with 25% labeled samples and $k=5$.
Figure 8. (Color online) Omission error on the Indiana Pines subscene with 25% labeled samples and $k=5$.
Figure 9. (Color online) Classification accuracy (%) as the scale coefficient of the graph varies, under different percentages of labeled samples.
Labeled samples of Indiana Pines (%) | L1 graph | DL1 graph | L1KNN graph | DL1KNN graph
 | | | $k=5$ | $k=8$ | $k=10$ | Average | $k=5$ | $k=8$ | $k=10$ | Average
5 | 0.792 | 0.891 | 0.883 | 0.818 | 0.876 | 0.906
10 | 0.887 | 0.909 | 0.873 | 0.881 | 0.910 | 0.880
15 | 0.932 | 0.904 | 0.898 | 0.913 | 0.909 | 0.918
20 | 0.931 | 0.924 | 0.918 | 0.931 | 0.923 | 0.928
25 | 0.939 | 0.916 | 0.923 | 0.948 | 0.920 | 0.923
Input: a hyperspectral image with $l$ labeled samples $X_l=[x_1,x_2,\ldots,x_l]$, $u$ unlabeled samples $X_u=[x_{l+1},x_{l+2},\ldots,x_{l+u}]$, and the initial label matrix $Y_l\in\mathbb{R}^{l\times c}$;
Preprocess the samples: normalize each sample, $x_i=x_i/\Vert x_i\Vert_2$; when coding $x_i$, remove it to obtain the dictionary $X=[x_1,x_2,\ldots,x_{i-1},x_{i+1},\ldots,x_n]$;
Obtain the L1-graph weight matrix $W_{\rm L1}=\{W_{ij}\}_{n\times n}$ from Eq. (2);
Obtain the class probability matrix $\{P_{ij}\}_{n\times n}$ from Eqs. (4) and (5);
Obtain the weight matrix $W_{\rm DL1}$ of the DL1 graph according to Eq. (6);
Obtain the K-nearest-neighbor matrix $K=\{K_{ij}\}_{n\times n}$ from Eq. (7);
Superpose the DL1 graph and the KNN graph via Eq. (8) to obtain the superposition matrix $W_3$, with $\beta$ set to 0.2 according to the experiments;
Output: the superposition graph $G_3=(X,W_3)$.
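A minimal Python sketch of this construction is given below. Since Eqs. (2)-(8) are not reproduced in this excerpt, several details are assumptions: an ordinary Lasso solver stands in for the paper's l1 minimizer, the class probability matrix is built from the mass of each sample's sparse coefficients on the labeled samples of every class, the DL1 weights are taken as the L1 weights modulated by that same-class probability, the KNN graph uses 0/1 adjacency, and the superposition is taken as $W_3 = W_{\rm DL1} + \beta K$. Names such as `build_dl1knn_graph` and `lasso_alpha` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neighbors import NearestNeighbors

def build_dl1knn_graph(X, y_l, n_classes, k=5, beta=0.2, lasso_alpha=1e-3):
    """Sketch of the DL1KNN superposition graph.

    X    : (n, d) samples, the labeled samples occupying the first rows
    y_l  : (l,) integer labels of the labeled samples
    k    : number of nearest neighbors for the KNN graph
    beta : superposition coefficient (0.2 in the experiments)
    """
    n = X.shape[0]
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)          # normalize each sample

    # L1 graph: sparsely code each sample over all remaining samples (Eq. (2)).
    W_l1 = np.zeros((n, n))
    for i in range(n):
        idx = np.delete(np.arange(n), i)                        # drop x_i from the dictionary
        lasso = Lasso(alpha=lasso_alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(Xn[idx].T, Xn[i])                             # dictionary atoms are columns
        W_l1[i, idx] = np.abs(lasso.coef_)

    # Class probability matrix (assumed form of Eqs. (4) and (5)):
    # share of x_i's coefficient mass on labeled samples of each class,
    # then P_ij = probability that x_i and x_j belong to the same class.
    P_cls = np.zeros((n, n_classes))
    for c in range(n_classes):
        cols = np.where(y_l == c)[0]                            # labeled columns of class c
        P_cls[:, c] = W_l1[:, cols].sum(axis=1)
    P_cls /= np.maximum(P_cls.sum(axis=1, keepdims=True), 1e-12)
    P = P_cls @ P_cls.T                                         # (n, n) same-class probability

    # DL1 graph (assumed form of Eq. (6)): L1 weights modulated by P.
    W_dl1 = W_l1 * P

    # KNN graph (Eq. (7)): 0/1 adjacency of the k nearest neighbors.
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(Xn).kneighbors(Xn, return_distance=False)
    K = np.zeros((n, n))
    for i in range(n):
        K[i, nbrs[i, 1:]] = 1.0                                 # skip the sample itself

    # Superposition (assumed form of Eq. (8)): W3 = W_DL1 + beta * K.
    W3 = W_dl1 + beta * K
    return np.maximum(W3, W3.T)                                 # symmetrize
```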
Labeled samples (%) | $\beta=0$ | 0.1 | 0.2 | 0.25 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 | 10 | 100
5 | 74.4 | 84.4 | 85.5 | 84.9 | 85.3 | 85.2 | 83.2 | 82.4 | 82.8 | 83.2 | 84.4 | 83.0 | 80.5
10 | 87.3 | 86.3 | 88.4 | 87.0 | 86.6 | 84.9 | 83.0 | 82.7 | 82.6 | 82.4 | 82.5 | 84.9 | 84.2
15 | 86.8 | 88.8 | 88.8 | 88.0 | 88.9 | 86.8 | 85.1 | 84.1 | 84.2 | 83.5 | 83.8 | 81.9 | 83.3
20 | 87.1 | 90.6 | 90.6 | 90.9 | 89.7 | 89.5 | 88.1 | 87.2 | 87.9 | 87.6 | 87.9 | 90.6 | 91.7
25 | 88.0 | 89.0 | 93.0 | 90.0 | 88.0 | 89.0 | 88.5 | 88.7 | 87.0 | 87.5 | 86.7 | 92.3 | 89.3