SCIENTIA SINICA Informationis, Volume 48, Issue 5: 521-530 (2018) https://doi.org/10.1360/N112018-00029

Label distribution learning and label enhancement

Xin GENG, Ning XU
  • Received: Feb 6, 2018
  • Accepted: Apr 11, 2018
  • Published: May 11, 2018


This paper introduces the concepts and algorithms of label distribution learning (LDL) and label enhancement (LE). LDL is a general machine learning paradigm that includes traditional single-label learning and multi-label learning as special cases. A label distribution covers a certain number of labels and represents the degree to which each label describes the instance. Thus, LDL has been successfully applied to many real-world problems. However, many existing datasets provide only simple logical labels rather than label distributions. One way to solve this problem is to transform the logical labels into label distributions by mining the latent label importance from the training examples; this transformation is defined as label enhancement. The paper gives formal definitions of LDL and LE, then briefly introduces and comparatively analyzes six representative LDL algorithms and four typical LE algorithms.
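To make the notion of a label distribution concrete, the following is a minimal Python sketch, not taken from the paper. It assumes the usual LDL convention that each description degree lies in [0, 1] and that the degrees of one instance sum to 1. The function names `is_label_distribution` and `logical_to_uniform_distribution` are hypothetical, and the example values are invented for illustration. The uniform split shown here only illustrates how single-label and multi-label annotations can be viewed as special cases; it mines no information from the training examples and is not one of the label enhancement algorithms the paper surveys.

```python
import numpy as np


def is_label_distribution(d, tol=1e-9):
    """Check the assumed LDL constraints: degrees in [0, 1] that sum to 1."""
    d = np.asarray(d, dtype=float)
    in_range = np.all(d >= -tol) and np.all(d <= 1.0 + tol)
    return bool(in_range and abs(d.sum() - 1.0) <= tol)


def logical_to_uniform_distribution(logical):
    """Spread unit mass evenly over the relevant labels of a 0/1 label vector.

    This is a naive baseline for illustration only, not a label
    enhancement algorithm from the paper.
    """
    logical = np.asarray(logical, dtype=float)
    total = logical.sum()
    if total == 0:
        raise ValueError("at least one label must be relevant")
    return logical / total


# Single-label case: all mass on one label.
single_label = logical_to_uniform_distribution([0, 1, 0, 0])

# Multi-label case: mass shared equally by the relevant labels.
multi_label = logical_to_uniform_distribution([1, 0, 1, 0])

# General case: every label describes the instance to some degree
# (values invented for this example).
general = np.array([0.5, 0.1, 0.3, 0.1])

for name, d in [("single", single_label), ("multi", multi_label), ("general", general)]:
    print(name, d, is_label_distribution(d))
```

A label enhancement method would replace the uniform split with degrees inferred from the training data, so that relevant labels can receive unequal weights.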

