
SCIENTIA SINICA Informationis, Volume 47, Issue 8: 1095-1108(2017) https://doi.org/10.1360/N112016-00278

User age prediction by combining classification and regression

  • Received: Mar 30, 2017
  • Accepted: Jun 8, 2017
  • Published: Aug 16, 2017

Abstract

Age classification and age regression are the two main approaches to age prediction, and each has its own advantages. Age classification can flexibly draw on well-established machine learning models, while the main advantage of age regression is its ability to capture the relationship between different ages. To exploit the advantages of both simultaneously, we propose a hybrid age prediction approach that combines classification and regression. First, we build separate long short-term memory (LSTM) models for age regression and age classification. Then, we linearly combine the outputs of the age classifier and the age regressor to obtain the final age prediction. Empirical evaluations demonstrate that the proposed hybrid model effectively improves age prediction performance.
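The linear combination step can be illustrated with a minimal sketch. The abstract does not state how the classifier's output is mapped to an age or how the combination weight is chosen, so everything here is an assumption made for illustration: the helper names `expected_age` and `combine`, the representative age per class, the probability-weighted mean, and the single weight `weight` are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def expected_age(probs: Sequence[float], class_ages: Sequence[float]) -> float:
    """Map class probabilities to one age via the probability-weighted mean.

    Assumption: each age class has a representative age (hypothetical values).
    """
    if len(probs) != len(class_ages):
        raise ValueError("probs and class_ages must have the same length")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(p * a for p, a in zip(probs, class_ages)) / total


def combine(reg_age: float, cls_age: float, weight: float = 0.5) -> float:
    """Linearly interpolate between the regressor and classifier estimates."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * reg_age + (1.0 - weight) * cls_age


# Hypothetical model outputs for one user (illustrative, not from the paper).
class_ages = [20.0, 30.0, 40.0]   # representative age of each class
probs = [0.25, 0.5, 0.25]         # classifier's probability per class
cls_age = expected_age(probs, class_ages)
reg_age = 34.0                    # regressor's direct age estimate
hybrid = combine(reg_age, cls_age, weight=0.5)
print(cls_age, hybrid)
```

In a sketch like this, `weight` would typically be tuned on held-out data; setting it to 1.0 recovers the pure regressor and 0.0 the pure classifier.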


Funded by

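National Natural Science Foundation of China (61331011)

National Natural Science Foundation of China (61375073)

National Natural Science Foundation of China (61672366)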


