SCIENCE CHINA Information Sciences, Volume 60, Issue 1: 012109(2017) https://doi.org/10.1007/s11432-015-0900-x

Phrase-based hashtag recommendation for microblog posts

  • Received May 15, 2016
  • Accepted Jul 4, 2016
  • Published Nov 17, 2016

Abstract

In microblogs, authors use hashtags to mark keywords or topics. These manually labeled tags can benefit various live social media applications (e.g., microblog retrieval and classification). However, because only a small portion of microblogs contain hashtags, recommending hashtags for microblogs is a worthwhile task. Human inference often relies on the intrinsic grouping of words into phrases, yet existing work uses only unigrams to model corpora. In this work, we propose a novel phrase-based topical translation model to address this problem. We use the bag-of-phrases model to better capture the underlying topics of posted microblogs. We regard the phrases and hashtags in a microblog as two different languages that describe the same content. Thus, the hashtag recommendation task can be viewed as a translation process from phrases to hashtags. To handle the topical information of microblogs, the proposed model treats the translation probability as topic specific. We test the methods on data collected from real-world microblogging services. The results demonstrate that the proposed method outperforms state-of-the-art methods that use the unigram model.
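The idea of topic-specific translation from phrases to hashtags can be illustrated with a minimal sketch. This is not the model from the paper, whose estimation procedure is not given in the abstract. It only assumes that a post is already segmented into phrases, that a topic distribution for the post is available, and that a table of topic-specific translation probabilities has been learned elsewhere. The function name `rank_hashtags`, the scoring rule (a uniform average over phrases), and all numbers are hypothetical.

```python
from collections import defaultdict


def rank_hashtags(phrases, topic_probs, translation):
    """Rank hashtags for one post (illustrative scoring rule, an assumption).

    score(h) = sum over topics z and phrases p of
               P(z | post) * P(h | p, z) / number_of_phrases

    phrases:     list of phrases extracted from the post
    topic_probs: dict mapping topic -> P(topic | post)
    translation: dict mapping (topic, phrase) -> {hashtag: P(hashtag | phrase, topic)}
    """
    if not phrases:
        return []
    scores = defaultdict(float)
    for topic, p_topic in topic_probs.items():
        for phrase in phrases:
            # Phrases with no entry for this topic contribute nothing.
            for hashtag, p_trans in translation.get((topic, phrase), {}).items():
                scores[hashtag] += p_topic * p_trans / len(phrases)
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up probabilities used only to exercise the function.
translation = {
    ("tech", "machine learning"): {"#ai": 0.7, "#datascience": 0.3},
    ("tech", "stock market"): {"#fintech": 0.6, "#ai": 0.4},
    ("finance", "stock market"): {"#stocks": 0.9, "#fintech": 0.1},
}
post_phrases = ["machine learning", "stock market"]
post_topics = {"tech": 0.8, "finance": 0.2}

ranking = rank_hashtags(post_phrases, post_topics, translation)
for hashtag, score in ranking:
    print(f"{hashtag}\t{score:.2f}")
```

Because the translation table is keyed by topic as well as phrase, the same phrase ("stock market" here) can favor different hashtags depending on the topic mixture of the post, which is the intuition behind treating translation probability as topic specific.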


Acknowledgments

This work was partially funded by the National Natural Science Foundation of China (Grant Nos. 61473092, 61472088, 61532011), the National High Technology Research and Development Program of China (Grant No. 2015AA015408), and the Shanghai Science and Technology Development Funds (Grant Nos. 13dz2260200, 13511504300).

