logo

SCIENCE CHINA Information Sciences, Volume 61, Issue 11: 112103(2018) https://doi.org/10.1007/s11432-017-9334-3

Correlations multiplexing for link prediction in multidimensional network spaces

More info
  • ReceivedMay 16, 2017
  • AcceptedDec 28, 2017
  • PublishedJun 8, 2018

Abstract

In social networks, link establishment among users is affected by diversity correlations. In this paper, we study the formation of links, map correlations into multidimensional network spaces and apply their behavioral and structural features to the problem of link prediction. First, by exploring user behavioral correlation and network structural correlation, we map them into three network spaces: following space, interaction space and structure space. With a hierarchical process, the coupling relationship between the spaces can be reduced and we can analyze the correlation in each space separately. Second, by taking advantage of the latent Dirichlet allocation (LDA) topic model for dealing with the polysemy and synonym problems, the traditional text modeling method is improved by Gaussian weighting and applied to user behavior modeling. In this way, the expression ability of topics can be enhanced, and improved topic distribution of user behavior can be obtained to mine user correlations in the following space and the interaction space. Moreover, the method can be extended using the hidden naive Bayesian algorithm which is good at reducing attribute independence. By quantifying the dependencies between common neighbors, we can analyze user correlations in the structure space and multiplex the correlations of the other two spaces to link prediction. The experimental results indicate that the method can effectively improve link prediction performance.


References

[1] Liben-Nowell D, Kleinberg J. The link-prediction problem for social networks. J Am Soc Inf Sci, 2007, 58: 1019-1031 CrossRef Google Scholar

[2] Getoor L, Diehl C P. Link mining: a survey. ACM SIGKDD Explor Newslett, 2005, 7: 3--12. Google Scholar

[3] Lü L, Zhou T. Link prediction in complex networks: A survey. Physica A-Statistical Mech its Appl, 2011, 390: 1150-1170 CrossRef ADS arXiv Google Scholar

[4] Gong N Z, Talwalkar A, Mackey L, et al. Joint link prediction and attribute inference using a social-attribute network. ACM Trans Intell Syst Technol, 2014, 5: 1--20. Google Scholar

[5] Ding J, Jiao L, Wu J. Prediction of missing links based on community relevance and ruler inference. Knowledge-Based Syst, 2016, 98: 200-215 CrossRef Google Scholar

[6] Wang H H, Raza A A, Lin Y B, et al. Behavior analysis of low-literate users of a viral speech-based telephone service. In: Proceedings of the 4th Annual Symposium on Computing for Development, Cape Town, 2013. Google Scholar

[7] Agarwal V, Bharadwaj K K. A collaborative filtering framework for friends recommendation in social networks based on interaction intensity and adaptive user similarity. Soc Netw Anal Min, 2013, 3: 359-379 CrossRef Google Scholar

[8] Mart'ınez V, Cano C, Blanco A. ProphNet: a generic prioritization method through propagation of information. BMC Bioinformatics, 2014, 15: 1506--1526. Google Scholar

[9] Wang J, Sun J, Lin H. Convolutional neural networks for expert recommendation in community question answering. Sci China Inf Sci, 2017, 60: 110102 CrossRef Google Scholar

[10] Ermi? B, Acar E, Cemgil A T. Link prediction in heterogeneous data via generalized coupled tensor factorization. Data Min Knowl Disc, 2015, 29: 203-236 CrossRef Google Scholar

[11] Sherkat E, Rahgozar M, Asadpour M. Structural link prediction based on ant colony approach in social networks. Physica A-Statistical Mech its Appl, 2015, 419: 80-94 CrossRef ADS Google Scholar

[12] Hannon J, Bennett M, Smyth B. Recommending twitter users to follow using content and collaborative filtering approaches. In: Proceedings of the 4th ACM Conference on Recommender Systems, Barcelona, 2010. 199--206. Google Scholar

[13] Jiang B, Liang J G, Sha Y, et al. Domain dictionary-based topic modeling for social text. In: Proceedings of International Conference on Web Information Systems Engineering, Shanghai, 2016. 109--123. Google Scholar

[14] Wagner C, Singer P, Strohmaier M. Semantic Stability and Implicit Consensus in Social Tagging Streams. IEEE Trans Comput Soc Syst, 2014, 1: 108-120 CrossRef Google Scholar

[15] Nguyen L, Vo B, Hong T P, et al. Interestingness measures for classification based on association rules. In: Proceedings of International Conference on Computational Collective Intelligence, Ho Chi Minh City, 2012. 383--392. Google Scholar

[16] Wang P, Xu B W, Wu Y R, et al. Link prediction in social networks: the state-of-the-art. Sci China Inf Sci, 2015, 58: 011101. Google Scholar

[17] Blei D M, Ng A Y, Jordan M I. Latent dirichlet allocation. J Mach Learn Res, 2003, 3: 993--1022. Google Scholar

[18] Cho Y S, Ver Steeg G, Ferrara E, et al. Latent space model for multi-modal social data. In: Proceedings of the 25th International Conference on World Wide Web, Montréal, 2016. 447--458. Google Scholar

[19] Bliss C A, Frank M R, Danforth C M. An evolutionary algorithm approach to link prediction in dynamic social networks. J Comput Sci, 2014, 5: 750-764 CrossRef Google Scholar

[20] Sett N, Ranbir Singh S, Nandi S. Influence of edge weight on node proximity based link prediction methods: An empirical analysis. Neurocomputing, 2016, 172: 71-83 CrossRef Google Scholar

[21] Liu Y C, Tong H H, Xie L, et al. Supervised link prediction using random walks. In: Proceedings of Chinese National Conference on Social Media Processing, Guangzhou, 2015. 107--118. Google Scholar

[22] Gong J, Gao X, Cheng H. Integrating a weighted-average method into the random walk framework to generate individual friend recommendations. Sci China Inf Sci, 2017, 60: 110104 CrossRef Google Scholar

[23] Wang T, Krim H, Viniotis Y. A Generalized Markov Graph Model: Application to Social Network Analysis. IEEE J Sel Top Signal Process, 2013, 7: 318-332 CrossRef ADS Google Scholar

[24] Mossel E, Sly A, Tamuz O. Asymptotic learning on Bayesian social networks. Probab Theor Relat Fields, 2014, 158: 127-157 CrossRef Google Scholar

[25] Li F, He J, Huang G. Node-coupling clustering approaches for link prediction. Knowledge-Based Syst, 2015, 89: 669-680 CrossRef Google Scholar

[26] Martínez V, Berzal F, Cubero J C. Adaptive degree penalization for link prediction. J Comput Sci, 2016, 13: 1-9 CrossRef Google Scholar

[27] Pizzato L, Rej T, Akehurst J. Recommending people to people: the nature of reciprocal recommenders with a case study in online dating. User Model User-Adap Inter, 2013, 23: 447-488 CrossRef Google Scholar

[28] Scellato S, Noulas A, Mascolo C. Exploiting place features in link prediction on location-based social networks. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, 2011. 1046--1054. Google Scholar

[29] Shahmohammadi A, Khadangi E, Bagheri A. Presenting new collaborative link prediction methods for activity recommendation in Facebook. Neurocomputing, 2016, 210: 217-226 CrossRef Google Scholar

[30] Liu J, Xu B, Xu X. A link prediction algorithm based on label propagation. J Comput Sci, 2016, 16: 43-50 CrossRef Google Scholar

[31] Althoff T, Jindal P, Leskovec J. Online actions with offline impact: how online social networks influence online and offline user behavior. In: Proceedings of the 10th ACM International Conference on Web Search and Data Mining, Portland, 2016. 537--546. Google Scholar

[32] Li L, He J P, Wang M, et al. Trust agent-based behavior induction in social networks. IEEE Intell Syst, 2016, 31: 24--30. Google Scholar

[33] Cha Y, Cho J. Social-network analysis using topic models. In: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, 2012. 565--574. Google Scholar

[34] Chang J, Blei D M. Relational topic models for document networks. AIStats, 2009, 9: 81--88. Google Scholar

[35] Liangxiao Jiang , Zhang H, Zhihua Cai H. A Novel Bayes Model: Hidden Naive Bayes. IEEE Trans Knowl Data Eng, 2009, 21: 1361-1371 CrossRef Google Scholar

[36] Jiang L, Zhang H, Cai Z. Weighted average of one-dependence estimators?. J Exp Theor Artificial Intelligence, 2012, 24: 219-230 CrossRef Google Scholar

[37] Jiang L, Wang S, Li C. Structure extended multinomial naive Bayes. Inf Sci, 2016, 329: 346-356 CrossRef Google Scholar

[38] Zhao F, Zhu Y, Jin H. A personalized hashtag recommendation approach using LDA-based topic model in microblog environment. Future Generation Comput Syst, 2016, 65: 196-206 CrossRef Google Scholar

[39] De Domenico M, Lima A, Mougel P. The Anatomy of a Scientific Rumor. Sci Rep, 2013, 3: 2980 CrossRef PubMed ADS arXiv Google Scholar

[40] Waitelonis J, Exeler C, Sack H. Linked data enabled generalized vector space model to improve document retrieval. In: Proceedings of NLP& DBpedia 2015 Workshop at the 14th International Semantic Web Conference CEUR-WS, Bethlehem, 2015. Google Scholar

[41] Jiang B, Liang J G, Sha Y, et al. Retweeting behavior prediction based on one-class collaborative filtering in social networks. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pisa, 2016. 977--980. Google Scholar

  • Figure 1

    (Color online) Link prediction framework.

  • Figure 2

    (Color online) Multidimensional network spaces.

  • Figure 3

    Graphic model.

  •   
    Symbols Description Symbols Description
    $N$ Number of the users in $G$ ${f_{x,n}},{i_{x,n}}$ The $n$-th followed or interacted user of ${u_x}$
    ${N_x},N_x'$ Number of followed or interacted users about $u_x$ ${l_{xy}}$ Existence of link between $u_x$ and $u_y$
    ${N_f},{N_i}$ Number of followed or interacted users in $G$ ${s_q}$ The $q$-th common neighbor between $u_x$ and $u_y$
    $K$ Number of interest topics ${\eta~_q}$ Independent dependency weight of the $q$-th common neighbor between $u_x$ and $u_y$
    ${\gamma~_i},\gamma~_i'$ Gaussian weight of the $i$-th followed or interacted user in $G$ ${\pi~_q}$ Joint dependency weight of the $q$-th common neighbor between $u_x$ and $u_y$
    ${Q_{xy}}$ Number of common neighbors between $u_x$ and $u_y$ ${R_1}$ Set of correlation in following space ${G^f}$
    ${\boldsymbol{\alpha~}},{\boldsymbol{\beta~}},{{\boldsymbol{\alpha~}}'},{{\boldsymbol{\beta~}}'}$ Dirichlet priors ${R_2}$ Set of correlation in interaction space ${G^i}$
    ${{\boldsymbol{\theta~}}_{\boldsymbol{x}}},{\boldsymbol{\theta~}}'_{\boldsymbol{x}}$ A topic distribution of $u_x$ ${R_3}$ Set of correlation in structure space ${G^s}$
    ${{\boldsymbol{\psi~}}_{\boldsymbol{k}}},{\boldsymbol{\psi~}}'_{\boldsymbol{k}}$ A behavior distribution of topic $k$ $\tau~$ Learning rate for driving mechanism
    ${z_{x,n}},z_{x,n}'$ The $n$-th interest topic of $u_x$ $Y$ Set of the existence of links in $G$
  •   
    Dataset Sina micro-blog Twitter
    Users 49556 97632
    Relationships 61880 6883144
    Activity 3057635 167420
    Forward 506765237 91142
    Review 185079821 62986
  •   
    Algorithm Accuracy Precision Recall F1-measure
    LDA 0.716 0.719 0.626 0.669
    CN+LDA 0.820 0.857 0.634 0.729
    RA+LDA 0.826 0.861 0.638 0.733
    VSM 0.834 0.839 0.605 0.703
    CF 0.831 0.836 0.592 0.693
    RW 0.843 0.877 0.612 0.721
    HNB 0.847 0.881 0.608 0.719
    WAODE 0.851 0.893 0.629 0.738
    Proposed method ($K{\rm{~=~}}10$) 0.874 0.913 0.649 0.759
    Proposed method ($K{\rm{~=~}}15$) 0.858 0.895 0.636 0.744
  •   
    Algorithm Accuracy Precision Recall F1-measure
    LDA 0.709 0.725 0.719 0.722
    CN+LDA 0.775 0.801 0.743 0.771
    RA+LDA 0.783 0.809 0.751 0.779
    VSM 0.768 0.792 0.735 0.762
    CF 0.761 0.784 0.726 0.754
    RW 0.796 0.813 0.754 0.782
    HNB 0.809 0.828 0.763 0.794
    WAODE 0.818 0.834 0.793 0.813
    Proposed method ($K{\rm{~=~}}10$) 0.836 0.848 0.784 0.825
    Proposed method ($K{\rm{~=~}}15$) 0.821 0.831 0.774 0.803

Copyright 2019 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1