logo

SCIENTIA SINICA Informationis, Volume 47, Issue 11: 1551-1565(2017) https://doi.org/10.1360/N112016-00262

Link prediction in online social networks based on supervised joint denoising model

More info
  • ReceivedMar 29, 2017
  • AcceptedJun 26, 2017
  • PublishedNov 10, 2017

Abstract

Link prediction of social networks can capture important information about missing links for applications in many fields. Because of the failure to make full use of information as well as capture all properties, the link prediction precision of most of methods is low. For higher precision, we propose a novel algorithm, a supervised joint denoising model (SJDM) that formulates the link prediction problem as a supervised matrix “denoising problem. The central piece of our method is a function that is trained using the features of users and topological structures of social networks. The function can map the observed “corrupted matrix to an “uncorrupted matrix (target matrix). We performed community detection using the target matrix, which is better than using the original matrix. Five real networks are processed with this algorithm. The results show that SJDM algorithm is more efficient compared to the other four algorithms.


Funded by

国家自然科学基金(71672103,61572104,71501113)

山东省自然科学基金(ZR2017MG022)


Supplement

Appendix


References

[1] Wang P, Xu B W, Wu Y R, et al. Link prediction in social networks: the state-of-the-art. Sci China Inf Sci, 2015, 58: 011101. Google Scholar

[2] Aiello L M, Barrat A, Schifanella R, et al. Friendship prediction and homophily in social media. ACM Trans Web, 2012, 6: 1--33. Google Scholar

[3] Mori J, Kajikawa Y, Kashima H. Machine learning approach for finding business partners and building reciprocal relationships. Expert Syst Appl, 2012, 39: 10402-10407 CrossRef Google Scholar

[4] Wu S, Sun J, Tang J. Patent partner recommendation in enterprise social networks. In: Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM13), Rome, 2013. 43--52. Google Scholar

[5] Tang J, Wu S, Sun J M, et al. Cross-domain collaboration recommendation. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD12), Beijing, 2012. 1285--1293. Google Scholar

[6] Pavlov M, Ichise R. Finding experts by link prediction in co-authorship networks. In: Proceedings of the 2nd International ISWC+ASWC Workshop on Finding Experts on the Web with Semantics (FEWS), Busan, 2007. 42--55. Google Scholar

[7] Wohlfarth T, Ichise R. Semantic and event-based approach for link prediction. In: Proceedings of the 7th International Conference on Practical Aspects of Knowledge Management (PAKM08), Yokohama, 2008. 50--61. Google Scholar

[8] Bhattacharyya P, Garg A, Wu S F. Analysis of user keyword similarity in online social networks. Soc Netw Anal Min, 2011, 1: 143-158 CrossRef Google Scholar

[9] Akcora C G, Carminati B, Ferrari E. User similarities on social networks. Soc Netw Anal Min, 2013, 3: 475-495 CrossRef Google Scholar

[10] Anderson A, Huttenlocher D, Kleinberg J, et al. Effects of user similarity in social media. In: Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM12), Seattle, 2012. 703--712. Google Scholar

[11] Adamic L A, Adar E. Friends and neighbors on the Web. Social Networks, 2003, 25: 211-230 CrossRef Google Scholar

[12] Lichtenwalter R N, Chawla N V. Vertex collocation profiles: subgraph counting for link analysis and prediction. In: Proceedings of the 21st World Wide Web Conference (WWW12), Lyon, 2012. 1019--1028. Google Scholar

[13] Symeonidis P, Mantas N. Spectral clustering for link prediction in social networks with positive and negative links. Soc Netw Anal Min, 2013, 3: 1433-1447 CrossRef Google Scholar

[14] Valverde-Rebaza J, de Andrade Lopes A. Exploiting behaviors of communities of twitter users for link prediction. Soc Netw Anal Min, 2013, 3: 1063-1074 CrossRef Google Scholar

[15] Liu H, Hu Z, Haddadi H. Hidden link prediction based on node centrality and weak ties. EPL, 2013, 101: 18004 CrossRef ADS Google Scholar

[16] Zhu J. Max-margin nonparametric latent feature models for link prediction. In: Proceedings of the 29th International Coference on Machine Learning, Edinburgh, 2012. 1179--1186. Google Scholar

[17] Lloyd J R, Orbanz P, Ghahramani Z, et al. Random function priors for exchangeable arrays with applications to graphs and relational data. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, 2012. 1007--1015. Google Scholar

[18] Chen M M, Xu Z X, Weinberger K Q, et al. Marginalized denoising autoenco-ders for domain adaptation. In: Proceedings of International Conference on Machine Learning 2012 (ICML2012), Edinburgh, 2012. Google Scholar

[19] Papadimitriou A, Symeonidis P, Manolopoulos Y. Fast and accurate link prediction in social networking systems. J Syst Software, 2012, 85: 2119-2132 CrossRef Google Scholar

[20] Symeonidis P, Iakovidou N, Mantas N. From biological to social networks: Link prediction based on multi-way spectral clustering. Data Knowledge Eng, 2013, 87: 226-242 CrossRef Google Scholar

[21] Bastani S, Jafarabad A K, Zarandi M H F. Fuzzy Models for Link Prediction in Social Networks. Int J Intell Syst, 2013, 28: 768-786 CrossRef Google Scholar

[22] Bliss C A, Frank M R, Danforth C M. An evolutionary algorithm approach to link prediction in dynamic social networks. J Comp Sci, 2014, 5: 750-764 CrossRef Google Scholar

[23] Bayrak A E, Polat F. Contextual feature analysis to improve link prediction for location based social networks. In: Proceedings of the 8th Workshop on Social Network Mining and Analysis (SNAKDD14), New York, 2014. Google Scholar

[24] Xie X Q, Li Y J, Zhang Z Q, et al. A joint link prediction method for social network. In: Proceedings of Intelligent Computation in Big Data Era, Harbin, 2015. 56--64. Google Scholar

[25] Zhu J, Xie Q, Chin E J. A hybrid time-series link prediction framework for large social network. In: Proceedings of the 23rd Database and Expert Systems Applications (DEXA2012), Vienna, 2012. 345--359. Google Scholar

[26] Liu F, Liu B Q, Sun C J, et al. Deep learning approaches for link prediction in social network services. In: Proceedings of the 20th International Conference on Neural Information Processing (ICONIP2013), Daegu, 2013. 425--432. Google Scholar

[27] Gong N Z, Talwalkar A, Mackey L, et al. Joint link prediction and attribute inference using a social-attribute network. ACM Trans Intel Syst Technol, 2014, 5: 529--544. Google Scholar

[28] Panwar A P S, Niyogi R. A heuristic for link prediction in online social network. In: Intelligent Distributed Computing. Berlin: Springer, 2015. 31--41. Google Scholar

[29] He Y, Liu J N K, Hu Y. OWA operator based link prediction ensemble for social network. Expert Syst Appl, 2015, 42: 21-50 CrossRef Google Scholar

[30] Sherkat E, Rahgozar M, Asadpour M. Structural link prediction based on ant colony approach in social networks. Physica A-Statistical Mech its Appl, 2015, 419: 80-94 CrossRef ADS Google Scholar

[31] Nguyen-Thi A T, Nguyen P Q, Ngo T D. Transfer AdaBoost SVM for Link Prediction in Newly Signed Social Networks using Explicit and PNR Features. Procedia Comp Sci, 2015, 60: 332-341 CrossRef Google Scholar

[32] Hoff P D. Modeling homophily and stochastic equivalence in symmetric relational data,. arXiv Google Scholar

[33] Vincent P, Larochelle H, Bengio Y, et al. Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th International Conference on Machine learning, Helsinki, 2008. 1096--1103. Google Scholar

[34] McAuley J, Leskovec J. Learning to discover social circles in ego networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, 2012. 539--547. Google Scholar

[35] Nepusz T, Yu H, Paccanaro A. Detecting overlapping protein complexes in protein-protein interaction networks.. Nat Meth, 2012, 9: 471-472 CrossRef PubMed Google Scholar

  • Figure 1

    Example of link prediction

  • Figure 2

    Flow chart of SJDM algorithm

  • Figure 3

    The average AUC score and training time of SJDM in (a) low-rank matrix, (b) original matrix

  • Figure 4

    The AUC score of different latent dimensions $k$. (a) Facebook, Twitter, Gplus; (b) Pokec

  • Figure 5

    The AUC score of different corruption $p$. (a) Facebook, Twitter, Gplus; (b) Pokec

  • Figure 6

    The objective value with regard to the number of iterations for gradient descent in Facebook

  • Figure 7

    The objective value with regard to the number of iterations for gradient descent in Twitter

  • Figure 8

    The objective value with regard to the number of iterations for gradient descent in Gplus

  • Figure 9

    The objective value with regard to the number of iterations for gradient descent in Pokec

  • Table 1   Formula description table
    Formula number Formula description
    (1) The initial function of the objective function
    (2) User similarity function
    (3) The matrix representation of formula (2)
    (4) Link formation based on close relation matrix
    (5) Matrix reconstruction based on formula (4)
    (6) The matrix representation of formula (5)
    (7) Matrix low rank approximation
    (8) The concrete objective function
    (9) Transformation of objective functions based on weak law of large numbers
  • Table 1   The social network properties of the four datasets evaluated
    Social network Nodes Links Density
    Facebook 2634 63557 0.01833
    Twitter 4292 142690 0.01550
    Gplus 4712 377848 0.03404
    Pokec 12781 80920 0.00099
  • Table 2   Parameters of SJDM algorithm
    Social network $\alpha$ $\beta$ $p$ $k$
    Facebook 0.1 0.9 0.92 100
    Twitter 0.1 0.9 0.92 100
    Gplus 0.1 0.9 0.92 100
    Pokec 0.1 0.9 0.95 200
  • Table 3   The AUC scores of six algorithms tested on four datasets
    Facebook Gplus Twitter Pokec
    Spectralink $0.8102\pm~0.0011$ $0.8014\pm0.0016$ $0.8201\pm0.0014$ $0.7926\pm0.0013$
    CMA-ES $0.8597\pm0.0009$ $0.8368\pm0.0019$ $0.8586\pm0.0011$ $0.8211\pm0.0017$
    SLP-SAN-VI $~0.8855\pm0.0030$ $0.8781\pm0.0023$ $0.8913\pm0.0025$ $0.8812\pm0.0015$
    TAS+PNR $0.9012\pm0.0017$ $0.8890\pm0.0010$ $0.9088\pm0.0012$ $0.8795\pm0.0024$
    MDA $0.9189\pm0.0016$ $0.9035\pm0.0011$ $0.9168\pm0.0009$ $0.8966\pm0.0013$
    SJDM $0.9564\pm0.0021$ $0.9373\pm0.0003$ $0.9587\pm0.0006$ $0.9250\pm0.0031$
  • Table 4   Training time ratio of low-rank matrix to original matrix
    Social network Time radio (Low rank matrix training time / Original matrix training time) (%)
    Facebook 5.08
    Twitter 3.82
    Gplus 3.86
    Pokec 0.46

Copyright 2020 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1       京公网安备11010102003388号