logo

SCIENTIA SINICA Informationis, Volume 48, Issue 3: 329-348(2018) https://doi.org/10.1360/N112017-00064

Data placement approach for scalable online social networks

More info
  • ReceivedMar 28, 2017
  • AcceptedOct 24, 2017
  • PublishedFeb 13, 2018

Abstract

Online social networks are attracting more and more users. Faced with hundreds of millions of users, how to store user data in a scalable manner has become a hot issue of focus for both social service providers and researchers. Currently, distributed key value store is widely used; it places user data across multiple storage servers based on a hash approach. However, it results in a huge amount of communication traffic inside a data center, and is not conducive to the scale of social networks. By considering user interaction characteristics, this paper proposes a data placement approach that combines both social graph partitioning and data replication. Considering the network topologies of data centers, we design the data placement for specific topologies. Furthermore, we discuss the incremental adjustment for social network growth and the distributed implementation of the proposed algorithms. Finally, experiments on real world traces indicate that the proposed algorithms can effectively reduce internal communication traffic, thereby enhancing the scalability of online social networks.


Funded by

国家自然科学基金(61502328,61572337,61672370)

江苏省产学研联合创新资金前瞻性研究(BY2014059-02)

江苏省高校自然科学研究基金(15KJB520032)

江苏省博士后科研资助计划(1701173B)


References

[1] Shvachko K, Kuang H, Radia R, et al. The hadoop distributed file system. In: Proceedings of the 26th Symposium on Mass Storage Systems and Technologies, Nevada, 2010. 1--10. Google Scholar

[2] DeCandia G, Hastorun D, Jampani M, et al. Dynamo: amazon's highly available key-value store. In: Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles, Washington, 2007. 205--220. Google Scholar

[3] Lakshman A, Malik P. Cassandra: a decentralized structured storage system. Operat Syst Rev, 2010, 44: 35--40. Google Scholar

[4] Karypis G, Kumar V. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs. SIAM J Sci Comput, 1998, 20: 359-392 CrossRef Google Scholar

[5] Chen H, Jin H, Wu S. Minimizing Inter-Server Communications by Exploiting Self-Similarity in Online Social Networks. IEEE Trans Parallel Distrib Syst, 2016, 27: 1116-1130 CrossRef Google Scholar

[6] Pujol J M, Erramilli V, Siganos G. The Little Engine(s) That Could: Scaling Online Social Networks. IEEE/ACM Trans Networking, 2012, 20: 1162-1175 CrossRef Google Scholar

[7] Tran D A, Nguyen K, Pham C. S-CLONE: Socially-aware data replication for social networks. Comput Networks, 2012, 56: 2001-2013 CrossRef Google Scholar

[8] Liu G, Shen H, Chandler H. Selective Data Replication for Online Social Networks with Distributed Datacenters. IEEE Trans Parallel Distrib Syst, 2016, 27: 2377-2393 CrossRef Google Scholar

[9] Jiao L, Li J, Du W, et al. Multi-objective data placement for multi-cloud socially aware services. In: Proceedings of IEEE Conference on Computer Communications, Toronto, 2014. 28--36. Google Scholar

[10] Tran D A, Zhang T. S-put: an ea-based framework for socially aware data partitioning. Comput Netw, 2014, 504--518. Google Scholar

[11] Yu B Y, Pan J P. Location-aware associated data placement for geo-distributed data-intensive applications. In: Proceedings of IEEE Conference on Computer Communications, Hong Kong, 2015. 603--611. Google Scholar

[12] Yu B Y, Pan J P. Sketch-based data placement among geo-distributed datacenters for cloud storages. In: Proceedings of the 35th Annual IEEE International Conference on Computer Communications, San Francisco, 2016. 1--9. Google Scholar

[13] Tang J, Tang X Y, Yuan J S. Optimizing inter-server communication for online social networks. In: Proceedings of the 35th International Conference on Distributed Computing Systems, Columbus, 2015. 215--224. Google Scholar

[14] Jiao L, Li J, Xu T. Optimizing Cost for Online Social Networks on Geo-Distributed Clouds. IEEE/ACM Trans Networking, 2016, 24: 99-112 CrossRef Google Scholar

[15] Gregory S. Finding overlapping communities in networks by label propagation. New J Phys, 2010, 12: 103018 CrossRef ADS arXiv Google Scholar

[16] Lancichinetti A, Fortunato S, Kertesz J. Detecting the overlapping and hierarchical community structure in complex networks. New J Phys, 2009, 11: 1--18. Google Scholar

[17] Lee C, Reid F, McDaid A, et al. Detecting highly overlapping community structure by greedy clique expansion. 2010,. arXiv Google Scholar

[18] Han Z M, Tan X S, Chen Y, et al. NCSS: an effective and efficient complex network community detection algorithm. Sci Sin Inform, 2016, 46: 431--444. Google Scholar

[19] Qiao S J, Han N, Zhang K F, et al. Algorithm for detecting overlapping communities from complex network big data. J Softw, 2017, 28: 631--647. Google Scholar

[20] Benevenuto F, Rodrigues T, Cha M, et al. Characterizing user behavior in online social networks. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, Chicago, 2009. 49--62. Google Scholar

[21] Li D, Chen G H, Ren F Y, et al. Data center network research progress and trends. Chinese J Comput, 2014, 37: 259--274. Google Scholar

[22] Al-Fares M, Loukissas A, Vahdat A. A scalable, commodity data center network architecture. In: Proceedings of ACM SIGCOMM Conference on Data Communication, Seattle, 2008. 63--74. Google Scholar

[23] Clos C. A Study of Non-Blocking Switching Networks. Bell Syst Technical J, 1953, 32: 406-424 CrossRef Google Scholar

[24] Roy A, Zeng H Y, Bagga J, et al. Inside the social network's (datacenter) network. In: Proceedings of ACM SIGCOMM Conference on Data Communication, London, 2015. 123--137. Google Scholar

[25] Wilson C, Sala A, Puttaswamy K P N, et al. Beyond social graphs: user interactions in online social networks and their implications. ACM Trans Web, 2012, 6: 1--31. Google Scholar

[26] Jiang J, Wilson C, Wang X, et al. Understanding latent interactions in online social networks. ACM Trans Web, 2013, 7: 1--39. Google Scholar

[27] Catalyurek U V, Aykanat C. PaToH (partitioning tool for hypergraphs). In: Encyclopedia of Parallel Computing. Berlin: Springer, 2011. 1479--1487. Google Scholar

Copyright 2019 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1