logo

SCIENCE CHINA Information Sciences, Volume 63 , Issue 10 : 203101(2020) https://doi.org/10.1007/s11432-020-2952-7

Graph algorithms: parallelization and scalability

More info
  • ReceivedApr 1, 2020
  • AcceptedJun 10, 2020
  • PublishedSep 21, 2020

Abstract

For computations on large-scale graphs, one often resorts toparallel algorithms. However, parallel algorithms are difficultto write,debug and analyze. Worse still, it is difficultto make algorithmsparallelly scalable, such that the more machines are usedthe faster the algorithms run. Indeed, it is not yet knownwhether any mathsf PTIMExspace computational problem sadmitparallelly scalable algorithms on shared-nothing systems.Is it possible to parallelize sequential graph algorithmsand guarantee convergence at thecorrect resultsas long as the sequential algorithms are correct?Moreover, does a mathsf PTIMExspace parallelly scalableproblem existon shared-nothing systems?This position paper answers both questions in the affirmative.


Acknowledgment

This work was supported in part by Shenzhen Institute of Computing Sciences, Beijing Advanced Innovation Center for Big Data and Brain Computing (Beihang University), Royal Society Wolfson Research Merit Award (Grant No. WRM/R1/180014), European Research Council (Grant No. 652976), and Engineering and Physical Sciences Research Council (Grant No. EP/M025268/1).


References

[1] Malewicz G, Austern M H, Bik A J C, et al. Pregel: a system for large-scale graph processing. In: Proceedings of International Conference on Management of Data, 2010. Google Scholar

[2] Low Y, Bickson D, Gonzalez J. Distributed GraphLab. Proc VLDB Endow, 2012, 5: 716-727 CrossRef Google Scholar

[3] Tian Y, Balmin A, Corsten S A. From "think like a vertex" to "think like a graph". Proc VLDB Endow, 2013, 7: 193-204 CrossRef Google Scholar

[4] Wang G Z, Xie W L, Demers A J, et al. Asynchronous large-scale graph processing made easy. In: Proceedings of Conference on Innovative Data Systems Research, 2013. Google Scholar

[5] Xie C N, Chen R, Guan H B, et al. SYNC or ASYNC: time to fuse for distributed graph-parallel computation. In: Proceedings of ACM Sigplan Symposium on Principles and Practice of Parallel Programming, 2015. Google Scholar

[6] Kruskal C P, Rudolph L, Snir M. A complexity theory of efficient parallel algorithms. Theor Comput Sci, 1990, 71: 95-132 CrossRef Google Scholar

[7] Fan W, Wang X, Wu Y. Association rules with graph patterns. Proc VLDB Endow, 2015, 8: 1502-1513 CrossRef Google Scholar

[8] Fan W F, Hu C M, Liu X L, et al. Discovering graph functional dependencies. In: Proceedings of International Conference on Management of Data, 2018. 427--439. Google Scholar

[9] Fan W, Lu P, Tian C. Deducing certain fixes to graphs. Proc VLDB Endow, 2019, 12: 752-765 CrossRef Google Scholar

[10] Fan W F, Wang X, Wu Y H, et al. Distributed graph simulation: impossibility and possibility. Proc VLDB Endow, 2013, 7: 1083--1094. Google Scholar

[11] Papadimitriou C H. Computational Complexity. Boston: Addison-Wesley, 1994. Google Scholar

[12] Fan W F, Yu W Y, Xu J B, et al. Parallelizing sequential graph computations. In: Proceedings of International Conference on Management of Data, 2017. 495--510. Google Scholar

[13] Fan W F, Lu P, Luo X J, et al. Adaptive asynchronous parallelization of graph algorithms. In: Proceedings of International Conference on Management of Data, 2018. 1141--1156. Google Scholar

[14] Yan D, Bu Y, Tian Y. Big Graph Analytics Platforms. FNT Databases, 2015, 7: 1-195 CrossRef Google Scholar

[15] Raychev V, Musuvathi M, Mytkowicz T. Parallelizing user-defined aggregations using symbolic execution. In: Proceedings of Symposium on Operating Systems Principles, 2015. Google Scholar

[16] Pingali K, Nguyen D, Kulkarni M, et al. The tao of parallelism in algorithms. In: Proceedings of Programming Language Design and Implementation, 2011. Google Scholar

[17] Zhou Y, Liu L, Lee K, et al. Fast iterative graph computation with resource aware graph parallel abstractions. In: Proceedings of High Performance Distributed Computing, 2015. Google Scholar

[18] Radoi C, Fink S J, Rabbah R M, et al. Translating imperative code to MapReduce. In: Proceedings of Conference on Object-Oriented Programming Systems, Languages, and Applications. 2014. Google Scholar

[19] JáJá J. An Introduction to Parallel Algorithms. Boston: Addison-Wesley, 1992. Google Scholar

[20] Karp R M, Ramachandran V. Parallel algorithms for shared-memory machines. In: Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, 869--942, 1990. Google Scholar

[21] Chong K W, Han Y J, Lam T W. On the parallel time complexity of undirected connectivity and minimum spanning trees. In: Proceedings of Symposium on Discrete Algorithms, 1999. 225--234. Google Scholar

[22] Tarjan R E, Vishkin U. An Efficient Parallel Biconnectivity Algorithm. SIAM J Comput, 1985, 14: 862-874 CrossRef Google Scholar

[23] Chong K W, Nikolopoulos S D, Palios L. An Optimal Parallel Co-Connectivity Algorithm. Theor Comput Syst, 2004, 37 CrossRef Google Scholar

[24] Brent R P. The Parallel Evaluation of General Arithmetic Expressions. J ACM, 1974, 21: 201-206 CrossRef Google Scholar

[25] Spencer T H. Time-work tradeoffs for parallel algorithms. J ACM, 1997, 44: 742-778 CrossRef Google Scholar

[26] Harris T J. A survey of PRAM simulation techniques. ACM Comput Surv, 1994, 26: 187-206 CrossRef Google Scholar

[27] Herley K T, Bilardi G. SIAM J Comput, 1994, 23: 276-292 CrossRef Google Scholar

[28] Karloff H J, Suri S, Vassilvitskii S. A model of computation for MapReduce. In: Proceedings of Symposium on Discrete Algorithms, 2010. 938--948. Google Scholar

[29] Goodrich M T, Sitchinava N, Zhang Q. Sorting, searching, and simulation in the MapReduce framework. In: Proceedings of International Symposium on Algorithms and Computation, 2011. 374--383. Google Scholar

[30] Yan D, Cheng J, Xing K. Pregel algorithms for graph connectivity problems with performance guarantees. Proc VLDB Endow, 2014, 7: 1821-1832 CrossRef Google Scholar

[31] Andreev K, Racke H. Balanced graph partitioning. In: Proceedings of ACM Symposium on Parallel Algorithms and Architectures, 2006. Google Scholar

[32] Bourse F, Lelarge M, Vojnovic M. Balanced graph edge partition. In: Proceedings of Knowledge Discovery and Data Mining, 2014. 1456--1465. Google Scholar

[33] Fan W F, Liu M Y, Xu R Q, et al. Think sequential, run parallel. In: Proceedings of Symposium on Real-Time and Hybrid Systems — Essays Dedicated to Prof. Chaochen Zhou on the Occasion of His 80th Birthday, 2018. Google Scholar

[34] Fan W F, Jin R C, Liu M Y, et al. Application driven graph partitioning. In: Proceedings of ACM SIGMOD International Conference on Management of Data, 2020. Google Scholar

[35] Henzinger M R, Henzinger T, Kopke P. Computing simulations on finite and infinite graphs. In: Proceedings of Foundations of Computer Science, 1995. Google Scholar

[36] Fan W, Li J, Ma S. Graph pattern matching. Proc VLDB Endow, 2010, 3: 264-275 CrossRef Google Scholar

[37] Fan W F, Wang X, Wu Y H. Incremental graph pattern matching. In: Proceedings of International Conference on Management of Data, 2011. Google Scholar

[38] Valiant L G. A bridging model for parallel computation. Commun ACM, 1990, 33: 103-111 CrossRef Google Scholar

[39] Jones N D. An introduction to partial evaluation. ACM Comput Surv, 1996, 28: 480-503 CrossRef Google Scholar

[40] Ramalingam G, Reps T. On the computational complexity of dynamic graph problems. Theor Comput Sci, 1996, 158: 233-277 CrossRef Google Scholar

[41] Fan W F, Hu C M, Tian C. Incremental graph computations: doable and undoable. In: Proceedings of International Conference on Management of Data, 2017. 155--169. Google Scholar

[42] Maon Y, Schieber B, Vishkin U. Parallel ear decomposition search (EDS) and st-numbering in graphs. Theor Comput Sci, 1986, 47: 277-298 CrossRef Google Scholar

[43] Miller G L, Ramachandran V. Efficient parallel ear decomposition with applications. Unpublished Manuscript, 1986. Google Scholar

[44] Bang-Jensen J, Gutin G Z. Digraphs: Theory, Algorithms and Applications. Berlin: Springer, 2008. Google Scholar

[45] Goodrich M T. Communication-Efficient Parallel Sorting. SIAM J Comput, 1999, 29: 416-432 CrossRef Google Scholar

[46] Gallager R G, Humblet P A, Spira P M. A Distributed Algorithm for Minimum-Weight Spanning Trees. ACM Trans Program Lang Syst, 1983, 5: 66-77 CrossRef Google Scholar

[47] Fan W, Liu M, Tian C. Incrementalization of graph partitioning algorithms. Proc VLDB Endow, 2020, 13: 1261-1274 CrossRef Google Scholar

[48] Fan W, Hu C, Liu M. Dynamic scaling for parallel graph computations. Proc VLDB Endow, 2019, 12: 877-890 CrossRef Google Scholar

Copyright 2020  CHINA SCIENCE PUBLISHING & MEDIA LTD.  中国科技出版传媒股份有限公司  版权所有

京ICP备14028887号-23       京公网安备11010102003388号