SCIENTIA SINICA Informationis, Volume 50, Issue 7: 957-987 (2020) https://doi.org/10.1360/SSI-2019-0271

A survey on the construction methods and applications of sci-tech big data knowledge graph

More info
  • Received Dec 3, 2019
  • Accepted Apr 28, 2020
  • Published Jul 13, 2020


Knowledge graphs (KGs) built from sci-tech big data, together with big data technology, have come to play an important role in the development of the science of science. This paper presents a systematic, in-depth review of KG construction and the application of big data technology in the sci-tech field. Specifically, we explain the problems involved in constructing a sci-tech big data KG, namely sci-tech entity extraction, entity disambiguation, relationship extraction, and relationship inference. We then systematically summarize methods for analyzing and mining such a KG, including sci-tech entity recommendation, community detection, entity evaluation, interdisciplinary research, and disciplinary evolution analysis. Finally, we outline future research and application directions for the sci-tech KG.

[1] Fortunato S, Bergstrom C T, Börner K, et al. Science of science. Science, 2018, 359.

[2] Han J W. Mining heterogeneous information networks: the next frontier. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2012. 2-3.

[3] Xia F, Wang W, Bekele T M. Big Scholarly Data: A Survey. IEEE Trans Big Data, 2017, 3: 18-35.

[4] Singhal A. Introducing the knowledge graph: Things, not strings. Official Google Blog, 2012.

[5] Yan J, Wang C, Cheng W. A retrospective of knowledge graphs. Front Comput Sci, 2018, 12: 55-74.

[6] Wu T, Qi G, Li C. A Survey of Techniques for Constructing Chinese Knowledge Graphs and Their Applications. Sustainability, 2018, 10: 3245.

[7] Liu Q, Li Y, Duan H, et al. Knowledge Graph construction techniques. Journal of Computer Research and Development, 2016, 53: 582-600. doi: 10.7544/issn1000-1239.2016.20148228.

[8] Liu F, Zhang X L, Kong L H. Research review on the research data repositories. New Tech Library Inform Serv, 2014, 2: 25-31. doi: 10.11925/infotech.1003-3513.2014.02.04.

[9] Fayyad U, Haussler D, Stolorz P. Mining scientific data. Commun ACM, 1996, 39: 51-57.

[10] Huang Y Q, Qi G Z, Zhang F Y. Extracting semi-structured information from the WEB. Journal of Software, 2000, 11: 73-78.

[11] Zhao J, Dong K J, Yang L, et al. E-Scholar: Improving academic search through combining metasearch with entity extraction. In: 2009 IEEE Youth Conference on Information, Computing and Telecommunication. Piscataway: IEEE, 2009. 247-250.

[12] Ramakrishnan C, Patnia A, Hovy E. Layout-aware text extraction from full-text PDF of scientific articles. Source Code Biol Med, 2012, 7: 7.

[13] Kovačević A, Ivanović D, Milosavljević B. Automatic extraction of metadata from scientific publications for CRIS systems. Program, 2011, 45: 376-396.

[14] Mesbah S, Bozzon A, Lofi C, et al. SmartPub: a platform for long-tail entity extraction from scientific publications. In: Companion Proceedings of the Web Conference 2018. New York: ACM, 2018. 191-194.

[15] Zheng J G, Howsmon D, Zhang B L, et al. Entity linking for biomedical literature. In: Proceedings of the ACM 8th International Workshop on Data and Text Mining in Bioinformatics. New York: ACM, 2014. 3-4.

[16] Grouin C. Biomedical entity extraction using machine-learning based approaches. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14). 2014. 2518-2523.

[17] Yadav V, Bethard S. A survey on recent advances in named entity recognition from deep learning models. In: Proceedings of the 27th International Conference on Computational Linguistics. Stroudsburg: ACL, 2018. 2145-2158.

[18] Takeuchi K, Collier N. Bio-medical entity extraction using support vector machines. Artificial Intelligence Med, 2005, 33: 125-137.

[19] Amplayo R K, Song M. Building content-driven entity networks for scarce scientific literature using content information. In: Proceedings of the 5th Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016), 2016. 20-29.

[20] Sobhana N V, Mitra P, Ghosh S K. Conditional Random Field Based Named Entity Recognition in Geological text. IJCA, 2010, 1: 143-147.

[21] Ekbal A, Saha S, Sikdar U K. Biomedical named entity extraction: some issues of corpus compatibilities. SpringerPlus, 2013, 2: 601.

[22] Murphy T, McIntosh T, Curran J R. Named entity recognition for astronomy literature. In: Proceedings of the Australasian Language Technology Workshop 2006. 2006. 59-66.

[23] Ma J, Yuan H. Bi-LSTM+CRF-based named entity recognition in scientific papers in the field of ecological restoration technology. Proc Association Inf Sci Tech, 2019, 56: 186-195.

[24] Hussain I, Asghar S. A survey of author name disambiguation techniques: 2010-2016. Knowledge Eng Rev, 2017, 32: e22.

[25] Zhuang Y, Li G L, Feng J H. A survey on entity alignment of knowledge base. Journal of Computer Research and Development, 2016, 53: 165-192. doi: 10.7544/issn1000-1239.2016.20150661.

[26] Torvik V I, Weeber M, Swanson D R. A probabilistic similarity metric for Medline records: A model for author name disambiguation. J Am Soc Inf Sci, 2005, 56: 140-158.

[27] Pereira D A, Ribeiro-Neto B, Ziviani N, et al. Using web information for author name disambiguation. In: Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries. New York: ACM, 2009. 49-58.

[28] Santana A F, Gonçalves M A, Laender A H F. Incremental author name disambiguation by exploiting domain-specific heuristics. J Association Inf Sci Tech, 2017, 68: 931-945.

[29] Zhang Y T, Zhang F J, Yao P R, et al. Name disambiguation in AMiner: Clustering, maintenance, and human in the loop. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York: ACM, 2018. 1002-1011.

[30] Xu J, Shen S Q, Li D S, et al. A network-embedding based method for author disambiguation. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. New York: ACM, 2018. 1735-1738.

[31] Shen Q, Wu T, Yang H. NameClarifier: A Visual Analytics System for Author Name Disambiguation. IEEE Trans Visual Comput Graphics, 2017, 23: 141-150.

[32] Hansen A R, Varon J L, Sinnott-Armstrong N A, et al. U S Patent 9 779 388. 2017-10-3.

[33] Prokofyev R, Demartini G, Boyarsky A, et al. Ontology-based word sense disambiguation for scientific literature. In: Proceedings of European Conference on Information Retrieval. Berlin: Springer, 2013. 594-605.

[34] Atzeni P, Polticelli F, Toti D. A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature. In: Proceedings of 2011 IEEE 27th International Conference on Data Engineering Workshops. Piscataway: IEEE, 2011. 59-61.

[35] Thomas J, Milward D, Ouzounis C, et al. Automatic extraction of protein interactions from scientific abstracts. Biocomputing 2000. 1999, 541-552.

[36] Blaschke C, Andrade M A, Ouzounis C A, et al. Automatic extraction of biological information from scientific text: protein-protein interactions. In: Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology, 1999. 60-67.

[37] Skusa A, Ruegg A, Kohler J. Extraction of biological interaction networks from scientific literature. Briefings BioInf, 2005, 6: 263-276.

[38] Song M, Kim W C, Lee D. PKDE4J: Entity and relation extraction for public knowledge discovery. J BioMed Inf, 2015, 57: 320-332.

[39] Lee J Y, Dernoncourt F, Szolovits P. MIT at SemEval-2017 Task 10: Relation extraction with convolutional neural networks. arXiv preprint.

[40] Yan S, Spangler S, Chen Y. Cross media entity extraction and linkage for chemical documents. In: Proceedings of the 25th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI, 2011.

[41] Quan C Q, Wang M, Ren F J. An unsupervised text mining method for relation extraction from biomedical literature. PLoS ONE, 2014, 9: e102039. doi: 10.1371/journal.pone.0102039.

[42] Thomas P, Neves M, Solt I, et al. Relation extraction for drug-drug interactions using ensemble learning. In: CEUR Workshop Proceedings, 2011. 11-18.

[43] Hu K, Luo Q, Qi K. Understanding the topic evolution of scientific literatures like an evolving city: Using Google Word2Vec model and spatial autocorrelation analysis. Inf Processing Manage, 2019, 56: 1185-1203.

[44] Schoenmackers S, Etzioni O, Weld D S, et al. Learning first-order Horn clauses from web text. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010. 1088-1098.

[45] Nickel M, Tresp V, Kriegel H P. A three-way model for collective learning on multi-relational data. In: Proceedings of International Conference on Machine Learning. 2011. 809-816.

[46] Socher R, Chen D, Manning C D, et al. Reasoning with neural tensor networks for knowledge base completion. In: Proceedings of Advances in Neural Information Processing Systems, 2013. 926-934.

[47] Lao N, Mitchell T, Cohen W W. Random walk inference and learning in a large scale knowledge base. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2011. 529-539.

[48] Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data. In: Proceedings of Advances in Neural Information Processing Systems. 2013. 2787-2795.

[49] Neelakantan A, Roth B, McCallum A. Compositional vector space models for knowledge base completion. arXiv preprint, 2015.

[50] Das R, Neelakantan A, Belanger D, et al. Chains of reasoning over entities, relations, and text using recurrent neural networks. arXiv preprint, 2016.

[51] Das R, Godbole A, Zaheer M, et al. Chains-of-Reasoning at TextGraphs 2019 Shared Task: reasoning over chains of facts for explainable multi-hop inference. In: Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13), 2019. 101-117.

[52] Koh Y S, Dobbie G. Indirect weighted association rules mining for academic network collaboration recommendations. In: Proceedings of the 10th Australasian Data Mining Conference, 2012. 167-173.

[53] Chen Z Q, Zhang H L, Ge J K, et al. Related document recommending based on weighted association rule mining. New Technology of Library and Information Service, 2007, 61-65. doi: 10.11925/infotech.1003-3513.2007.10.13.

[54] Deng S W, Luo Z, Li S R, et al. Scholar recommendation system based on academic relationship of thesis co-authors. Computer Engineering, 2013, 39: 12-17. doi: 10.3969/j.issn.1000-3428.2013.02.003.

[55] Tang J, Wu S, Sun J M, et al. Cross-domain collaboration recommendation. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2012. 1285-1293.

[56] Liu Z, Xie X, Chen L. Context-aware academic collaborator recommendation. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York: ACM, 2018. 1870-1879.

[57] Guerra J, Quan W, Li K, et al. SCOSY: a biomedical collaboration recommendation system. In: Proceedings of 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Piscataway: IEEE, 2018. 3987-3990.

[58] Exploiting Publication Contents and Collaboration Networks for Collaborator Recommendation. PLoS ONE, 2016, 11: e0148492.

[59] Huang W Y, Wu Z H, Liang C, et al. A neural probabilistic model for context based citation recommendation. In: Proceedings of 29th AAAI Conference on Artificial Intelligence. 2015.

[60] Yu S, Liu J, Yang Z. PAVE: Personalized Academic Venue recommendation Exploiting co-publication networks. J Network Comput Appl, 2018, 104: 38-47.

[61] Ma X, Wang R. Personalized Scientific Paper Recommendation Based on Heterogeneous Graph Representation. IEEE Access, 2019, 7: 79887-79894.

[62] Liu H, Kong X, Bai X. Context-Based Collaborative Filtering for Citation Recommendation. IEEE Access, 2015, 3: 1695-1703.

[63] Jia H F, Saule E. Local is good: a fast citation recommendation approach. In: Proceedings of European Conference on Information Retrieval. Berlin: Springer, 2018. 758-764.

[64] Zhenzhen X, Jiang H, Kong X. Cross-domain item recommendation based on user similarity. ComSIS, 2016, 13: 359-373.

[65] Cagliero L, Garza P, Pasini A, et al. Additional reviewer assignment by means of weighted association rules. IEEE Transactions on Emerging Topics in Computing, 2018. doi: 10.1109/TETC.2018.2861214.

[66] Li J, Li D, Feng P H, et al. An expert recommendation model based on the speciality, scientific impact of experts, and social relationship between experts and applicants. Journal of the China Society for Scientific and Technical Information, 2017, 36: 338-345.

[67] Wang J, Yue F, Wang G, et al. Expert recommendation in scientific social network based on link prediction. Journal of Intelligence, 2015, 34: 151-157.

[68] Yang K H, Kuo T L, Lee H M, et al. A reviewer recommendation system based on collaborative intelligence. In: Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology-Volume 01. Washington: IEEE Computer Society, 2009. 564-567.

[69] Shon H S, Han S H, Kim K A. Proposal reviewer recommendation system based on big data for a national research management institute. J Inf Sci, 2017, 43: 147-158.

[70] Zhao S, Zhang D, Duan Z. A novel classification method for paper-reviewer recommendation. Scientometrics, 2018, 115: 1293-1313.

[71] Jin J, Geng Q, Zhao Q, et al. Integrating the trend of research interest for reviewer assignment. In: Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 2017. 1233-1241.

[72] Girvan M, Newman M E J. Community structure in social and biological networks. Proc Natl Acad Sci USA, 2002, 99: 7821-7826.

[73] Shi X H, Lu H T. Detecting community in scientific collaboration network with Bayesian symmetric NMF. Data Analysis and Knowledge Discovery, 2017, 1: 49-56.

[74] Wallace M L, Gingras Y, Duhon R. A new approach for detecting scientific specialties from raw cocitation networks. J Am Soc Inf Sci, 2009, 60: 240-246.

[75] Ba Z C, Li G, Zhu S W. Similarity measurement of research interests in semantic network. New Technology of Library and Information Service, 2016, 81-90. doi: 10.11925/infotech.1003-3513.2016.04.10.

[76] Ding Y. Community detection: Topological vs. topical. J Informetrics, 2011, 5: 498-514.

[77] Kajikawa Y, Yoshikawa J, Takeda Y. Tracking emerging technologies in energy research: Toward a roadmap for sustainable energy. Tech Forecasting Social Change, 2008, 75: 771-782.

[78] Shibata N, Kajikawa Y, Takeda Y. Detecting emerging research fronts based on topological measures in citation networks of scientific publications. Technovation, 2008, 28: 758-775.

[79] Yang X L. An algorithm based on correlation coefficient to find scientific communities. Dissertation for Master's Degree. Dalian: Dalian University of Technology, 2008.

[80] Wang X G, Cheng Q K. Analysis on evolution of research topics in a discipline based on NEViewer. Journal of the China Society for Scientific and Technical Information, 2013, 32: 900-911.

[81] Miao R, Liu L. Community detection in scientific collaboration network. Journal of the China Society for Scientific and Technical Information, 2011, 30: 1312-1318.

[82] Larsen K. Knowledge network hubs and measures of research impact, science structure, and publication output in nanostructured solar cell research. Scientometrics, 2008, 74: 123-142.

[83] Hirsch J E. An index to quantify an individual's scientific research output. Proc Natl Acad Sci USA, 2005, 102: 16569-16572.

[84] Braun T, Glänzel W, Schubert A. A Hirsch-type index for journals. Scientometrics, 2006, 69: 169-173.

[85] Ball P. Index aims for fair ranking of scientists. Nature, 2005, 436: 900.

[86] Egghe L. Theory and practise of the g-index. Scientometrics, 2006, 69: 131-152.

[87] Jin B, Liang L, Rousseau R, et al. The R- and AR-indices: complementing the h-index. Chin Sci Bull, 2007, 52: 855-863.

[88] Burrell Q L. On the h-index, the size of the Hirsch core and Jin's A-index. J Informetrics, 2007, 1: 170-177.

[89] Zhang F. Evaluation and analysis on academic influences of scholars in library and information field based on papers and funds. Information Research, 2019, 121-128.

[90] Cheng H J, Xu W T. Academic influence evaluation of the young talents program. Bulletin of National Natural Science Foundation of China, 2019, 33: 168-175.

[91] Haveliwala T H. Topic-sensitive pagerank. In: Proceedings of the 11th International Conference on World Wide Web. New York: ACM, 2002. 517-526.

[92] Page L, Brin S, Motwani R, et al. The PageRank citation ranking: bringing order to the web. Stanford InfoLab, 1999.

[93] Li X T, Ng M K, Ye Y M. HAR: Hub, authority and relevance scores in multi-relational data for query search. In: Proceedings of the 2012 SIAM International Conference on Data Mining. 2012. 141-152.

[94] Sinatra R, Wang D, Deville P. Quantifying the evolution of individual scientific impact. Science, 2016, 354: aaf5239.

[95] Park N, Kan A, Dong X L, et al. Estimating node importance in knowledge graphs using graph neural networks. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD'19). New York: ACM, 2019. 596-606.

[96] Hicks D. Overview of models of performance-based research funding systems. In: Performance-based Funding for Public Research in Tertiary Education Institutions: Workshop Proceedings. Paris: OECD, 2010.

[97] Zhang S H, Sun S D. Evaluation of high-tech research project based on Internal Fuzzy TOPSIS and AHP. Journal of Shanghai Jiaotong University, 2011, 45: 134-138.

[98] Hua F. Study of indexes for quantitative evaluation of research performance based on discipline benchmarking. Library and Information Service, 2014, 58: 78-84.

[99] Shu Y. Design for the indicator system of scientific research evaluation based on factor analysis and variance maximization model. Journal of Intelligence, 2015, 34: 33-37.

[100] Wang D, Song C, Barabási A L. Quantifying Long-Term Scientific Impact. Science, 2013, 342: 127-132.

[101] Haythornthwaite C. Learning and knowledge networks in interdisciplinary collaborations. J Am Soc Inf Sci, 2006, 57: 1079-1092.

[102] Bordons M, Morillo F, Gómez I. Analysis of cross-disciplinary research through bibliometric tools. In: Handbook of Quantitative Science and Technology Research. Dordrecht: Springer, 2004. 437-456.

[103] Buter R K, Noyons E C M, Van Raan A F J. Searching for converging research using field to field citations. Scientometrics, 2011, 86: 325-338.

[104] Xu H, Pu W Y, Qian A B, et al. Knowledge mapping of Chinese medicine interdisciplinary research field. Acta Academiae Medicinae Sinicae, 2015, 37: 93-100. doi: 10.3881/j.issn.1000-503X.2015.01.018.

[105] Sun H S. Empirical study on knowledge citations of other disciplines to information science. Journal of Intelligence, 2013, 32: 113-118+100.

[106] Wang Q. Measuring interdisciplinarity of a given body of research. In: Proceedings of ISSI 2015 Istanbul: 15th International Society of Scientometrics and Informetrics Conference. 2015. 372-383.

[107] Han Z Q, Liu X P, Kou J J. Interdisciplinary literature discovery based on Rao-Stirling diversity indices: Case studies in nanoscience and nanotechnology. Information Science, 2020, 38: 116-124.

[108] Xu H Y, Guo T, Yue Z H, et al. Study on the interdisciplinary topics of information science based on TI index series. Journal of the China Society for Scientific and Technical Information, 2015, 34: 1067-1078.

[109] Buter R K, Noyons E C M, van Raan A F J. Identification of converging research areas using publication and citation data. Res Eval, 2010, 19: 19-27.

[110] Karlovčec M, Mladenić D. Interdisciplinarity of scientific fields and its evolution based on graph of project collaboration and co-authoring. Scientometrics, 2015, 102: 433-454.

[111] Rinia E J, Van Leeuwen T N, Bruins E E W. Scientometrics, 2001, 51: 293-309.

[112] Bromham L, Dinnage R, Hua X. Interdisciplinary research has consistently lower funding success. Nature, 2016, 534: 684-687.

[113] Tshitoyan V, Dagdelen J, Weston L. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature, 2019, 571: 95-98.

[114] Chang Y W, Huang M H. A study of the evolution of interdisciplinarity in library and information science: Using three bibliometric methods. J Am Soc Inf Sci, 2012, 63: 22-33.

[115] Palla G, Barabási A L, Vicsek T. Quantifying social group evolution. Nature, 2007, 446: 664-667.

[116] Zhang F L, Liu J J. Visual analysis on research status and themes evolution of discipline construction research in China. Agricultural Library and Information, 2019, 31: 31-39.

[117] Guo X Y, Zhang J J, Liu J. Chinese and foreign research progress and trend evolution of agricultural engineering in the past 20 years: Based on the perspective of Bibliometrics and Social Network Analysis. Jiangsu Agricultural Sciences, 2019, 47: 1-9.

[118] Han M Z, Li Y. Research on the evolution of discipline knowledge based on concept flows. In: Proceedings of the 2nd International Conference on Humanities Education and Social Sciences (ICHESS 2019). Paris: Atlantis Press, 2019.

[119] Guan P, Wang Y F, Fu Z. Analyzing topic semantic evolution with LDA: case study of Lithium Ion Batteries. Data Analysis and Knowledge Discovery, 2019, 3: 61-72. doi: 10.11925/infotech.2096-3467.2018.1404.

[120] Wu L, Liang X H, Song H Y. Empirical study of coevolution analysis based on technological keyword. Journal of Modern Information, 2019, 39: 137-142.

[121] Yang C. Collaborator recommendation on research social network platforms. Dissertation for Ph.D. Degree. Hefei: University of Science and Technology of China, 2015.

[122] Beel J, Gipp B, Langer S. Research-paper recommender systems: a literature survey. Int J Digit Libr, 2016, 17: 305-338.

[123] Zhang P L. Design and implementation of patent recommendation system based on knowledge graph. Dissertation for Master's Degree. Jinan: Shandong University, 2019.

[124] Ishag M I M, Park K H, Lee J Y. A Pattern-Based Academic Reviewer Recommendation Combining Author-Paper and Diversity Metrics. IEEE Access, 2019, 7: 16460-16475.

[125] Zhao J M, Qiu J P, Huang K, et al. A new scientometric indicator - review on h index and its applications. Bulletin of National Natural Science Foundation of China, 2008, 23-32.

[126] Ping S Q. Research on some issues of community detection in complex networks. Dissertation for Ph.D. Degree. Changchun: Jilin University, 2019.

[127] Garfield E. Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas. Science, 1955, 122: 108-111.

[128] Defferrard M, Bresson X, Vandergheynst P. Convolutional neural networks on graphs with fast localized spectral filtering. In: Proceedings of Advances in Neural Information Processing Systems. 2016. 3844-3852.

[129] Hamilton W, Ying Z, Leskovec J. Inductive representation learning on large graphs. In: Proceedings of Advances in Neural Information Processing Systems, 2017. 1024-1034.

[130] Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint, 2016.

[131] Veličković P, Cucurull G, Casanova A, et al. Graph attention networks. arXiv preprint, 2017.

[132] Ying R, He R, Chen K, et al. Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York: ACM, 2018. 974-983.

[133] Klein J T. A conceptual vocabulary of interdisciplinary science. Practising Interdisciplinarity, 2000, 3-24.

[134] Yang J L, Sun M J. Mining the information about discipline intercrossing from citation index data. Journal of the China Society for Scientific and Technical Information, 2004, 23: 672-676.

[135] Xu H Y, Dong K, Kui L. Research on Interdisciplinary Subject Identification and Prediction Methods. Beijing: Scientific and Technical Documentation Press, 2019. 18.

[136] Xu H Y, Yin C X, Guo T, et al. Interdisciplinary research review. Library and Information Service, 2015, 59: 119-127.

[137] Brillouin L. Science and Information Theory. Massachusetts: Courier Corporation, 2013.

[138] Lerman R I, Yitzhaki S. A note on the calculation and interpretation of the Gini index. Economics Lett, 1984, 15: 363-368.

[139] Chen B, Tsutsui S, Ding Y. Understanding the topic evolution in a scientific domain: An exploratory study for the field of information retrieval. J Informetrics, 2017, 11: 1175-1189.

[140] Huutoniemi K, Rafols I. Interdisciplinarity in Research Evaluation. Oxford: Oxford University Press, 2016.

[141] Jang W, Kwon H, Park Y. Predicting the degree of interdisciplinarity in academic fields: the case of nanotechnology. Scientometrics, 2018, 116: 231-254.

[142] Afterword: the emergent literature on interdisciplinary and transdisciplinary research evaluation. Res Eval, 2006, 15: 75-80.

[143] Klein J T. Evaluation of Interdisciplinary and Transdisciplinary Research. Am J Preventive Med, 2008, 35: S116-S123.

[144] Stirling A. A general framework for analysing diversity in science, technology and society. J R Soc Interface, 2007, 4: 707-719.

[145] Repko A F, Szostak R. Interdisciplinary Research: Process and Theory. Los Angeles: Sage, 2008.

[146] Andrew B, Born G. Interdisciplinarity: Reconfigurations of the Social and Natural Sciences. New York: Routledge, 2013.

[147] Rafols I, Leydesdorff L, O'Hare A. How journal rankings can suppress interdisciplinary research: A comparison between Innovation Studies and Business & Management. Res Policy, 2012, 41: 1262-1282.

[148] Asur S, Parthasarathy S, Ucar D. An event-based framework for characterizing the evolutionary behavior of interaction graphs. ACM Transactions on Knowledge Discovery from Data (TKDD), 2009, 3: 16.

[149] Zhao Z, Li C, Zhang X. An incremental method to detect communities in dynamic evolving social networks. Knowledge-Based Syst, 2019, 163: 404-415.

[150] Chen C. CiteSpace: Visualizing patterns and trends in scientific literature. Retrieved January, 2010, 27: 2010.

[151] Bastian M, Heymann S, Jacomy M. Gephi: an open source software for exploring and manipulating networks. In: Proceedings of the 3rd International AAAI Conference on Weblogs and Social Media. 2009.

[152] Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal, Complex Systems, 2006, 1695: 1-9.

[153] Hagberg A, Swart P, Chult D S. Exploring network structure, dynamics, and function using NetworkX. 2008.

[154] Tang J, Zhang J, Yao L M, et al. AMiner: extraction and mining of academic social networks. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'2008). New York: ACM, 2008. 990-998.

[155] Sinha A, Shen Z, Song Y, et al. An overview of Microsoft Academic Service (MAS) and applications. In: Proceedings of the 24th International Conference on World Wide Web (WWW'15 Companion). New York: ACM, 2015. 243-246.

[156] Zhou Y C, Chang Q L, Du Y. SKS: a platform for big data based scientific knowledge graph. Frontiers of Data and Computing, 2019, 1: 82-93.

  • Figure 1

    (Color online) Entity and relation model of the sci-tech knowledge graph

  • Figure 2

    (Color online) Construction and application framework of sci-tech knowledge graph

  • Figure 3

    (Color online) Two ambiguous forms of the author's name

  • Figure 4

    (Color online) Discipline vectorized

  • Figure 5

    (Color online) Examples of community evolution events

  • Table 1   Data analysis and application of sci-tech knowledge graph
    Sci-tech entity recommendation
    Application scenario | Association rule based | Content based | Collaborative filtering based | Hybrid approach
    Collaboration recommendation | [52,53] | [54,55] | [56,57] | [58]
    Resource recommendation | [53,59] | [60,61] | [62,63] | [64]
    Expert finding | [65] | [66] | [67,68] | [69-71]
    Sci-tech community discovery
    Application scenario | Collaboration based | Achievement citation based | Keyword based | Hybrid approach
    Similar interest researcher community | [72,73] | [74] | [75] | [81]
    Similar research topic community | [76] | [77-79] | [80] | [81,82]
    Sci-tech entity evaluation
    Application scenario | Index impact based | Combining qualitative and quantitative | Complex network based | Neural network based
    Scholar impact evaluation | [83-88] | [89,90] | [91-94] | [95]
    Project evaluation | [96,97]
    Other entity evaluation | [98] | [90,99] | [91-93,100] | [95]
    Interdisciplinary research
    Application scenario | Collaboration based | Citation based | Achievement content based | Hybrid approach
    Knowledge transfer | [101] | [102,103] | [104,105]
    Topic discovery | [106] | [107,108] | [109]
    Pattern discovery | [110] | [111] | [112,113] | [114]
    Discipline evolution research
    Application scenario | Citation based | Co-word based | Keyword based
    Sci-tech community evolution | [115]
    Discipline and topic evolution | [116-118] | [119,120]
  • Table 2   Comparison of related models$^{\rm~a)}$
    Property | GNN-based | HAR [93] | PPR [91] | PR [92]
    Neighborhood | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$
    Predicate | $\checkmark$ | $\checkmark$ | $\times$ | $\times$
    Centrality | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$
    Input Score | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\times$
    Flexibility | $\checkmark$ | $\times$ | $\times$ | $\times$

    a) In the table, "$\checkmark$" indicates that a property is supported and "$\times$" indicates that it is not.

  • Table 3   List of common sci-tech resource data sources
    Name | Description | Acquisition method | Full text access$^{\rm~a)}$ | Data provider
    CNKI | Literature of multiple disciplines, including journals, dissertations, patents, etc. | Crawling | Subscription | China Academic Journals Electronic Publishing House Co., Ltd.
    WANFANG Data | Literature of multiple disciplines, including journals, dissertations, patents, etc. | Crawling | Subscription | WANFANG DATA CO., LTD.
    Chongqing VIP | Literature of multiple disciplines, including journals. | Crawling | Subscription | Chongqing VIP Information Co., Ltd.
    IEEE Xplore | Mainly covers computer science, engineering, electronics, and other disciplines. | Crawling | Subscription | IEEE
    ScienceDirect | Mainly covers physical sciences and engineering, life sciences, etc. | Crawling | Subscription | Elsevier
    Scopus | Mainly covers chemical sciences, biological sciences, medical & health sciences, etc. | Crawling | Subscription | Elsevier
    PQDT | Mainly includes outstanding dissertations from well-known universities in Europe and the United States, covering multiple disciplines. | Crawling | Subscription | ProQuest
    Web of Knowledge | Citation indexing of academic literature, covering multiple disciplines. | Crawling | Subscription | Thomson Reuters
    AMiner | Mainly covers computer science, etc.; provides dataset downloads. | Free dataset | Full text link | Tsinghua University
    MAG | Covers multiple disciplines; provides dataset downloads. | Free dataset | Full text link | Microsoft
    arXiv | Preprint platform covering physics, mathematics, computer science, economics, and other disciplines; provides full-text downloads. | Crawling | Free full text | Cornell University
    DBLP | An integrated database system for computer science journals and conferences; provides dataset downloads. | Free dataset | Full text link | University of Trier; Schloss Dagstuhl
    Baidu Scholar | An indexing and retrieval platform for academic literature, covering multiple disciplines. | Crawling | Full text link | Baidu
    Google Scholar | An indexing and retrieval platform for academic literature, covering multiple disciplines. | Crawling | Full text link | Google
    CAS IR Grid | Index of the academic research achievements of the research institutions of the Chinese Academy of Sciences. | Crawling | Mostly restricted | Chinese Academy of Sciences
    TAIR | Index of the academic research achievements of Taiwan universities. | Crawling | Mostly restricted | Taiwan University

    a) In the full text access column, "Full text link" means that the resource provides a link for obtaining the full text; whether the full text is free depends on the link. For example, when a Google Scholar record links to a PDF file, the full text can usually be obtained for free.
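
The "Input Score" row of Table 2 can be made concrete with a small example. The sketch below is illustrative only and is not taken from the surveyed papers. It implements PageRank by power iteration on a hypothetical four-paper citation graph, and it treats personalized PageRank as the same iteration with a non-uniform teleport distribution, which is one common reading of what it means for a model to accept prior input scores. The function name `pagerank`, the damping factor 0.85, the iteration count, and the toy graph are all assumptions made for this example.

```python
def pagerank(nodes, edges, damping=0.85, personalization=None, iters=100):
    """Power-iteration PageRank over a directed edge list (illustrative sketch).

    personalization: optional dict mapping node -> non-negative prior weight.
    If None, teleportation is uniform, which gives plain PageRank.
    """
    n = len(nodes)
    if personalization is None:
        teleport = {v: 1.0 / n for v in nodes}
    else:
        total = sum(personalization.get(v, 0.0) for v in nodes)
        teleport = {v: personalization.get(v, 0.0) / total for v in nodes}

    # Adjacency list: out[v] holds the nodes that v links to (papers v cites).
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)

    rank = dict(teleport)  # start from the teleport distribution
    for _ in range(iters):
        nxt = {v: (1.0 - damping) * teleport[v] for v in nodes}
        for v in nodes:
            if out[v]:
                share = damping * rank[v] / len(out[v])
                for w in out[v]:
                    nxt[w] += share
            else:
                # Dangling node: spread its mass along the teleport distribution.
                for w in nodes:
                    nxt[w] += damping * rank[v] * teleport[w]
        rank = nxt
    return rank


# Hypothetical citation graph: B and C cite A, and D cites C.
papers = ["A", "B", "C", "D"]
citations = [("B", "A"), ("C", "A"), ("D", "C")]

pr = pagerank(papers, citations)                              # no input scores
ppr = pagerank(papers, citations, personalization={"D": 1.0})  # prior mass on D
```

With the uniform teleport vector, the most-cited paper A receives the highest score. Concentrating the teleport mass on D raises the score of D relative to plain PageRank, which shows how prior scores change the ranking without changing the graph.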

Copyright 2020 CHINA SCIENCE PUBLISHING & MEDIA LTD. All rights reserved.