
SCIENTIA SINICA Informationis, Volume 50, Issue 4: 551-575 (2020) https://doi.org/10.1360/N112018-00225

Research progress of large-scale knowledge graph completion technology

  • Received Aug 22, 2018
  • Accepted Mar 17, 2019
  • Published Apr 13, 2020

Abstract

With the continued growth of knowledge graphs such as the Google Knowledge Graph, DBpedia, Microsoft Concept Graph, and YAGO, knowledge representation systems built on RDF have become widely known. The RDF triple has become the basic format for describing real-world knowledge: its structure is simple and its logic is clear, so it is easy to understand and implement. Nevertheless, extremely complicated knowledge and common sense are difficult to describe completely, and the construction process of a knowledge graph inevitably leaves the knowledge it contains incomplete. Knowledge base completion technology is therefore particularly important: any existing knowledge graph must be improved continuously through completion and newly inferred knowledge. Starting from the construction of a knowledge graph, this paper divides the problem of knowledge graph completion into two levels: concept completion and instance completion. (1) The concept completion level focuses primarily on completing entity types. It is described in three development stages: logical reasoning mechanisms based on description logic, type inference mechanisms based on traditional machine learning, and type inference mechanisms based on representation learning. (2) The instance completion level can be further divided into RDF triple completion and new instance discovery. This paper focuses on RDF triple completion, which covers entity completion and relation completion, and is described in three development stages: statistical relational learning, probabilistic learning based on random walks, and knowledge representation learning.
By reviewing and discussing the research history, current status, and latest progress of large-scale knowledge graph completion, we present the challenges that the technology faces and the prospects for future work.
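
The two RDF triple completion tasks named in the abstract, entity completion and relation completion, can be illustrated with a minimal Python sketch. The `Triple` type, the `candidates` helper, and the toy facts are hypothetical names and data chosen for illustration; they are not taken from the paper. The sketch only enumerates candidate triples for one missing slot; a completion model would then score or rank these candidates.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """An RDF-style (head, relation, tail) triple."""
    head: str
    relation: str
    tail: str


# Toy knowledge graph; the facts are illustrative assumptions only.
KG = {
    Triple("Beijing", "capitalOf", "China"),
    Triple("Beijing", "locatedIn", "China"),
    Triple("Paris", "capitalOf", "France"),
}


def candidates(kg, head: Optional[str], relation: Optional[str],
               tail: Optional[str]):
    """Fill the single slot left as None with every known entity or
    relation, and return the resulting triples that are absent from kg."""
    if [head, relation, tail].count(None) != 1:
        raise ValueError("exactly one slot must be None")
    entities = sorted({t.head for t in kg} | {t.tail for t in kg})
    relations = sorted({t.relation for t in kg})
    if head is None:            # entity completion: (?, r, t)
        pool = [Triple(e, relation, tail) for e in entities]
    elif relation is None:      # relation completion: (h, ?, t)
        pool = [Triple(head, r, tail) for r in relations]
    else:                       # entity completion: (h, r, ?)
        pool = [Triple(head, relation, e) for e in entities]
    return [t for t in pool if t not in kg]
```

For example, `candidates(KG, "Paris", None, "France")` is a relation completion query, and `candidates(KG, "Paris", "capitalOf", None)` is an entity completion query over the tail slot.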


Funded by

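National Natural Science Foundation of China (61532010, 61532016, 91846204, 91646203, 61762082)

National Key Research and Development Program of China (2016YFB1000602, 2016YFB1000603)

Research Funds of Renmin University of China (11XNL010)

Science and Technology Opening-up Cooperation Project of Henan Province (172106000077)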



  • Figure 1

    (Color online) KB construction process

  • Figure 2

    Conceptual hierarchy model of entity type

  • Figure 3

    Example for instance completion

  • Figure 4

    Classification of knowledge graph completion methods

  • Figure 5

    Statistical relational learning characteristics

  • Figure 6

    Translation models. (a) TransE; (b) TransH; (c) TransR; (d) TransD

  • Figure 7

    Time axis of the development of KB completion

  • Table 1   KB construction methods
    Construction method | Schema (Y/N) | Typical KB
    Manual, experts | Y | OpenCyc, UMLS, WordNet
    Manual, volunteers | Y | Wikidata, Freebase
    Automatic, semi-structured | Y | YAGO, DBpedia, Freebase
    Automatic, unstructured | Y | Knowledge Vault, NELL, PATTY, DeepDive
    Automatic, unstructured | N | ReVerb, OLLIE, PRISMATIC
  • Table 2   Comparison of main methods of SRL
    Comparative factors | Probabilistic relational models | Markov logic networks | Relational Markov networks | Bayes logic networks
    Model class hierarchy | Unidirectional graph model | Logical clause | Bidirectional graph model | Bipartite monograph model
    Parameter estimation | Maximum likelihood estimation, filling CPT | Maximum likelihood estimation, learning weight | Bayes relational classifier, learning CPT | Maximum likelihood estimation, filling CPT
    Structure learning | Score-based learning | ILP | Conditional relation learner | ILP
    Inference graph | Bayes networks | Markov networks | Undirected model | Bayes networks
    Inference method | Belief propagation | Quasi-likelihood estimation | Quasi-likelihood estimation | Bayes networks inference
    Self-correlation | Self-cycling in class hierarchical model | Additional variables | Yes | Not involved
    Multi-relational processing | Need integration | No need | No need | Need to combine rules
  • Table 3   Comparison of main translation models
    Model name | Evaluation function | Optimization | Time complexity | Space complexity
    Unstructured | $\lVert e^h - e^t \rVert_2^2$ | SGD | $O(N_t)$ | $O(N_e m)$
    NTN | ${r^l}^{\mathrm T} \tanh\left({e^h}^{\mathrm T} M_r e^t + W_{r,1} e^h + W_{r,2} e^t + b_r\right)$ | L-BFGS | $O\left(((m^2 + m)s + 2mk + k) N_t\right)$ | $O\left(N_e m + N_r (n^2 s + 2ns + 2s)\right)$
    SE | $\lVert W_{r,1} e^h - W_{r,2} e^t \rVert_2$ | SGD | $O(2 m^2 N_t)$ | $O(N_e m + N_r)$
    SME | $(W_{1,1} e^h + W_{1,2} r^l + b_1)^{\mathrm T} \cdot (W_{2,1} e^t + W_{2,2} r^l + b_2)$ | SGD | $O(4mk N_t)$ | $O(N_e m + N_r n + 4mk + 4k)$
    RESCAL | $\langle e^h \mid R^l \mid e^t \rangle$ | SGD | $O(mn N_t)$ | $O(N_e m + N_r n^2)$
    TransE | $\lVert e^h + r^l - e^t \rVert_2$ | SGD | $O(N_t)$ | $O(N_e m + N_r n)$
    TransH | $\lVert (e^h - \langle w^l \mid e^h w^l \rangle) + r^l - (e^t - \langle w^l \mid e^t w^l \rangle) \rVert_2^2$ | SGD | $O(2m N_t)$ | $O(N_e m + 2 N_r n)$
    TransR | $\lVert \langle e^h \mid M_l \rangle + r^l - \langle e^t \mid M_l \rangle \rVert_2$ | SGD | $O(2mn N_t)$ | $O(N_e m + N_r (m + 1) n)$
    CTransR | $\lVert \langle e^h \mid M_l \rangle + r^l - \langle e^t \mid M_l \rangle \rVert_2$ | SGD | $O(2mn N_t)$ | $O(N_e m + N_r (m + d) n)$
    TransD | $\lVert (I + r_p h_p^{\mathrm T}) e^h + r^l - (I + r_p t_p^{\mathrm T}) e^t \rVert_2$ | AdaGrad | $O(2n N_t)$ | $O(2 N_e m + 2 N_r n)$
    TransA | $-(\lvert h + r - t \rvert)^{\mathrm T} M_r (\lvert h + r - t \rvert)$ | SGD | $O(m^2)$ | $O(N_e m + N_r m^2)$
    TranSparse | $\lVert W_r^h(\theta_r^h) e^h + r^l - W_r^h(\theta_r^h) e^t \rVert_2$ | SGD | $O(2(1 - \hat\theta) mn N_t)$ | $O(N_e m + N_r (1 - \hat\theta)(m + 1) n)$
    LFM | $\langle y \mid R^l \mid y' \rangle + \langle e^h \mid R^l \mid z \rangle + \langle z \mid R^l \mid e^t \rangle + \langle e^h \mid R^l \mid e^t \rangle$ | SGD | $O(N_e m + N_r n^2)$ | $O(N_e m + N_r n + 10 n^2)$
    CirE | $\lVert e^h + r^l - e^t \rVert_2$ | SGD | $O(m \log m N_t)$ | $O(N_e m + 2 N_r n)$
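
The TransE and TransH evaluation functions in Table 3 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any cited system: the function names and the hand-picked two-dimensional vectors are hypothetical, nothing here is trained, and the TransH row is read as projecting each entity onto the hyperplane with unit normal $w^l$ (that is, $e - \langle w^l, e \rangle w^l$), which is an interpretation of the table's bracket notation. Lower scores indicate more plausible triples.

```python
import math


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def transe_score(e_h, r_l, e_t):
    """TransE row of Table 3: ||e^h + r^l - e^t||_2."""
    return l2_norm([h + r - t for h, r, t in zip(e_h, r_l, e_t)])


def project_to_hyperplane(e, w):
    """Project e onto the hyperplane whose unit normal is w."""
    dot = sum(wi * ei for wi, ei in zip(w, e))
    return [ei - dot * wi for ei, wi in zip(e, w)]


def transh_score(e_h, r_l, e_t, w_l):
    """TransH row of Table 3 (squared norm), assuming w_l has unit length."""
    h_p = project_to_hyperplane(e_h, w_l)
    t_p = project_to_hyperplane(e_t, w_l)
    return transe_score(h_p, r_l, t_p) ** 2


def rank_tails(e_h, r_l, entity_vecs):
    """Rank candidate tail entities for the query (h, r, ?) by TransE score."""
    return sorted(entity_vecs,
                  key=lambda name: transe_score(e_h, r_l, entity_vecs[name]))


# Hand-picked toy embeddings (assumptions for illustration, not learned).
ENTITY_VECS = {
    "Beijing": [0.0, 0.0],
    "China": [1.0, 1.0],
    "France": [3.0, 0.0],
}
R_CAPITAL_OF = [1.0, 1.0]
```

With these toy vectors, `rank_tails(ENTITY_VECS["Beijing"], R_CAPITAL_OF, ENTITY_VECS)` places `"China"` first, because the translated head lands exactly on that entity's vector. Entity completion with a translation model amounts to this kind of ranking over all candidate entities, which is why the per-triple cost in the time complexity column matters at large scale.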

Copyright 2020 China Science Publishing & Media Ltd. All rights reserved.
