
SCIENCE CHINA Information Sciences, Volume 61, Issue 9: 092106 (2018) https://doi.org/10.1007/s11432-017-9359-x

A language-independent neural network for event detection

  • Received: Oct 16, 2017
  • Accepted: Feb 7, 2018
  • Published: Aug 3, 2018

Abstract

Event detection remains challenging because of the difficulty of encoding word semantics in various contexts. Previous approaches have depended heavily on language-specific knowledge and pre-existing natural language processing (NLP) tools. However, not all languages have such resources and tools available, in contrast to English. A more promising approach is to automatically learn effective features from data, without relying on language-specific resources. In this study, we develop a language-independent neural network that captures both sequence and chunk information from specific contexts and uses them to train an event detector for multiple languages without any manually encoded features. Experiments show that our approach achieves robust, efficient, and accurate results across languages. On the ACE 2005 English event detection task, our approach achieved a 73.4% F-score, an average absolute improvement of 3.0% over the state of the art. Our results for Chinese and Spanish are also competitive.
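
As a rough illustration of the hybrid architecture described above (and pictured in Figure 2 below), here is a minimal sketch assuming a PyTorch implementation. Only the overall structure is taken from the paper: Bi-LSTM sequence features at the trigger-candidate position are concatenated with max-pooled CNN chunk features (filter widths 2 and 3) and fed to a classifier over event types. All names and sizes (HybridEventDetector, hidden_dim, num_filters, the 34-class output) are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of a hybrid Bi-LSTM + CNN trigger classifier (assumed
# PyTorch implementation; layer names and sizes are illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridEventDetector(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=100,
                 num_filters=150, num_event_types=34):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Sequence (global context) features from a bidirectional LSTM
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Chunk (local context) features from convolutions of widths 2 and 3
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, num_filters, kernel_size=w) for w in (2, 3)])
        self.out = nn.Linear(2 * hidden_dim + 2 * num_filters, num_event_types)

    def forward(self, token_ids, trigger_pos):
        x = self.embed(token_ids)                      # (batch, seq, emb)
        seq_feats, _ = self.bilstm(x)                  # (batch, seq, 2*hidden)
        # Bi-LSTM output at the trigger-candidate position
        idx = trigger_pos.view(-1, 1, 1).expand(-1, 1, seq_feats.size(-1))
        trig_feats = seq_feats.gather(1, idx).squeeze(1)
        # Max-pool each convolutional feature map over the whole sentence
        conv_in = x.transpose(1, 2)                    # (batch, emb, seq)
        chunk_feats = [F.relu(conv(conv_in)).max(dim=2).values
                       for conv in self.convs]
        feats = torch.cat([trig_feats] + chunk_feats, dim=1)
        return self.out(feats)                         # scores over event types
```

Under these assumptions, a trigger candidate is classified by passing the sentence's token ids and the candidate's position through the model and taking the argmax over the returned scores, with one class reserved for "not an event trigger".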


Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61632011, 61772156, 61702137).


References

[1] Jurafsky D, Martin J H. Speech and Language Processing. London: Pearson Education India, 2000.

[2] Manning C D, Schütze H. Foundations of Statistical Natural Language Processing. Cambridge: MIT Press, 1999.

[3] Gao Y, Zhang H W, Zhao X B, et al. Event classification in microblogs via social tracking. ACM Trans Intell Syst Technol, 2017, 8: 35.

[4] Zhao S, Gao Y, Ding G. Real-time multimedia social event detection in microblog. IEEE Trans Cybern, 2017: 1--14.

[5] Nguyen T H, Grishman R. Event detection and domain adaptation with convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, 2015. 365--371.

[6] Peng H R, Song Y Q, Roth D. Event detection and co-reference with minimal supervision. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, 2016. 392--402.

[7] Wang Z Q, Zhang Y. A neural model for joint event detection and summarization. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Copenhagen, 2017.

[8] Hong Y, Zhang J F, Ma B, et al. Using cross-entity inference to improve event extraction. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, 2011.

[9] Ji H, Grishman R. Refining event extraction through cross-document inference. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL) with the Human Language Technology Conference, Columbus, 2008. 254--262.

[10] Li J W, Luong M, Jurafsky D. A hierarchical neural autoencoder for paragraphs and documents. arXiv preprint, 2015.

[11] Li Q, Ji H, Huang L. Joint event extraction via structured prediction with global features. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, 2013. 73--82.

[12] Liao S S, Grishman R. Using document level cross-event inference to improve event extraction. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, 2010. 789--797.

[13] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv preprint, 2014.

[14] Feng X C, Tang D Y, Qin B, et al. English-Chinese knowledge base translation with neural network. In: Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, 2016. 2935--2944.

[15] Feng X C, Guo J, Qin B, et al. Effective deep memory networks for distant supervised relation extraction. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, 2017. 4002--4008.

[16] Zeng D J, Liu K, Lai S W, et al. Relation classification via convolutional deep neural network. In: Proceedings of the 25th International Conference on Computational Linguistics, Dublin, 2014. 2335--2344.

[17] Tang D Y, Qin B, Liu T. Document modeling with gated recurrent neural network for sentiment classification. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Lisbon, 2015. 1422--1432.

[18] Harris Z S. Distributional structure. Word, 1954, 10: 146--162.

[19] Feng X C, Huang L F, Tang D Y, et al. A language-independent neural network for event detection. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, 2016. 66--71.

[20] Chen Y B, Xu L H, Liu K, et al. Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, 2015. 167--176.

[21] Ahn D. The stages of event extraction. In: Proceedings of the Workshop on Annotating and Reasoning about Time and Events, Sydney, 2006.

[22] Li Q, Ji H. Incremental joint extraction of entity mentions and relations. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, 2014. 402--412.

[23] McClosky D, Surdeanu M, Manning C D. Event extraction as dependency parsing. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, 2011. 1626--1635.

[24] Goyal K, Jauhar S K, Li H Y, et al. A structured distributional semantic model for event co-reference. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, 2013. 467--473.

[25] Li J W, Jurafsky D, Hovy E. When are tree structures necessary for deep learning of representations? arXiv preprint, 2015.

[26] Graves A. Supervised Sequence Labelling with Recurrent Neural Networks. Berlin: Springer, 2012.

[27] Cao K, Li X, Fan M, et al. Improving event detection with active learning. In: Proceedings of Recent Advances in Natural Language Processing, Hissar, 2015. 72--77.

[28] Baroni M, Dinu G, Kruszewski G. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, 2014. 238--247.

[29] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 27th Annual Conference on Neural Information Processing Systems, Lake Tahoe, 2013.

[30] Hochreiter S, Schmidhuber J. LSTM can solve hard long time lag problems. In: Proceedings of the Conference on Neural Information Processing Systems, Denver, 1997. 473--479.

[31] Zaremba W, Sutskever I, Vinyals O. Recurrent neural network regularization. arXiv preprint, 2014.

[32] Liu Y, Wei F R, Li S J, et al. A dependency-based neural network for relation classification. arXiv preprint, 2015.

[33] Zeiler M D. ADADELTA: an adaptive learning rate method. arXiv preprint, 2012.

[34] Chen C, Ng V. Joint modeling for Chinese event extraction with rich linguistic features. In: Proceedings of the 24th International Conference on Computational Linguistics, Mumbai, 2012. 529--544.

[35] Chen Z, Ji H. Language specific issue and feature exploration in Chinese event extraction. In: Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics, Boulder, 2009. 209--212.

[36] Liu S L, Liu K, He S Z, et al. A probabilistic soft logic based approach to exploiting latent and global information in event classification. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, 2016. 2993--2999.

[37] Liu S L, Chen Y B, He S Z, et al. Leveraging FrameNet to improve automatic event detection. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, 2016. 2134--2143.

[38] Liu S L, Chen Y B, Liu K, et al. Exploiting argument information to improve event detection via supervised attention mechanisms. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, 2017. 1789--1798.

[39] Liu T, Che W X, Li Z H. Language technology platform. J Chinese Inf Proc, 2011, 25: 53--62.

[40] Tanev H, Zavarella V, Linge J, et al. Exploiting machine learning techniques to build an event extraction system for Portuguese and Spanish. Linguamática, 2009, 1: 55--66.

  • Figure 1

    (Color online) Event type and syntactic parser results of an example sentence.

  • Figure 2

    (Color online) Illustration of our model for event trigger extraction (the trigger candidate here is “release”). $\boldsymbol{F}_v$ and $\boldsymbol{B}_v$ are the outputs of the Bi-LSTM, while $\boldsymbol{C}_2$ and $\boldsymbol{C}_3$ are the outputs of the CNN with convolutional filters of widths $2$ and $3$, respectively.

  • Figure 3

    LSTM cell.

  • Figure 4

    (Color online) CNN structure.

  • Figure 5

    Comparison of the three languages.

  • Table 1   Top 8 most similar words (in three clusters)
    Injure Score Fight Score Fire Score
    Injures 0.602 Fighting 0.792 Fires 0.686
    Hurt 0.593 Fights 0.762 Aim 0.683
    Harm 0.592 Battle 0.702 Enemy 0.601
    Maim 0.571 Fought 0.636 Grenades 0.597
    Injuring 0.561 Fight 0.610 Bombs 0.585
    Endanger 0.543 Battles 0.590 Blast 0.566
    Dislocate 0.529 Fighting 0.588 Burning 0.562
    Kill 0.527 Bout 0.570 Smoke 0.558
  • Table 2   Hyperparameters used in our experiments on the three languages (a minimal optimizer-configuration sketch follows the tables below)
    Language | Embedding corpus | Embedding dimension | Learning method | Parameters
    English  | NYT      | 300 | SGD      | learning rate $r = 0.03$
    Chinese  | Gigaword | 300 | Adadelta | $p = 0.95$, $\delta = 10^{-6}$
    Spanish  | Gigaword | 300 | Adadelta | $p = 0.95$, $\delta = 10^{-6}$
  • Table 3   Number of documents in each data set
    Data set  | English ACE 2005 | Chinese ACE 2005 | Spanish ERE
    Train set | 529 | 513 | 93
    Dev set   | 30  | 60  | 12
    Test set  | 40  | 60  | 12
  • Table 4   Comparison of different methods on English event detection
    Model        | Trigger identification (P / R / F) | Trigger classification (P / R / F)
    MaxEnt       | 76.2 / 60.5 / 67.4 | 74.5 / 59.1 / 65.9
    Cross-event  | N/A                | 68.7 / 68.9 / 68.8
    Cross-entity | N/A                | 72.9 / 64.3 / 68.3
    Joint model  | 76.9 / 65.0 / 70.4 | 73.7 / 62.3 / 67.5
    PSL          | N/A                | 75.3 / 64.4 / 69.4
    PR           | N/A                | 68.9 / 72.0 / 70.4
    CNN          | 80.4 / 67.7 / 73.5 | 75.6 / 63.6 / 69.1
    RNN          | 73.2 / 63.5 / 67.4 | 67.3 / 59.9 / 64.2
    LSTM         | 78.6 / 67.4 / 72.6 | 74.5 / 60.7 / 66.9
    Bi-LSTM      | 80.1 / 69.4 / 74.3 | 81.6 / 62.3 / 70.6
    FN           | N/A                | 77.6 / 65.2 / 70.7
    ANN          | N/A                | 76.8 / 67.5 / 71.9
    HNN          | 80.8 / 71.5 / 75.9 | 84.6 / 64.9 / 73.4
  • Table 5   Case study for English event detection
    English sentence example | Li [11] | Chen [20] | Our method
    "Davies is leaving (end-position) to become chairman of the London school of economics, one of the best-known parts of the University of London." | Missing error | Classification error | Correct
    "Palestinian security forces returned Monday to the positions they held in the Gaza Strip before the outbreak of the 33-month Palestinian uprising (attack) as Israel removed all major checkpoints in the coastal territory, a Palestinian security source said." | Missing error | Correct | Correct
    "U.S. and British troops were moving on the strategic southern port city of Basra Saturday after a massive aerial assault pounded (attack) Baghdad at dawn." | Missing error | Missing error | Correct
    "Thousands of Iraq's majority Shiite Muslims marched (transport) to their main mosque in Baghdad to mark the birthday of Islam's founder Prophet Mohammed." | Classification error | Correct | Correct
  • Table 6   Results on Chinese event detection
    Model  | Trigger identification (P / R / F) | Trigger classification (P / R / F)
    MaxEnt | 50.0 / 77.0 / 60.6 | 47.5 / 73.1 / 57.6
    Rich-C | 62.2 / 71.9 / 66.7 | 58.9 / 68.1 / 63.2
    HNN    | 74.2 / 63.1 / 68.2 | 77.1 / 53.1 / 63.0
  • Table 7   Results on Spanish event detection
    Model   | Trigger identification (P / R / F) | Trigger classification (P / R / F)
    LSTM    | 62.2 / 52.9 / 57.2 | 56.9 / 32.6 / 41.6
    Bi-LSTM | 76.2 / 63.1 / 68.7 | 61.5 / 42.2 / 50.1
    HNN     | 81.4 / 65.2 / 71.6 | 66.3 / 47.8 / 55.5
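
As noted under Table 2, the following is a minimal sketch, assuming PyTorch optimizers, of the per-language training settings listed there. The function name make_optimizer and the parameters argument are hypothetical; only the optimizer choices and values come from the table, assuming its $p$ and $\delta$ correspond to Adadelta's decay constant and numerical-stability term.

```python
# Minimal sketch (assumed PyTorch) of the per-language optimizer settings
# reported in Table 2; the function and argument names are illustrative.
import torch

def make_optimizer(language, parameters):
    if language == "english":
        # SGD with learning rate r = 0.03 (Table 2)
        return torch.optim.SGD(parameters, lr=0.03)
    if language in ("chinese", "spanish"):
        # Adadelta with p = 0.95 and delta = 1e-6 (Table 2)
        return torch.optim.Adadelta(parameters, rho=0.95, eps=1e-6)
    raise ValueError(f"unsupported language: {language}")
```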
