SCIENCE CHINA Information Sciences, Volume 61, Issue 9: 092106(2018) https://doi.org/10.1007/s11432-017-9359-x

## A language-independent neural network for event detection

• Accepted: Feb 7, 2018
• Published: Aug 3, 2018

### Abstract

Event detection remains challenging because of the difficulty of encoding word semantics in various contexts. Previous approaches have depended heavily on language-specific knowledge and pre-existing natural language processing tools. However, compared with English, not all languages have such resources and tools available. A more promising approach is to learn effective features automatically from data, without relying on language-specific resources. In this study, we develop a language-independent neural network that captures both sequence and chunk information from specific contexts and uses them to train an event detector for multiple languages without any manually encoded features. Experiments show that our approach achieves robust, efficient, and accurate results across languages. On the ACE 2005 English event detection task, our approach achieved a 73.4% F-score, an average absolute improvement of 3.0% over the state of the art. Our experimental results are also competitive for Chinese and Spanish.

### Acknowledgment

This work was supported by National Natural Science Foundation of China (Grant Nos. 61632011, 61772156, 61702137).

### References

[1] Jurafsky D, Martin J H. Speech and Language Processing. London: Pearson Education India, 2000.

[2] Manning C D, Schütze H. Foundations of Statistical Natural Language Processing. Cambridge: MIT Press, 1999.

[3] Gao Y, Zhang H W, Zhao X B, et al. Event classification in microblogs via social tracking. ACM Trans Intel Syst Technol, 2017, 8: 35.

[4] Zhao S, Gao Y, Ding G. Real-time multimedia social event detection in microblog. IEEE Trans Cybern, 2017: 1--14.

[5] Nguyen T H, Grishman R. Event detection and domain adaptation with convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, 2015. 365--371.

[6] Peng H R, Song Y Q, Roth D. Event detection and co-reference with minimal supervision. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, 2016. 392--402.

[7] Wang Z Q, Zhang Y. A neural model for joint event detection and summarization. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Copenhagen, 2017.

[8] Hong Y, Zhang J F, Ma B, et al. Using cross-entity inference to improve event extraction. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, 2011.

[9] Ji H, Grishman R. Refining event extraction through cross-document inference. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL) with the Human Language Technology Conference, Columbus, 2008. 254--262.

[10] Li J W, Luong M T, Jurafsky D. A hierarchical neural autoencoder for paragraphs and documents. ArXiv preprint, 2015.

[11] Li Q, Ji H, Huang L. Joint event extraction via structured prediction with global features. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, 2013. 73--82.

[12] Liao S S, Grishman R. Using document level cross-event inference to improve event extraction. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, 2010. 789--797.

[13] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. ArXiv preprint, 2014.

[14] Feng X C, Tang D Y, Qin B, et al. English-Chinese knowledge base translation with neural network. In: Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, 2016. 2935--2944.

[15] Feng X C, Guo J, Qin B, et al. Effective deep memory networks for distant supervised relation extraction. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, 2017. 4002--4008.

[16] Zeng D J, Liu K, Lai S W, et al. Relation classification via convolutional deep neural network. In: Proceedings of the 25th International Conference on Computational Linguistics, Dublin, 2014. 2335--2344.

[17] Tang D Y, Qin B, Liu T. Document modeling with gated recurrent neural network for sentiment classification. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Lisbon, 2015. 1422--1432.

[18] Harris Z S. Distributional structure. Word, 1954, 10: 146--162.

[19] Feng X C, Huang L F, Tang D Y, et al. A language-independent neural network for event detection. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, 2016. 66--71.

[20] Chen Y B, Xu L H, Liu K, et al. Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, 2015. 167--176.

[21] Ahn D. The stages of event extraction. In: Proceedings of the Workshop on Annotating and Reasoning about Time and Events, Sydney, 2006.

[22] Li Q, Ji H. Incremental joint extraction of entity mentions and relations. In: Proceedings of the Association for Computational Linguistics, Baltimore, 2014. 402--412.

[23] McClosky D, Surdeanu M, Manning C D. Event extraction as dependency parsing. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, 2011. 1626--1635.

[24] Goyal K, Jauhar S K, Li H Y, et al. A structured distributional semantic model for event co-reference. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, 2013. 467--473.

[25] Li J W, Jurafsky D, Hovy E. When are tree structures necessary for deep learning of representations? ArXiv preprint, 2015.

[26] Graves A. Supervised Sequence Labelling with Recurrent Neural Networks. Berlin: Springer, 2012.

[27] Cao K, Li X, Fan M, et al. Improving event detection with active learning. In: Proceedings of Recent Advances in Natural Language Processing, Hissar, 2015. 72--77.

[28] Baroni M, Dinu G, Kruszewski G. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, 2014. 238--247.

[29] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 27th Annual Conference on Neural Information Processing Systems, Lake Tahoe, 2013.

[30] Hochreiter S, Schmidhuber J. LSTM can solve hard long time lag problems. In: Proceedings of the Conference on Neural Information Processing Systems, Denver, 1997. 473--479.

[31] Zaremba W, Sutskever I, Vinyals O. Recurrent neural network regularization. ArXiv preprint, 2014.

[32] Liu Y, Wei F R, Li S J, et al. A dependency-based neural network for relation classification. ArXiv preprint, 2015.

[34] Chen C, Ng V. Joint modeling for Chinese event extraction with rich linguistic features. In: Proceedings of COLING, Mumbai, 2012. 529--544.

[35] Chen Z, Ji H. Language specific issue and feature exploration in Chinese event extraction. In: Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics, Boulder, 2009. 209--212.

[36] Liu S L, Liu K, He S Z, et al. A probabilistic soft logic based approach to exploiting latent and global information in event classification. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, 2016. 2993--2999.

[37] Liu S L, Chen Y B, He S Z, et al. Leveraging FrameNet to improve automatic event detection. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, 2016. 2134--2143.

[38] Liu S L, Chen Y B, Liu K, et al. Exploiting argument information to improve event detection via supervised attention mechanisms. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, 2017. 1789--1798.

[39] Liu T, Che W X, Li Z H. Language technology platform. J Chinese Inf Proc, 2011, 25: 53--62.

[40] Tanev H, Zavarella V, Linge J, et al. Exploiting machine learning techniques to build an event extraction system for Portuguese and Spanish. Linguamática, 2009, 1: 55--66.

• Figure 1

(Color online) Event type and syntactic parser results of an example sentence.

• Figure 2

(Color online) Illustration of our model for the event trigger extraction (the trigger candidate here is “release”). ${\boldsymbol~F}_v$ and ${\boldsymbol~B}_v$ are the output of Bi-LSTM, while ${\boldsymbol~C}_2$ and ${\boldsymbol~C}_3$ are the output of CNN with convolutional filters with widths of $2$ and $3$, respectively.
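As a rough illustration of how the two feature streams in Figure 2 might be combined, the numpy sketch below concatenates forward/backward sequence summaries at the trigger position with max-pooled convolutional chunk features of widths 2 and 3. The cumulative-mean recurrence is only a stand-in for a real Bi-LSTM, and all dimensions and weights here are hypothetical, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_maxpool(X, W):
    """Slide a filter bank of width w over word embeddings X (n_words x d)
    and max-pool each feature map over all window positions."""
    w, d, k = W.shape           # filter width, embedding dim, number of filters
    n = X.shape[0]
    maps = np.stack([
        np.tanh(np.einsum('wd,wdk->k', X[i:i + w], W))
        for i in range(n - w + 1)
    ])                          # (n - w + 1) x k feature maps
    return maps.max(axis=0)     # max-pooling -> k chunk features

n_words, d, k = 7, 16, 8        # hypothetical sizes
X = rng.standard_normal((n_words, d))
v = 3                           # trigger candidate position

# Stand-ins for the Bi-LSTM outputs F_v and B_v at position v:
# cumulative means of the left and right context (a real model uses LSTM cells).
F_v = X[:v + 1].mean(axis=0)    # "forward" summary up to v
B_v = X[v:].mean(axis=0)        # "backward" summary from v onward

# Chunk features C_2 and C_3 from convolutional filters of widths 2 and 3.
C_2 = conv_maxpool(X, rng.standard_normal((2, d, k)))
C_3 = conv_maxpool(X, rng.standard_normal((3, d, k)))

# Concatenated trigger representation fed to the classifier.
trigger_repr = np.concatenate([F_v, B_v, C_2, C_3])
print(trigger_repr.shape)       # (d + d + k + k,) = (48,)
```

A real implementation would replace the mean summaries with trained LSTM cells and learn the filter weights jointly with the classifier.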

• Figure 3

LSTM cell.
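The LSTM cell of Figure 3 is conventionally defined by gated updates (cf. [26, 30]); one standard formulation, which may differ in minor details from the exact variant used in the paper, is:

```latex
\begin{aligned}
\boldsymbol{i}_t &= \sigma(\boldsymbol{W}_i \boldsymbol{x}_t + \boldsymbol{U}_i \boldsymbol{h}_{t-1} + \boldsymbol{b}_i), \\
\boldsymbol{f}_t &= \sigma(\boldsymbol{W}_f \boldsymbol{x}_t + \boldsymbol{U}_f \boldsymbol{h}_{t-1} + \boldsymbol{b}_f), \\
\boldsymbol{o}_t &= \sigma(\boldsymbol{W}_o \boldsymbol{x}_t + \boldsymbol{U}_o \boldsymbol{h}_{t-1} + \boldsymbol{b}_o), \\
\tilde{\boldsymbol{c}}_t &= \tanh(\boldsymbol{W}_c \boldsymbol{x}_t + \boldsymbol{U}_c \boldsymbol{h}_{t-1} + \boldsymbol{b}_c), \\
\boldsymbol{c}_t &= \boldsymbol{f}_t \odot \boldsymbol{c}_{t-1} + \boldsymbol{i}_t \odot \tilde{\boldsymbol{c}}_t, \\
\boldsymbol{h}_t &= \boldsymbol{o}_t \odot \tanh(\boldsymbol{c}_t),
\end{aligned}
```

where $\sigma$ is the logistic sigmoid, $\odot$ is element-wise multiplication, and $\boldsymbol{i}_t$, $\boldsymbol{f}_t$, $\boldsymbol{o}_t$ are the input, forget, and output gates.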

• Figure 4

(Color online) CNN structure.

• Figure 5

Comparison of the three languages.

• Table 1   Top 8 most similar words (in three clusters)

| Injure | Score | Fight | Score | Fire | Score |
|---|---|---|---|---|---|
| Injures | 0.602 | Fighting | 0.792 | Fires | 0.686 |
| Hurt | 0.593 | Fights | 0.762 | Aim | 0.683 |
| Harm | 0.592 | Battle | 0.702 | Enemy | 0.601 |
| Maim | 0.571 | Fought | 0.636 | Grenades | 0.597 |
| Injuring | 0.561 | Fight | 0.610 | Bombs | 0.585 |
| Endanger | 0.543 | Battles | 0.590 | Blast | 0.566 |
| Dislocate | 0.529 | Fighting | 0.588 | Burning | 0.562 |
| Kill | 0.527 | Bout | 0.570 | Smoke | 0.558 |
• Table 2   Hyperparameters used in our experiments on three languages

| Language | Embedding corpus | Embedding dimension | Learning method | Parameters |
|---|---|---|---|---|
| English | NYT | 300 | SGD | learning rate $r=0.03$ |
| Chinese | Gigaword | 300 | Adadelta | $p=0.95$, $\delta=1{\rm e}^{-6}$ |
| Spanish | Gigaword | 300 | Adadelta | $p=0.95$, $\delta=1{\rm e}^{-6}$ |
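The Adadelta settings in Table 2 ($p=0.95$, $\delta=1{\rm e}^{-6}$) correspond to the decay rate and the numerical-stability constant in the standard Adadelta update rule. A minimal numpy sketch of that rule follows, applied to a toy quadratic objective; the objective and dimensions are illustrative only.

```python
import numpy as np

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """One Adadelta update: decay-accumulate E[g^2], scale the step by
    RMS of past updates over RMS of past gradients, then accumulate E[dx^2]."""
    Eg2, Edx2 = state
    Eg2 = rho * Eg2 + (1 - rho) * grad ** 2
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad
    Edx2 = rho * Edx2 + (1 - rho) * dx ** 2
    return dx, (Eg2, Edx2)

# Toy objective f(x) = ||x - target||^2, gradient 2 (x - target).
target = np.array([1.0, -2.0, 0.5])
x = np.zeros(3)
state = (np.zeros(3), np.zeros(3))
for _ in range(2000):
    grad = 2.0 * (x - target)
    dx, state = adadelta_step(grad, state)
    x += dx
print(np.round(x, 2))
```

Note that Adadelta needs no hand-tuned learning rate: the $\sqrt{E[\Delta x^2]+\delta}/\sqrt{E[g^2]+\delta}$ ratio adapts the effective step size per dimension, which is why Table 2 lists only $p$ and $\delta$ for the Chinese and Spanish runs.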
• Table 3   # of documents

| Data set | English ACE2005 | Chinese ACE2005 | Spanish ERE |
|---|---|---|---|
| Train set | 529 | 513 | 93 |
| Dev set | 30 | 60 | 12 |
| Test set | 40 | 60 | 12 |
• Table 4   Comparison of different methods on the English event detection (Id. = trigger identification, Cls. = trigger classification)

| Model | Id. Precision | Id. Recall | Id. F-score | Cls. Precision | Cls. Recall | Cls. F-score |
|---|---|---|---|---|---|---|
| MaxEnt | 76.2 | 60.5 | 67.4 | 74.5 | 59.1 | 65.9 |
| Cross-event | N/A | N/A | N/A | 68.7 | 68.9 | 68.8 |
| Cross-entity | N/A | N/A | N/A | 72.9 | 64.3 | 68.3 |
| Joint model | 76.9 | 65.0 | 70.4 | 73.7 | 62.3 | 67.5 |
| PSL | N/A | N/A | N/A | 75.3 | 64.4 | 69.4 |
| PR | N/A | N/A | N/A | 68.9 | 72.0 | 70.4 |
| CNN | 80.4 | 67.7 | 73.5 | 75.6 | 63.6 | 69.1 |
| RNN | 73.2 | 63.5 | 67.4 | 67.3 | 59.9 | 64.2 |
| LSTM | 78.6 | 67.4 | 72.6 | 74.5 | 60.7 | 66.9 |
| Bi-LSTM | 80.1 | 69.4 | 74.3 | 81.6 | 62.3 | 70.6 |
| FN | N/A | N/A | N/A | 77.6 | 65.2 | 70.7 |
| ANN | N/A | N/A | N/A | 76.8 | 67.5 | 71.9 |
| HNN | 80.8 | 71.5 | 75.9 | 84.6 | 64.9 | 73.4 |
• Table 5   Case study for English event detection

| English sentence example (trigger and event type in parentheses) | Li [11] | Chen [20] | Our method |
|---|---|---|---|
| Davies is leaving (end-position) to become chairman of the London school of economics, one of the best-known parts of the University of London. | Missing error | Classification error | Correct |
| Palestinian security forces returned Monday to the positions they held in the Gaza Strip before the outbreak of the 33-month Palestinian uprising (attack) as Israel removed all major checkpoints in the coastal territory, a Palestinian security source said. | Missing error | Correct | Correct |
| U.S. and British troops were moving on the strategic southern port city of Basra Saturday after a massive aerial assault pounded (attack) Baghdad at dawn. | Missing error | Missing error | Correct |
| Thousands of Iraq's majority Shiite Muslims marched (transport) to their main mosque in Baghdad to mark the birthday of Islam's founder Prophet Mohammed. | Classification error | Correct | Correct |
• Table 6   Results on the Chinese event detection

| Model | Id. Precision | Id. Recall | Id. F-score | Cls. Precision | Cls. Recall | Cls. F-score |
|---|---|---|---|---|---|---|
| MaxEnt | 50.0 | 77.0 | 60.6 | 47.5 | 73.1 | 57.6 |
| Rich-C | 62.2 | 71.9 | 66.7 | 58.9 | 68.1 | 63.2 |
| HNN | 74.2 | 63.1 | 68.2 | 77.1 | 53.1 | 63.0 |
• Table 7   Results on the Spanish event detection

| Model | Id. Precision | Id. Recall | Id. F-score | Cls. Precision | Cls. Recall | Cls. F-score |
|---|---|---|---|---|---|---|
| LSTM | 62.2 | 52.9 | 57.2 | 56.9 | 32.6 | 41.6 |
| Bi-LSTM | 76.2 | 63.1 | 68.7 | 61.5 | 42.2 | 50.1 |
| HNN | 81.4 | 65.2 | 71.6 | 66.3 | 47.8 | 55.5 |

Copyright 2020 Science China Press Co., Ltd. All rights reserved.