A Language-Independent Neural Network for Event Detection

ACL, 2016.

Keywords:
natural language processing, event detection, neural networks, representation learning
Weibo:
We introduce a hybrid neural network model, which incorporates both bidirectional LSTMs and convolutional neural networks to capture sequence and structure semantic information from specific contexts, for event detection

Abstract:

Event detection remains a challenge because of the difficulty of encoding word semantics in various contexts. Previous approaches have depended heavily on language-specific knowledge and pre-existing natural language processing tools. However, compared with English, not all languages have such resources and tools available.

Introduction
  • Event detection aims to extract event triggers and classify them into specific types precisely.
  • It is a crucial and quite challenging sub-task of event extraction, because the same event might appear in the form of various trigger expressions and an expression might represent different event types in different contexts.
  • [Figure: example sentences annotated with POS tags and dependency labels, illustrating triggers for the TransferMoney and ReleaseParole event types]
  • Most previous methods (Ji et al., 2008; Liao et al., 2010; Hong et al., 2011; Li et al., 2013; Li et al., 2015b) considered event detection as a classification problem.
Highlights
  • Event detection aims to extract event triggers and classify them into specific types precisely
  • We develop a hybrid neural network incorporating two types of neural networks, a bidirectional long short-term memory (Bi-LSTM) network and a convolutional neural network (CNN), to model both sequence and chunk information from specific contexts
  • We introduce this hybrid neural network, which combines the Bi-LSTM and CNN, to learn a continuous representation for each word in a sentence
  • The HNN approach performed better than LSTM and Bi-LSTM, indicating that our proposed model achieves the best performance in multiple languages among neural network methods
  • We introduce a hybrid neural network model, which incorporates both bidirectional LSTMs and convolutional neural networks to capture sequence and structure semantic information from specific contexts, for event detection
  • We find that the bi-directional LSTM is powerful for trigger extraction, capturing long-distance preceding and following contexts
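The hybrid representation described above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: a plain tanh recurrence stands in for the LSTM cell, the dimensions are arbitrary, and all function and parameter names are hypothetical. The key idea it shows is the concatenation of forward context, backward context, and a max-pooled convolutional chunk feature for each word.

```python
import numpy as np

def conv_maxpool(embeds, filters):
    """Convolve filters over word windows, then max-over-time pool (chunk features)."""
    k, width, d = filters.shape
    n = embeds.shape[0] - width + 1
    feats = np.empty((n, k))
    for i in range(n):
        window = embeds[i:i + width]                      # (width, d) word window
        feats[i] = np.tanh((filters * window).sum(axis=(1, 2)))
    return feats.max(axis=0)                              # (k,) pooled chunk feature

def rnn_states(embeds, W, U):
    """Plain tanh recurrence standing in for an LSTM; returns all hidden states."""
    h = np.zeros(U.shape[0])
    states = []
    for x in embeds:
        h = np.tanh(W @ x + U @ h)
        states.append(h)
    return np.stack(states)                               # (seq_len, hidden)

def hybrid_representation(embeds, Wf, Uf, Wb, Ub, filters, t):
    """Concatenate forward context, backward context, and chunk feature for word t."""
    fwd = rnn_states(embeds, Wf, Uf)                      # preceding context
    bwd = rnn_states(embeds[::-1], Wb, Ub)[::-1]          # following context
    chunk = conv_maxpool(embeds, filters)                 # local structure feature
    return np.concatenate([fwd[t], bwd[t], chunk])
```

The resulting per-word vector would then feed a softmax classifier over trigger types; in the actual model the recurrence is a gated LSTM cell rather than the simple tanh unit used here.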
Methods
  • We compare our approach with the following baseline methods.

    (1) MaxEnt, a baseline feature-based method, which trains a Maximum Entropy classifier with lexical and syntactic features (Ji et al., 2008).

    (2) Cross-Event (Liao et al, 2010), using document-level information to improve the performance of ACE event extraction.

    (3) Cross-Entity (Hong et al, 2011), extracting events using cross-entity inference.

    (4) Joint Model (Li and Ji, 2014), a joint structured perceptron approach that incorporates multi-level linguistic features to extract event triggers and arguments at the same time, so that local predictions can be mutually improved.
Results
  • On the English event detection task, our approach achieved a 73.4% F-score, an average absolute improvement of 3.0% over the state of the art.
  • The HNN approach performed better than LSTM and Bi-LSTM.
  • This indicates that our proposed model achieves the best performance in multiple languages among neural network methods
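The F-score reported above is the harmonic mean of trigger-classification precision and recall. A minimal illustration follows; the counts are hypothetical and are not taken from the paper:

```python
def f_score(precision, recall):
    """F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 80 predicted triggers, 68 of them correct, 100 gold triggers.
precision = 68 / 80   # 0.85
recall = 68 / 100     # 0.68
print(round(100 * f_score(precision, recall), 1))  # prints 75.6
```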
Conclusion
  • We introduce a hybrid neural network model, which incorporates both bidirectional LSTMs and convolutional neural networks to capture sequence and structure semantic information from specific contexts, for event detection.
  • Compared with traditional event detection methods, our approach does not rely on any linguistic resources and can be applied to any language.
  • We conduct experiments on three languages (English, Chinese, and Spanish).
  • Empirical results show our approach achieved state-of-the-art performance in English and competitive results in Chinese.
  • We find that the bi-directional LSTM is powerful for trigger extraction, capturing long-distance preceding and following contexts
Tables
  • Table1: Hyperparameters and # of documents used in our experiments on three languages
  • Table2: Comparison of different methods on English event detection
  • Table3: Results on Chinese event detection
  • Table4: Results on Spanish event detection
Related work
  • Event detection is a fundamental problem in information extraction and natural language processing (Li et al., 2013; Chen et al., 2015), which aims at detecting the event trigger of a sentence (Ji et al., 2008). The majority of existing methods treat this problem as a classification task and use machine learning with hand-crafted features, such as lexical features (e.g., full word, POS tag), syntactic features (e.g., dependency features), and external knowledge features (e.g., WordNet). There also exist studies leveraging richer evidence such as cross-document inference (Ji et al., 2008), cross-entity inference (Hong et al., 2011), and joint inference (Li and Ji, 2014).

    Despite the effectiveness of feature-based methods, we argue that manually designing feature templates is typically labor-intensive. Besides, feature engineering requires expert knowledge and rich external resources, which are not always available for low-resource languages. Furthermore, a desirable approach should be able to automatically learn informative representations from data, so that it can be easily adapted to different languages. Recently, neural networks have emerged as a powerful way to learn text representations automatically from data and have obtained promising performance in a variety of NLP tasks.
Funding
  • RPI co-authors were supported by the U.S. DARPA LORELEI Program No. HR0011-15-C0115, DARPA DEFT Program No. FA8750-13-20041, and NSF CAREER Award IIS-1523198
Reference
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  • Chen Chen and Vincent Ng. 2012. Joint modeling for Chinese event extraction with rich linguistic features. In COLING.
  • Yubo Chen, Liheng Xu, Kang Liu, Daojian Zeng, and Jun Zhao. 2015. Event extraction via dynamic multi-pooling convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, volume 1, pages 167–176.
  • Yu Hong, Jianfeng Zhang, Bin Ma, Jianmin Yao, Guodong Zhou, and Qiaoming Zhu. 2011. Using cross-entity inference to improve event extraction. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 1127–1136. Association for Computational Linguistics.
  • Heng Ji and Ralph Grishman. 2008. Refining event extraction through cross-document inference. In ACL, pages 254–262.
  • Yann LeCun and Yoshua Bengio. 1995. Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks, 3361(10).
  • Qi Li and Heng Ji. 2014. Incremental joint extraction of entity mentions and relations. In Proceedings of the Association for Computational Linguistics.
  • Qi Li, Heng Ji, and Liang Huang. 2013. Joint event extraction via structured prediction with global features. In ACL (1), pages 73–82.
  • Jiwei Li, Dan Jurafsky, and Eduard Hovy. 2015a.
  • Jiwei Li, Minh-Thang Luong, and Dan Jurafsky. 2015b. A hierarchical neural autoencoder for paragraphs and documents. arXiv preprint arXiv:1506.01057.
  • Shasha Liao and Ralph Grishman. 2010. Using document level cross-event inference to improve event extraction. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 789–797. Association for Computational Linguistics.
  • Ting Liu, Wanxiang Che, and Zhenghua Li. 2011. Language technology platform. Journal of Chinese Information Processing, 25(6):53–62.
  • Yang Liu, Furu Wei, Sujian Li, Heng Ji, Ming Zhou, and Houfeng Wang. 2015. A dependency-based neural network for relation classification. arXiv preprint arXiv:1507.04646.
  • Fan Miao and Ralph Grishman. 2015. Improving event detection with active learning. In EMNLP.
  • Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan Cernocky, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In INTERSPEECH, volume 2, page 3.
  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119.
  • Thien Huu Nguyen and Ralph Grishman. 2015. Event detection and domain adaptation with convolutional neural networks. Volume 2: Short Papers, page 365.
  • Mike Schuster and Kuldip K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11):2673–2681.
  • Hristo Tanev, Vanni Zavarella, Jens Linge, Mijail Kabadjov, Jakub Piskorski, Martin Atkinson, and Ralf Steinberger. 2009. Exploiting machine learning techniques to build an event extraction system for Portuguese and Spanish. Linguamatica, 1(2):55–66.
  • Duyu Tang, Bing Qin, and Ting Liu. 2015a. Document modeling with gated recurrent neural network for sentiment classification. EMNLP.
  • Duyu Tang, Bing Qin, and Ting Liu. 2015b. Document modeling with gated recurrent neural network for sentiment classification. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1422–1432.
  • Matthew D. Zeiler. 2012. ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701.
  • Daojian Zeng, Kang Liu, Siwei Lai, Guangyou Zhou, Jun Zhao, et al. 2014. Relation classification via convolutional deep neural network. In COLING, pages 2335–2344.