BiLSTM-based Implicit Discourse Relation Classification Combining Self-attention Mechanism and Syntactic Information

Computer Science, 2019, Vol. 46, Issue (5): 214-220. doi: 10.11896/j.issn.1002-137X.2019.05.033


FAN Zi-wei, ZHANG Min, LI Zheng-hua   

  1. (School of Computer Sciences and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Published: 2019-05-15

Abstract: Implicit discourse relation classification is a sub-task of shallow discourse parsing and an important task in natural language processing (NLP). An implicit discourse relation is a logical semantic relation inferred from the argument pairs in discourse relations. The results of implicit discourse relation analysis can be applied to many NLP tasks, such as machine translation, automatic document summarization, and question answering. This paper proposes a method for implicit discourse relation classification based on a self-attention mechanism and syntactic information. In this method, a Bidirectional Long Short-Term Memory network (BiLSTM) models the input argument pairs together with their syntactic information and represents the argument pairs as low-dimensional dense vectors. The self-attention mechanism then screens the argument pair information. Experiments on the PDTB 2.0 dataset show that the proposed model outperforms the baseline system.

Key words: Implicit discourse relation classification, Neural network, Self-attention mechanism, Syntactic information

CLC Number: 

  • TP391
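
The self-attention screening step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration: additive attention scores of the form v · tanh(W h), the hypothetical names `self_attention_pool` and `encode_pair`, the concatenation of the two pooled argument vectors, and the toy numbers standing in for BiLSTM hidden states. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(hidden_states, W, v):
    """Weight each hidden state by an attention score and sum them.

    hidden_states: list of T vectors of size d (stand-ins for BiLSTM outputs)
    W: k x d projection matrix (assumed parameter)
    v: vector of size k (assumed parameter)
    Returns (pooled vector of size d, attention weights of length T).
    """
    scores = []
    for h in hidden_states:
        projected = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W]
        scores.append(sum(a * p for a, p in zip(v, projected)))
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, hidden_states))
              for j in range(dim)]
    return pooled, weights


def encode_pair(arg1_states, arg2_states, W, v):
    """Assumed pairing scheme: pool each argument, then concatenate."""
    pooled1, _ = self_attention_pool(arg1_states, W, v)
    pooled2, _ = self_attention_pool(arg2_states, W, v)
    return pooled1 + pooled2


# Toy hidden states for the two arguments (d = 2), not real model outputs.
arg1 = [[0.1, 0.3], [0.9, -0.2], [0.4, 0.4]]
arg2 = [[-0.5, 0.2], [0.7, 0.1]]
W = [[0.5, -0.5], [0.3, 0.8]]
v = [1.0, -1.0]

pair_vector = encode_pair(arg1, arg2, W, v)
print(len(pair_vector))  # dimension of the concatenated pair representation
```

In a full model the pair vector would feed a classifier over the relation labels, and `W` and `v` would be learned jointly with the BiLSTM rather than fixed by hand.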