| Graduate student | 黃劭崴 Huang, Shao-Wei |
|---|---|
| Thesis title | 於有限標示資料下可擴展關係擷取之學習策略 (Training Strategies for Extendable Relation Extraction under Limited Labeled Data) |
| Advisor | 柯佳伶 Koh, Jia-Ling |
| Degree | Master |
| Department | Department of Computer Science and Information Engineering |
| Publication year | 2020 |
| Academic year of graduation | 108 |
| Language | Chinese |
| Pages | 69 |
| Keywords | relation extraction, natural language processing, transfer learning |
| DOI URL | http://doi.org/10.6345/NTNU202001148 |
| Thesis type | Academic thesis |
| Usage | Views: 119, Downloads: 0 |
This thesis uses the text of junior-high-school biology textbooks as its corpus and studies how to perform relation extraction from Chinese text with limited labeled data. The problem is divided into two tasks: triple detection, and semantic-relation clustering and classification. For triple detection, we propose a relation-term tagging model combined with a sentence-type classification model, together with the learning strategies of transfer learning, sentence-type-specific fine-tuning, and conditional random field (CRF) prediction, to output the candidate triples contained in a sentence. For semantic-relation clustering and classification, we propose a two-phase clustering algorithm that discovers groups of triples with similar semantic relations, combined with a semi-supervised learning strategy that assigns a relation type to each group, thereby achieving extendable relation extraction. Experiments show that adding the proposed components and learning strategies to a BERT model improves its tag-prediction performance, and that the proposed two-phase clustering algorithm yields groups of triples with higher relation-type purity than traditional clustering algorithms. Finally, combining the methods for the two tasks, and with the aid of source-domain data labeled with general relation-term tags, the proposed approach needs only a very small number of target-domain triples labeled with relation types to reach an accuracy of about 66%, higher than that of supervised relation-extraction methods requiring large amounts of labeled data.
In this paper, we study how to train a model for relation extraction from limited labeled data. We decompose the problem into two sub-tasks: 1) triple detection and 2) triple clustering and classification. For triple detection, we propose a tagging model and a sentence classification model, combining the strategies of transfer learning, an ensemble classifier for different sentence types, and a CRF layer to extract the triples in a sentence. For the extracted triples, we propose a two-phase clustering algorithm that discovers groups of triples with semantically similar relation terms. The discovered groups are then assigned to the corresponding relation types by a modified KNN algorithm using a small set of labeled data. Accordingly, the proposed semi-supervised learning strategy achieves extendable relation extraction. Experimental results show that when the BERT model is combined with the CRF layer and the various training strategies, the base model achieves better tag prediction. In addition, the proposed two-phase clustering algorithm obtains higher relation-type purity in the discovered groups of triples than traditional clustering algorithms. Finally, the proposed method needs only a very small number of labeled triples with specified relation types in the target domain to achieve an accuracy of about 66%, outperforming a supervised learning approach that requires a much larger set of labeled triples.
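To illustrate the role of the CRF layer in the tagging task, the sketch below shows Viterbi decoding over per-token tag scores: instead of picking the best tag for each token independently, the CRF selects the globally best tag sequence under learned transition weights, which lets it suppress invalid label patterns (e.g. an `I` tag following `O` in a BIO scheme). The scores and transition weights here are toy values, not from the thesis; in the actual system they would come from the fine-tuned BERT tagging model.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence for one sentence.

    emissions:   list of {tag: score} dicts, one per token
    transitions: {(prev_tag, tag): score} dict of transition weights
    """
    n = len(emissions)
    # best[i][t] = score of the best path ending at position i with tag t
    best = [dict() for _ in range(n)]
    back = [dict() for _ in range(n)]
    for t in tags:
        best[0][t] = emissions[0][t]
    for i in range(1, n):
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + transitions[(p, t)]) for p in tags),
                key=lambda x: x[1],
            )
            best[i][t] = score + emissions[i][t]
            back[i][t] = prev
    # backtrack from the best final tag
    last = max(tags, key=lambda t: best[n - 1][t])
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))


tags = ["O", "B", "I"]
transitions = {(a, b): 0.0 for a in tags for b in tags}
transitions[("O", "I")] = -10.0  # an I tag may not directly follow O

emissions = [
    {"O": 2.0, "B": 0.5, "I": 0.1},
    {"O": 0.1, "B": 0.2, "I": 1.5},  # greedy choice would be the invalid I
    {"O": 1.0, "B": 0.1, "I": 0.2},
]
print(viterbi_decode(emissions, transitions, tags))  # ['O', 'B', 'O']
```

Note how the second token's greedy best tag (`I`) is rejected because the `O -> I` transition is penalized, so the decoder falls back to a valid `B` tag.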
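The clustering-and-classification task can be sketched in the same spirit: a first phase greedily groups relation-term vectors under a tight similarity threshold, a second phase merges clusters whose centroids remain similar, and each resulting cluster is then assigned the relation type of its nearest labeled triples. The thresholds, helper names, and the exact grouping criteria below are illustrative assumptions; the thesis's actual two-phase algorithm and modified KNN are not specified in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def centroid(indices, vectors):
    dim = len(vectors[0])
    return [sum(vectors[j][d] for j in indices) / len(indices) for d in range(dim)]


def two_phase_cluster(vectors, tight=0.9, loose=0.7):
    """Phase 1: greedily group vectors under a tight similarity threshold.
    Phase 2: merge clusters whose centroids are still loosely similar."""
    clusters = []  # each cluster is a list of vector indices
    for i, v in enumerate(vectors):
        for c in clusters:
            if all(cosine(v, vectors[j]) >= tight for j in c):
                c.append(i)
                break
        else:
            clusters.append([i])
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if cosine(centroid(clusters[a], vectors),
                          centroid(clusters[b], vectors)) >= loose:
                    clusters[a].extend(clusters.pop(b))
                    merged = True
                    break
            if merged:
                break
    return clusters


def assign_labels(clusters, vectors, labeled, k=1):
    """Give each cluster the majority label of the k labeled examples
    closest to its centroid (a stand-in for the modified KNN step)."""
    result = {}
    for ci, c in enumerate(clusters):
        cen = centroid(c, vectors)
        nearest = sorted(labeled, key=lambda lv: -cosine(cen, lv[0]))[:k]
        votes = {}
        for _, label in nearest:
            votes[label] = votes.get(label, 0) + 1
        result[ci] = max(votes, key=votes.get)
    return result


# Toy relation-term vectors: two clearly separated semantic groups.
vectors = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
clusters = two_phase_cluster(vectors)          # [[0, 1], [2, 3]]
labeled = [([1.0, 0.0], "part-of"), ([0.0, 1.0], "has-function")]
print(assign_labels(clusters, vectors, labeled))
```

Only the two labeled examples are needed to name both clusters, which mirrors how the semi-supervised strategy extends relation types from a tiny labeled set.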