Graduate student: | 曾琬婷 Tseng, Wan-Ting |
---|---|
Thesis title: | 結合圖與上下文語言模型技術於常見問答檢索之研究 A Study on the Combination of Graphs and Contextualized Language Models for FAQ Retrieval |
Advisor: | 陳柏琳 Chen, Berlin |
Oral defense committee: | 洪志偉 Hung, Jeih-Weih; 林伯慎 Lin, Bor-Shen; 陳冠宇 Chen, Kuan-Yu |
Oral defense date: | 2021/07/30 |
Degree: | Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Year of publication: | 2021 |
Academic year of graduation: | 109 |
Language: | Chinese |
Number of pages: | 51 |
Keywords (Chinese): | 常見問答集檢索, 知識圖譜, 自然語言處理, 資訊檢索, 深度學習, 圖卷積神經網路 |
Keywords (English): | Frequently Asked Question, Knowledge Graph, Natural Language Processing, Information Retrieval, Deep Learning, Graph Convolutional Network |
Research methods: | Experimental design, comparative study |
DOI URL: | http://doi.org/10.6345/NTNU202101175 |
Thesis type: | Academic thesis |
Usage counts: | Views: 255, Downloads: 54 |
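In recent years, deep learning techniques have made breakthrough progress and have delivered impressive performance in many natural language processing applications. As large volumes of information spread rapidly, accessing information more effectively remains an important problem, and FAQ (Frequently Asked Question) retrieval has become one of the key technologies for addressing it.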
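FAQ retrieval is widely used in many domains, including e-commerce services and online forums; its goal is to return the most suitable answer to a user's query (question). Several FAQ retrieval strategies have been proposed to date, such as comparing the similarity between the user query and the standard questions, scoring the relevance between the user query and the answers associated with the standard questions, or classifying the user query. Many recent contextualized deep neural language models have been used to implement these strategies, for example BERT (Bidirectional Encoder Representations from Transformers) and its extensions such as K-BERT and Sentence-BERT. Although BERT and its extensions achieve good results on FAQ retrieval, there is still room for improvement on FAQ tasks that require general-domain knowledge.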
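This thesis therefore proceeds in five stages. First, we examine three FAQ retrieval strategies and compare how combinations of strategies and methods perform on the FAQ retrieval task. Second, we discuss how additional information such as knowledge graphs can strengthen BERT for FAQ retrieval, and we propose an unsupervised knowledge graph injection method to improve the model. Third, we combine supervised and unsupervised methods to address the degraded model performance caused by the variety of answer types in FAQ retrieval. Fourth, we rerank results with a voting mechanism to further improve the model. Finally, we combine a graph convolutional network (GCN) with a contextualized language model (BERT), building a heterogeneous graph so that the model can take the relations among queries (questions) into account. We conduct a series of experiments on the Chinese Taipei City Government question-answering corpus (TaipeiQA) and also investigate data augmentation methods. The experimental results show that the proposed methods improve general FAQ retrieval applications to some extent.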
Recent years have witnessed significant progress in deep learning techniques, which have achieved state-of-the-art performance in a wide variety of natural language processing (NLP) applications. With the rapid spread of tremendous amounts of information, how to access that content effectively has become an essential research issue. In this context, the FAQ (Frequently Asked Question) retrieval task has become one of the important technologies.
FAQ retrieval, which aims to provide relevant information in response to frequent questions or concerns, has far-reaching applications in e-commerce services, online forums, and many other domains. In the common setting of the FAQ retrieval task, a collection of question-answer (Q-A) pairs compiled in advance is used to retrieve an appropriate answer to a user's query that is likely to recur frequently. To date, many strategies have been proposed for FAQ retrieval, ranging from comparing the similarity between the query and a question, to scoring the relevance between the query and the answer associated with a question, to performing classification on user queries. A variety of contextualized language models have been extended and developed to operationalize these strategies, such as BERT (Bidirectional Encoder Representations from Transformers), K-BERT, and Sentence-BERT. Although BERT and its variants have demonstrated reasonably good results on various FAQ retrieval tasks, they still fall short on tasks that require generic knowledge.
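The first strategy named above, query-question similarity, can be illustrated with a minimal sketch. It assumes whitespace tokenization and plain bag-of-words cosine similarity rather than the BERT-based models studied in the thesis; the function names and the toy Q-A pairs are hypothetical and do not come from TaipeiQA.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_answer(query: str, faq_pairs: list[tuple[str, str]]) -> tuple[float, str]:
    """Return (score, answer) for the stored question most similar to the query."""
    # Assumption: lowercase whitespace tokens stand in for a real tokenizer.
    q_vec = Counter(query.lower().split())
    scored = [
        (cosine_similarity(q_vec, Counter(question.lower().split())), answer)
        for question, answer in faq_pairs
    ]
    return max(scored, key=lambda item: item[0])


# Hypothetical toy Q-A collection, for illustration only.
FAQ_PAIRS = [
    ("how to renew a library card", "Renew at any branch counter."),
    ("where to pay a parking fine", "Pay online or at a convenience store."),
]

score, answer = retrieve_answer("renew library card", FAQ_PAIRS)
print(f"{score:.3f} {answer}")
```

In a contextualized-model variant of the same strategy, the term-count vectors would be replaced by sentence embeddings, while the ranking step would stay the same.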
In view of this, this thesis is organized into five major stages. First, we discuss three different FAQ retrieval strategies and compare the synergistic effects of combining different strategies and methods. Second, we explore the utility of injecting an extra knowledge base into BERT for FAQ retrieval and propose an unsupervised knowledge graph injection method for the model. Third, we present an effective hybrid approach for FAQ retrieval that explores the synergistic effect of combining an unsupervised IR method with supervised contextualized language models. Fourth, we propose an effective voting mechanism that reranks answer hypotheses for better performance. Finally, we construct a heterogeneous graph and combine a graph convolutional network (GCN) with a contextualized language model (BERT) to take into account the global question-question, question-word, and word-word relations, which can be used to augment the embeddings derived from BERT for better FAQ retrieval. We conduct extensive experiments to evaluate the utility of the proposed approaches on a publicly available FAQ dataset (viz. TaipeiQA), and the results confirm the promising efficacy of the proposed approaches in comparison with some state-of-the-art ones.
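The abstract does not state which voting rule is used for reranking. As one possible reading, the sketch below merges the ranked answer lists of several retrievers with a Borda count. Borda count is an assumption chosen for illustration, not the thesis's confirmed method, and the answer identifiers and the three example rankers are hypothetical.

```python
from collections import defaultdict


def borda_rerank(rankings: list[list[str]]) -> list[str]:
    """Merge several ranked candidate lists into one list by Borda count.

    Each ranker awards n points to its top candidate, n - 1 to the next,
    and so on, where n is the length of that ranker's list.
    """
    scores: dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - position
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical outputs of three rankers (for example an unsupervised IR
# scorer and two supervised models) over the same answer hypotheses.
rankings = [
    ["A1", "A2", "A3"],
    ["A2", "A1", "A3"],
    ["A2", "A3", "A1"],
]
print(borda_rerank(rankings))
```

A position-based vote like this needs only each ranker's ordering, so score scales from unsupervised and supervised models do not have to be calibrated against each other.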