
Author: 王思涵 (Wang, Szu-Han)
Thesis title: 針對問答社群中的事實問題句自動產生答案摘要之研究
Automatic Answer Generation for Factual Questions on Community Question Answering
Advisor: 柯佳伶 (Koh, Jia-Ling)
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2015
Academic year of graduation: 103
Language: Chinese
Number of pages: 84
Chinese keywords: 問題句分類, 問題句關鍵字擷取, 自動產生問題句答案
English keywords: question classification, question keywords extraction, automatic question answering
Thesis type: Academic thesis
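  • Abstract (translated from the Chinese): With the growth of Community Question Answering (cQA) platforms, more and more users post questions and wait for others to answer them. However, a large share of the questions on these platforms do not receive timely answers, or are never answered at all. This thesis therefore targets factual questions posted by users in cQA communities: it uses a web search engine and automatically identifies and summarizes the factual information in the returned results, which is then provided to the user as the answer. Submitting the question itself as the query, however, may include irrelevant words, so that the returned results contain too many irrelevant answers. This study therefore investigates how to automatically classify whether a user's question is a factual question, and how to automatically extract the query target term and facet term from a factual question. The extracted query keywords are then combined with a technique for automatically extracting factual content on the important facets of web search results, and the summarized factual information is provided to the user as the answer. Experimental results show that the proposed question classification method classifies questions effectively, and that combining the extracted query keywords with the result summarization method effectively provides factual information for factual questions.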

    With the development of Community Question Answering (cQA), more and more users post questions on these platforms and wait for others to answer. However, not all questions posted there receive informative answers, and many are not answered in a timely manner. Accordingly, this thesis aims to automatically summarize facet information from search results as the answer to factual questions in cQA, so that users can quickly obtain the facet information they need. First, we explore how to automatically classify questions as factual or non-factual. Second, we extract the target term and the facet term from a factual question as the query keywords for a search engine. Finally, we apply a search result summarization technique to obtain factual information from the search results. The summary of the factual information is provided to the user as the answer to the factual question. The experimental results show that the proposed classification method identifies factual questions with high accuracy and high recall. Furthermore, using the query keywords automatically extracted in this study, a factual question can be effectively answered from the facet summarization of web search results.
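    The first two steps described in the abstract (classifying a question as factual, then extracting a target term and a facet term as query keywords) can be illustrated with a minimal Python sketch. This is not the thesis's method: the word lists, the "facet of target" splitting rule, and the function names `is_factual`, `extract_query_keywords`, and `build_query` are all hypothetical stand-ins chosen for illustration, and the search and summarization step is omitted because it depends on an external search engine.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical word lists for illustration only; the abstract does not
# enumerate the thesis's classification features or extraction rules.
WH_WORDS = {"what", "when", "where", "who", "which", "how"}
OPINION_CUES = {"should", "think", "opinion", "recommend", "best", "feel"}
STOPWORDS = WH_WORDS | {
    "is", "are", "was", "were", "the", "a", "an", "of", "in",
    "do", "does", "did", "to", "for", "many", "much",
}


def tokenize(question: str) -> List[str]:
    """Lowercase the question and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())


def is_factual(question: str) -> bool:
    """Toy rule: factual if it starts with a wh-word and has no opinion cue."""
    tokens = tokenize(question)
    if not tokens or any(t in OPINION_CUES for t in tokens):
        return False
    return tokens[0] in WH_WORDS


def content_words(tokens: List[str]) -> List[str]:
    """Drop stopwords, keeping the remaining tokens in order."""
    return [t for t in tokens if t not in STOPWORDS]


def extract_query_keywords(question: str) -> Tuple[str, str]:
    """Return (target term, facet term) using an assumed 'facet of target' rule."""
    tokens = tokenize(question)
    if "of" in tokens:
        split = tokens.index("of")
        facet = content_words(tokens[:split])
        target = content_words(tokens[split + 1:])
    else:
        # No facet can be identified; treat all content words as the target.
        facet, target = [], content_words(tokens)
    return " ".join(target), " ".join(facet)


def build_query(question: str) -> Optional[str]:
    """Build a search query for factual questions; return None otherwise."""
    if not is_factual(question):
        return None
    target, facet = extract_query_keywords(question)
    return f"{target} {facet}".strip()
```

    Under these assumptions, `build_query("What is the population of Taipei?")` returns `"taipei population"`, while `build_query("Should I move to Taipei?")` returns `None` because the question is treated as non-factual. A learned classifier over question features would replace the hand-written rules in any realistic setting.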

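    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Research Objectives
      1.3 Scope and Limitations
      1.4 Methodology
      1.5 Thesis Organization
    Chapter 2 Literature Review
      2.1 Providing Answers to Questions
      2.2 Question Classification
      2.3 Identifying Query Facets
        2.3.1 Query Suggestion
        2.3.2 Query Expansion
      2.4 Factual Information Summarization
    Chapter 3 Question Classification Method
      3.1 Question Preprocessing
      3.2 Feature Extraction
      3.3 Classification Model
        3.3.1 Building Question Feature Vectors
        3.3.2 Training Data Collection
    Chapter 4 Query Keyword Extraction Method
      4.1 Extracting the Query Target Term
        4.1.1 Generating Candidate Target Terms
        4.1.2 Collecting Question-Related Documents
        4.1.3 Scoring Candidate Target Terms
      4.2 Extracting the Query Facet Term
        4.2.1 Generating Candidate Facet Terms
        4.2.2 Facet Feature Extraction
    Chapter 5 Search Result Summarization Method
      5.1 Factual Information Summarization Method
      5.2 Generating Facets of the Target Term
        5.2.1 Generating Candidate Facets of the Target Term
        5.2.2 Selecting Facets of the Target Term
    Chapter 6 Experimental Results and Discussion
      6.1 Evaluation of Question Classification
        6.1.1 Data Sources and Evaluation Method
        6.1.2 Experimental Results
      6.2 Evaluation of Query Keyword Extraction
        6.2.1 Target Term Extraction: Data Sources and Evaluation Method
        6.2.2 Target Term Extraction: Experimental Results
        6.2.3 Facet Term Extraction: Data Sources and Evaluation Method
        6.2.4 Facet Term Extraction: Experimental Results
      6.3 Evaluation of Search Result Summarization
        6.3.1 Data Sources and Evaluation Method
        6.3.2 Experimental Results
    Chapter 7 Conclusions and Future Work
      7.1 Conclusions
      7.2 Future Work
    References
    Appendix 1 Answer Summaries Automatically Generated for Factual Questions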

