Graduate student: | 鄭舜宸 Shun-Chen, Cheng |
---|---|
Thesis title: | 提供網頁搜尋結果篩選之查詢字詞推薦 Two-level Query Suggestion for Specialization on Web Search Results |
Advisor: | 柯佳伶 Koh, Jia-Ling |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2014 |
Academic year of graduation: | 102 |
Language: | Chinese |
Number of pages: | 70 |
Keywords (Chinese): | 查詢字推薦, 階層式推薦, 隨機漫走 |
Keywords (English): | query suggestions, hierarchical suggestions, random walk |
Thesis type: | Academic thesis |
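This study aims to select query suggestion terms from the large volume of results returned by a search engine, so that users can filter the results with these terms and spend less effort browsing them. We propose a two-level query term suggestion method called M_PhRank: the first level offers topic query terms that are conceptually broad, and the second level presents subtopic query terms with more specific meaning. The method has three parts: selecting topic query terms, computing the semantic specificity of individual words, and selecting subtopic query terms. In the first part, the words remaining after preprocessing are selected according to the number of data objects they cover, and the selected topic query terms form the first level of suggestions. The second part builds a graph of the words that appear in neighboring positions and runs a random walk algorithm on this graph to compute how semantically specific each candidate word is within the search results. Finally, given the number of terms to suggest, that budget is allocated in proportion to the coverage of each topic query term to determine how many second-level terms each may suggest; the suggested terms are then selected, completing the hierarchy. Experiments show that M_PhRank covers more objects that are highly relevant to the query than the baseline method does, and that it reduces the growth in redundancy as coverage increases. In addition, a user evaluation shows that the query suggestion structure built by M_PhRank assists querying more effectively.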
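The random walk step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the window size, the damping factor, the PageRank-style update, and the function names `build_adjacency_graph` and `random_walk_scores` are all assumptions made for illustration, and how M_PhRank turns the walk scores into a specificity measure is not stated in the abstract.

```python
from collections import defaultdict


def build_adjacency_graph(snippets, window=2):
    """Count how often two distinct words appear within `window` positions.

    `snippets` is a list of tokenized search-result texts (hypothetical input).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for words in snippets:
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                if v != w:
                    graph[w][v] += 1.0  # undirected, weighted edge
                    graph[v][w] += 1.0
    return graph


def random_walk_scores(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration over the weighted word graph (assumed form)."""
    nodes = sorted(graph)
    n = len(nodes)
    scores = {w: 1.0 / n for w in nodes}
    out_weight = {w: sum(graph[w].values()) for w in nodes}
    for _ in range(iterations):
        updated = {}
        for w in nodes:
            # The graph is symmetric, so the neighbors of w are its in-neighbors.
            incoming = sum(
                scores[v] * graph[v][w] / out_weight[v] for v in graph[w]
            )
            updated[w] = (1.0 - damping) / n + damping * incoming
        scores = updated
    return scores


snippets = [
    ["apple", "iphone", "price"],
    ["apple", "iphone", "review"],
    ["apple", "pie", "recipe"],
]
scores = random_walk_scores(build_adjacency_graph(snippets))
```

In this toy input, the word that neighbors the most other words receives the largest stationary score; a real system would then have to decide how such scores map to semantic specificity.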
The goal of this thesis is to automatically suggest query keywords drawn from the results returned by a search engine, so that users can apply these keywords as specialized queries to filter a large set of search results. A two-level query suggestion method, called M_PhRank, is proposed. The first level aims to provide query terms that cover as many search results as possible, while the query terms in the second level should have clear meaning and little overlap among the objects they cover. First, the coverage of a word over the search results is computed as its novelty score, which is used to select the topic terms for the first level. Second, the semantic scores of words are estimated by running a random walk algorithm on the word co-occurrence graph. Query keywords consisting of 2-3 non-topic terms form the candidate subtopic terms, whose semantic scores are computed from the semantic scores of their constituent words. Given the number of suggestions requested, the number of subtopic terms under each topic term is set in proportion to that topic term's coverage. Finally, the hierarchical query suggestion structure is built from the topic terms on the first level and their corresponding subtopic terms on the second level. Experimental results show that M_PhRank outperforms the baseline method: it provides more semantically specific terms and achieves high coverage with a limited increase in overlap. Moreover, according to a user survey, the hierarchy of query keyword suggestions constructed by M_PhRank receives high satisfaction for query assistance.
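As a rough illustration of the first-level selection and the proportional allocation described above, the sketch below picks topic terms by how many not-yet-covered results each one adds, then splits a suggestion budget across them in proportion to coverage. The greedy marginal-coverage rule, the largest-remainder rounding, and the names `select_topic_terms` and `allocate_subtopic_slots` are assumptions for illustration; the abstract does not specify how M_PhRank computes or rounds these quantities.

```python
def select_topic_terms(term_to_docs, k):
    """Greedily pick up to k terms, each adding the most uncovered results.

    `term_to_docs` maps a candidate word to the set of result ids it covers.
    """
    covered, chosen = set(), []
    candidates = dict(term_to_docs)
    for _ in range(min(k, len(candidates))):
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=lambda t: len(candidates[t] - covered))
        if not candidates[best] - covered:
            break  # no remaining term adds new results
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen


def allocate_subtopic_slots(coverage, total_slots):
    """Split total_slots across topic terms in proportion to their coverage.

    Uses largest-remainder rounding so the slots sum to total_slots.
    """
    total = sum(coverage.values())
    if total == 0:
        return {t: 0 for t in coverage}
    quotas = {t: total_slots * c / total for t, c in coverage.items()}
    slots = {t: int(q) for t, q in quotas.items()}
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(coverage, key=lambda t: (-(quotas[t] - slots[t]), t))
    for t in by_remainder[:leftover]:
        slots[t] += 1
    return slots


term_to_docs = {"iphone": {1, 2, 3, 4}, "pie": {5, 6}, "price": {1, 2}}
topics = select_topic_terms(term_to_docs, k=2)
coverage = {t: len(term_to_docs[t]) for t in topics}
slots = allocate_subtopic_slots(coverage, total_slots=5)
```

With this toy data, `price` is skipped because every result it covers is already covered by `iphone`, and the five second-level slots are divided between the two chosen topic terms according to their coverage.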