
Graduate student: 陳思穎
Chen, Sih-Ying
Thesis title: 自動分群搜尋引擎之使用者評估研究
User-based Study of Automatic Clustering Search Engines
Advisor: 卜小蝶
Pu, Hsiao-Tieh
Degree: Master
Department: Graduate Institute of Library and Information Studies
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Pages: 227
Chinese keywords: 自動分群, 相關排序, 搜尋引擎, 使用者研究
English keywords: automatic clustering, relevance ranking, search engines, user study
Thesis type: Academic thesis
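    With the rapid growth of Internet resources, search engines have become users' most useful tool for retrieving them. However, today's search engines, which rely mainly on relevance ranking, still cannot effectively filter large and heterogeneous result sets and can instead leave users confused. Automatic clustering search engines offer users an alternative: they group results automatically into topical clusters in order to improve retrieval effectiveness.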

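    Although automatic clustering techniques have been studied for many years, user-oriented research is still lacking. This study therefore attempts to evaluate cluster structures and their usability from the user's perspective. It designs a two-stage experiment in which users carry out actual tasks, in both user-defined and researcher-assigned scenarios, so that their use of automatic clustering search engines can be observed. Log analysis is used to record users' search processes, supplemented by questionnaires and interviews to explore users' subjective perceptions in depth. The study employs a self-built experimental platform, search tasks, evaluation questionnaires, an interview outline, and screen-recording software. Research materials are collected according to the research purposes and questions: participants' search behavior characteristics are summarized through observation and interviews; the efficiency and effectiveness of the search engines are analyzed using search logs, interviews, and evaluation questionnaires; and user satisfaction is analyzed through interviews and evaluation questionnaires.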

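    The results show that the greatest differences between using automatic clustering search engines and relevance-ranking search engines lie in the time and effort users spend filtering information and in the occasions on which they use each type of engine. Second, automatic clustering search engines help users most by offering better retrieval effectiveness, narrowing the topical scope of a search, highlighting important concepts, and suggesting multiple directions of thought; they perform best on simple/closed questions. Finally, this study proposes suggestions for improving automatic clustering search engines. These include personalized services, such as offering frequently used cluster categories and combinations according to users' needs and letting users customize the hierarchy levels and the number of results, as well as drawing on manual classification and user feedback, so as to improve clustering quality in a user-oriented way.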

    This study evaluates the structure and usability of automatic clustering search engines from the user's perspective. The rapid growth of Internet resources has made search engines one of the most important tools for finding and accessing them. Major search engines present results by relevance ranking, which may not filter results effectively or efficiently: the sheer volume of results makes it difficult for users to locate precisely what they are seeking. Automatic clustering search engines offer the user another option. They automatically cluster results into topical categories and may thus increase search effectiveness.

    This study designs experiments in which users carry out search tasks with automatic clustering search engines. The tasks include tasks defined by the participants themselves and tasks assigned by the researcher. The study observes participants' behavior while they use the assigned automatic clustering search engines, and it records the search process in logs to analyze the effectiveness of the selected search engines. Finally, it examines user satisfaction through interviews and an evaluation questionnaire administered after the assigned tasks are completed. The research tools include the experimental platform, search tasks, an evaluation questionnaire, an interview outline, and log analysis software.

    The results suggest that the time and effort participants spent using the selected automatic clustering search engines differed markedly from those spent using relevance-ranking search engines, as did the contexts in which participants used them. The results also show that the selected automatic clustering search engines help enhance search effectiveness, narrow the topical scope of a search, highlight key concepts, and suggest diverse directions of thinking. Finally, this study offers suggestions for improving automatic clustering search engines: providing combinations of the clusters that users frequently use, personalized hierarchical clusters, and user-defined numbers of displayed results, and drawing on human-designed classifications and user feedback to enhance clustering quality in a user-oriented way.

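    Acknowledgments
    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        Section 1 Research Background
        Section 2 Research Purposes and Questions
        Section 3 Research Limitations and Scope
        Section 4 Definitions of Terms
    Chapter 2 Literature Review
        Section 1 The Meaning of Classification Schemes
        Section 2 Library and Information Classification Schemes and Web Information Retrieval
        Section 3 Classification Scheme Design and Information Architecture
        Section 4 Automatic Clustering Techniques
        Section 5 Evaluation Studies of Automatic Clustering Search Engines
        Section 6 Overview of Automatic Clustering Search Engines
    Chapter 3 Research Methods and Design
        Section 1 Research Process and Framework
        Section 2 Research Participants
        Section 3 Research Methods and Design
        Section 4 Research Instruments and Data Analysis
        Section 5 Experimental Procedure
    Chapter 4 Results of Analysis
        Section 1 Participant Background Analysis
        Section 2 Search Behavior Characteristics
        Section 3 Search Efficiency Analysis
        Section 4 Search Effectiveness Analysis
        Section 5 Satisfaction Analysis
        Section 6 Question Types and Search Processes
        Section 7 Cluster Analysis
        Section 8 Evaluation and Usage Contexts of Automatic Clustering Search Engines
        Section 9 General Discussion
    Chapter 5 Conclusions and Suggestions
        Section 1 Conclusions
        Section 2 Suggestions for Improvement
        Section 3 Suggestions for Future Research
    References
    Appendix 1 Background Information Questionnaire
    Appendix 2 Assigned Tasks
    Appendix 3 Self-Defined Tasks
    Appendix 4 Definitions of the Assigned Tasks
    Appendix 5 Search Engine Evaluation Form
    Appendix 6 Automatic Clustering Search Engine Evaluation Form
    Appendix 7 Interview Outline
    Appendix 8 Sample Coding of Interview Records
    Appendix 9 Participant Background Data
    Appendix 10 Browsing of Search Results
    Appendix 11 Search Engine Usage Habits
    Appendix 12 Filtering of Search Results
    Appendix 13 Bases for Relevance Judgments
    Appendix 14 Sample Search Process Analysis
    Appendix 15 Description of Search Process Analysis Items
    Appendix 16 Analysis of Interview Transcripts on Search Performance
    Appendix 17 Descriptions of Needs for Self-Defined Tasks
    Appendix 18 Evaluation of Automatic Clustering Search Engines
    Appendix 19 Usage Contexts of Automatic Clustering Search Engines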
