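Graduate student: | 陳昱年 |
---|---|
Thesis title: | 電影評論中情緒詞彙之極性分析 Polarity Analysis of Sentiment Vocabulary in Movie Reviews |
Advisor: | 侯文娟 |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2013 |
Academic year of graduation: | 101 |
Language: | Chinese |
Number of pages: | 54 |
Chinese keywords: | natural language processing, semantic classification, unsupervised learning, Chinese language processing |
Thesis type: | Academic thesis |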
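In current research on sentiment analysis, semi-supervised and unsupervised learning are still at an early stage of development. Because research on supervised sentiment analysis is already comparatively mature, unsupervised sentiment analysis will be a target of future research.
In the past, most work on word polarity classification relied on manual annotation. Manual annotation is accurate, but the time and labor it consumes are a major problem, and words often carry different meanings in specialized domains, which makes manual annotation harder still.
This thesis uses a collection of long Chinese movie reviews from online forums to study the classification of sentiment words in Chinese text. As far as possible, every word that may carry sentiment in the movie domain is assigned one of two classes: "positive polarity" or "negative polarity".
The work is carried out with an unsupervised method that needs no manual intervention. Self-defined syntactic rules are used to find "seed words". The vocabulary is then expanded with "synonyms" and "antonyms" from the dictionary provided by the Ministry of Education. Finally, steps such as "fuzzy matching" are used to classify polarity.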
Semi-supervised and unsupervised learning are both at an early stage of development in sentiment analysis. Since supervised sentiment analysis is already well developed, unsupervised sentiment analysis is an objective for further research.
In the past, polarity classification of vocabulary was mostly done by hand. Manual annotation reaches high accuracy, but it consumes a great deal of time and human effort. Furthermore, words often have different meanings in different professional fields, which makes manual annotation even more difficult.
In this thesis, we use long Chinese movie reviews from online forums to explore polarity classification in Chinese. We classify sentiment words into two categories: "positive polarity" and "negative polarity".
We use an unsupervised method, which requires no manual intervention. First, we propose syntactic rules to identify "seed words". Next, we expand the seed words using the "synonym set" and "antonym set" from the dictionary provided by the Ministry of Education. Finally, we apply "fuzzy match" steps to classify polarity.
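The expansion and matching steps can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `expand_polarity` and `fuzzy_polarity`, the toy word lists, and the choice of shared-character overlap as the "fuzzy match" are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import deque


def expand_polarity(seeds, synonyms, antonyms):
    """Propagate polarity labels outward from seed words (breadth-first).

    seeds:    dict mapping word -> +1 (positive) or -1 (negative)
    synonyms: dict mapping word -> list of synonyms (same polarity)
    antonyms: dict mapping word -> list of antonyms (opposite polarity)
    """
    polarity = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        neighbours = [(s, 1) for s in synonyms.get(word, ())]
        neighbours += [(a, -1) for a in antonyms.get(word, ())]
        for neighbour, sign in neighbours:
            if neighbour not in polarity:  # the first label assigned wins
                polarity[neighbour] = polarity[word] * sign
                queue.append(neighbour)
    return polarity


def fuzzy_polarity(word, polarity, min_overlap=1):
    """Guess the polarity of an unlabeled word from labeled words that
    share at least `min_overlap` characters with it (assumed heuristic).

    Returns +1, -1, or 0 when the evidence is absent or balanced.
    """
    if word in polarity:
        return polarity[word]
    score = 0
    for known, label in polarity.items():
        if len(set(word) & set(known)) >= min_overlap:
            score += label
    return (score > 0) - (score < 0)


# Toy data, hypothetical and not taken from the thesis or the dictionary.
seeds = {"精彩": 1, "無聊": -1}
synonyms = {"精彩": ["出色"], "無聊": ["乏味"]}
antonyms = {"出色": ["平庸"]}

lexicon = expand_polarity(seeds, synonyms, antonyms)
guess = fuzzy_polarity("無趣", lexicon)
```

Here `lexicon` labels the synonym 出色 as positive and its antonym 平庸 as negative, and the unlabeled word 無趣 is guessed to be negative because it shares a character with 無聊. A real system would need a stricter matching criterion, since single-character overlap is easily misled.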