Author: | 吳宣蓉 WU, Hsuan-Jung |
---|---|
Thesis Title: | 面向情緒分析及文件產生之探討:以遊戲評論為例 A study on aspect-based sentiment analysis and text generation: using game reviews |
Advisor: | 侯文娟 Hou, Wen-Juan |
Committee: | 方瓊瑤 Fang, Chiung-Yao; 侯文娟 Hou, Wen-Juan; 郭俊桔 Kuo, June-Jie |
Approval Date: | 2021/10/20 |
Degree: | 碩士 Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Thesis Publication Year: | 2021 |
Academic Year: | 109 |
Language: | Chinese |
Number of pages: | 49 |
Keywords (in Chinese): | 文件產生 、面向情緒分析 、遊戲評論 、GPT-2 |
Keywords (in English): | text generation, aspect-based sentiment analysis, game reviews, GPT-2 |
Research Methods: | secondary data analysis, thematic analysis, comparative research, document analysis, discourse analysis, content analysis |
DOI URL: | http://doi.org/10.6345/NTNU202101739 |
Thesis Type: | Academic thesis/dissertation |
在機器學習的廣泛運用之前,最常在網路上看見使用文件產生的機制通常為隨機更改一些關鍵字,並產生出一段文字,例如時常在年輕人之間流行的輸入名字就會產生各式各樣結果的測驗。
在以機器學習進行文件產生的研究中,隨著人們在網路上進行聊天、提問、評論、投稿、甚至出版,這些都成為文件產生很好的訓練資料來源。也有許多投入產生更好的文件產生模型的研究,讓產生出來的文件不是只有一段文字,而是有意義的、能讓人讀懂並且邏輯通順的文章。
本文藉由面向情緒和GPT-2模型對遊戲評論進行文件產生的深入探討。不只用評分產生文件,還利用關鍵字抽取面向特徵,再以面向特徵的不同表示方法進行實驗,加以觀察哪一種表示方法,產生出來的文件最能維持遊戲評論的情緒跟面向的情緒。效能的評估方式以正規化方均根差(Normalized root mean square error)作比較。
Before machine learning came into wide use, the most common form of text generation on the Internet was to randomly change a few keywords and produce a short piece of text. One example is the quiz, popular among young people, in which entering a name produces a variety of results.
In machine-learning-based text generation, the chats, questions, comments, submissions, and even publications that people post on the Internet have become good sources of training data. Much research has also gone into building better text generation models, so that the generated text is not just a string of words but a meaningful, readable, and logically coherent article.
This thesis uses aspect-based sentiment and the GPT-2 model to study text generation for game reviews. Text is generated not only from user scores: keywords are also used to extract aspect features, and experiments compare different representations of those features to observe which one yields generated text that best preserves both the overall sentiment of the game review and the sentiment of each aspect. Performance is evaluated using normalized root mean square error (NRMSE).
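The evaluation metric named above can be illustrated with a minimal sketch. The abstract does not say how the error is normalized, so this example assumes one common convention: RMSE divided by the range (max minus min) of the reference values. The function name `nrmse` and the sample scores are hypothetical and are not taken from the thesis.

```python
import math


def nrmse(reference, predicted):
    """Return RMSE divided by the range of the reference values.

    Normalizing by range is an assumption; dividing by the mean or the
    standard deviation are other conventions in use.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean of squared differences between paired values.
    mse = sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)

    value_range = max(reference) - min(reference)
    if value_range == 0:
        raise ValueError("reference values have zero range; cannot normalize")

    return math.sqrt(mse) / value_range


# Hypothetical sentiment scores: original reviews vs. generated reviews.
original_scores = [0.0, 10.0]
generated_scores = [1.0, 9.0]
score = nrmse(original_scores, generated_scores)
```

Under this convention a lower value means the sentiment of the generated text is closer to that of the original reviews, and a value of 0 means the two sets of scores match exactly.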