Author: | 曾相利 Indra Pramana |
---|---|
Thesis Title: | Sentiment Analysis of Movie Reviews with Deep Learning Methods |
Advisor: | 侯文娟 Hou, Wen-Juan |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Thesis Publication Year: | 2019 |
Academic Year: | 107 |
Language: | English |
Number of pages: | 50 |
Keywords (in Chinese): | movie review 、sentiment analysis 、CNN 、LSTM 、BLSTM 、word embedding 、natural language processing 、deep learning 、neural network |
Keywords (in English): | movie review, sentiment analysis, CNN, LSTM, BLSTM, word embedding, natural language processing, deep learning, neural network |
DOI URL: | http://doi.org/10.6345/THE.NTNU.DCSIE.010.2019.B02 |
Thesis Type: | Academic thesis/ dissertation |
Reference times: | Clicks: 178 Downloads: 34 |
Sentiment analysis is one of the most popular and important research fields in natural language processing (NLP). This thesis proposes a deep learning neural network for polarity sentiment analysis of movie reviews. Data preparation is the foundation for building the sentiment analysis model, and NLP techniques are useful in this phase.
The data are preprocessed in this work. We focus on measuring semantic similarity between words: the system learns word embeddings from the data, and these embeddings are fed to the neural network to build a sentiment classification model that predicts whether a movie review expresses a positive or a negative opinion.
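As a minimal illustration of what "semantic similarity between words" can mean once words are represented as embedding vectors, the sketch below compares vectors by cosine similarity. The three-dimensional vectors, the `toy_embeddings` dictionary, and the helper names are assumptions made up for this example; the abstract does not state which similarity measure or embedding dimensions the thesis uses.

```python
import math

# Hypothetical 3-dimensional word vectors, hand-picked for illustration only.
# Real embeddings would be learned from the review corpus and be much larger.
toy_embeddings = {
    "good": [0.9, 0.1, 0.3],
    "great": [0.8, 0.2, 0.4],
    "awful": [-0.7, 0.6, 0.1],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other vocabulary word whose vector is closest to `word`."""
    target = embeddings[word]
    candidates = [w for w in embeddings if w != word]
    return max(candidates, key=lambda w: cosine_similarity(target, embeddings[w]))
```

With these toy vectors, `most_similar("good", toy_embeddings)` picks "great", while the similarity between "good" and "awful" is negative, which is the kind of geometry a sentiment classifier can exploit.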
Our experiment builds five neural network models and compares them to find the best result. Long Short-Term Memory (LSTM) is used because its memory cell can retain long-term dependencies between words and carry previous information forward to the current input. Bidirectional LSTM (BLSTM) is also used, since it can carry information from both the past and the future. A Convolutional Neural Network (CNN) is evaluated as well. We compare five networks: a single LSTM, BLSTM, CNN-LSTM, CNN-BLSTM, and CNN. The models reach high accuracy, and the BLSTM network achieves the best performance, with an accuracy of 89.39% and an F1 score of 89.99%.
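The difference between an LSTM and a BLSTM can be sketched with a deliberately simplified cell. The code below is not the thesis implementation: it assumes scalar inputs and a scalar hidden state (a real model would use embedding vectors and learned weight matrices), and `TOY_WEIGHTS` and all function names are hypothetical values chosen only to make the example runnable.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; `w` maps gate name to (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value for the cell state
    c = f * c_prev + i * g            # memory cell carries earlier information
    h = o * math.tanh(c)
    return h, c


def run_lstm(xs, w):
    """Run the cell left to right and return the hidden state at each step."""
    h, c = 0.0, 0.0
    hidden_states = []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        hidden_states.append(h)
    return hidden_states


def run_blstm(xs, w_fwd, w_bwd):
    """Pair a forward pass with a backward pass over the reversed sequence."""
    forward = run_lstm(xs, w_fwd)
    backward = run_lstm(xs[::-1], w_bwd)[::-1]
    return list(zip(forward, backward))


# Hypothetical fixed weights; a trained network would learn these.
TOY_WEIGHTS = {
    "forget": (0.5, 0.5, 1.0),
    "input": (0.6, 0.4, 0.0),
    "output": (0.7, 0.3, 0.0),
    "candidate": (0.9, 0.2, 0.0),
}
```

Calling `run_blstm([1.0, -0.5, 0.3], TOY_WEIGHTS, TOY_WEIGHTS)` returns one (forward, backward) pair per position. Changing only the last input leaves the forward state at the first position untouched but changes its backward state, which is the sense in which a BLSTM sees both past and future context.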