
Graduate Student: 曾相利 (Indra Pramana)
Thesis Title: Sentiment Analysis of Movie Reviews with Deep Learning Methods
Advisor: 侯文娟 (Hou, Wen-Juan)
Degree: Master
Department: Department of Computer Science and Information Engineering
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 50
Keywords: movie review, sentiment analysis, CNN, LSTM, BLSTM, word embedding, natural language processing, deep learning, neural network
DOI URL: http://doi.org/10.6345/THE.NTNU.DCSIE.010.2019.B02
Thesis Type: Academic thesis
Access Counts: Views: 238; Downloads: 34
Sentiment analysis is one of the most popular and important research fields in natural language processing (NLP). The purpose of this thesis is to propose a deep learning neural network for polarity sentiment analysis of movie reviews. Data preparation is the foundation for building the sentiment analysis model, and NLP techniques are useful in this phase.
Preprocessing of the data has been implemented in this work. In this study, we focus on measuring semantic similarity between words: the system learns word embeddings from the data and uses them to fit a neural network, creating a sentiment analysis classification model for movie reviews that predicts whether a document expresses a positive or a negative opinion.
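
As a rough illustration of this kind of preprocessing, the minimal Python sketch below (assuming the legacy Keras text utilities; the vocabulary size, sequence length, and toy reviews are placeholder values, not the thesis's actual configuration) tokenizes the reviews, maps words to integer indices, and zero-pads each sequence to a fixed length so it can be fed to an embedding layer.

    # Minimal preprocessing sketch (legacy Keras text utilities assumed; the values
    # below are placeholders, not the settings reported in the thesis).
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    MAX_WORDS = 10000   # keep only the most frequent words (placeholder)
    MAX_LEN = 500       # fixed review length after zero padding (placeholder)

    texts = ["A wonderful and moving film.",    # toy positive review
             "Dull plot and wooden acting."]    # toy negative review
    labels = [1, 0]                             # 1 = positive, 0 = negative

    tokenizer = Tokenizer(num_words=MAX_WORDS)
    tokenizer.fit_on_texts(texts)                       # build the word-to-index vocabulary
    sequences = tokenizer.texts_to_sequences(texts)     # words -> integer ids
    x_train = pad_sequences(sequences, maxlen=MAX_LEN)  # zero padding to a fixed length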
Our experiment creates five neural network models and compares them to achieve a better result. Long Short-Term Memory (LSTM) is used because its memory cell can memorize long sequences of words and carry previous information to the current input. Furthermore, Bidirectional LSTM (BLSTM), which can carry information from both the past and the future, is used. In addition, a Convolutional Neural Network (CNN) is also examined in this study. We compare the single LSTM, BLSTM, CNN-LSTM, CNN-BLSTM, and CNN networks. Finally, we successfully achieve high accuracy in this study: the BLSTM network achieves the best performance, with an accuracy of 89.39% and an F1 score of 89.99%.
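
For concreteness, the sketch below shows what a bidirectional LSTM classifier of the kind compared above could look like in Keras; the layer sizes, dropout rate, and optimizer are illustrative assumptions rather than the configuration reported in the thesis.

    # Illustrative BLSTM sentiment classifier (assumed architecture; all
    # hyperparameters are placeholders, not the thesis's reported settings).
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

    MAX_WORDS = 10000   # vocabulary size, as in the preprocessing sketch

    model = Sequential([
        Embedding(input_dim=MAX_WORDS, output_dim=100),  # learn 100-dimensional word embeddings
        Bidirectional(LSTM(64)),                         # read each review forwards and backwards
        Dropout(0.5),                                    # regularization against overfitting
        Dense(1, activation="sigmoid"),                  # probability that the review is positive
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    # model.fit(x_train, labels, validation_split=0.2, epochs=5, batch_size=64)

The single-LSTM, CNN-LSTM, CNN-BLSTM, and plain CNN variants differ only in the layers placed between the embedding and the output layer.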

Abstract
Acknowledgment
Table of Contents
List of Tables
List of Figures
1. Introduction
2. Related Works
2.1. Sentiment Analysis Model
2.2. Machine Learning Approaches
2.2.1. Supervised Learning
2.2.2. Deep Learning
2.2.2.1. Long Short-Term Memory (LSTM)
2.2.2.2. Convolutional Neural Network (CNN)
3. System Architecture
3.1. Overall Architecture
3.2. Data Collection
3.3. Preprocessing
3.4. Feature Extraction
3.4.1. Pre-trained Word Embedding
3.4.2. Zero Padding
3.4.3. Schema of the Process for Obtaining Features
3.5. Neural Network
3.6. Improving Performance and Reducing Overfitting for the Neural Network
3.6.1. Learning Rate
3.6.2. Batch Normalization
3.6.3. Network Structure
3.6.4. Dropout
3.6.5. Optimization and Loss Function
4. Experimental Results and Discussion
4.1. LSTM Configuration and Results
4.2. BLSTM Configuration and Results
4.3. CNN-LSTM Configuration and Results
4.4. CNN-BLSTM Configuration and Results
4.5. CNN Configuration and Results
4.6. Evaluation Measures
4.7. Configuration Details
4.8. Training History for the Best-Performing Model
4.9. Samples of Positive and Negative Reviews for the Testing Result
4.10. The Test Result from the American Made Movie Review
4.11. Discussion
5. Conclusion and Future Work
References

