
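Author: Lin, Jung (林融)
Thesis title: Sarcasm Detection in Mandarin Online Discourse (網路論壇中文諷刺意圖偵測)
Advisor: Chen, Alvin Cheng-Hsien (陳正賢)
Oral defense committee: Chang, Yu-Yun (張瑜芸); Hsu, Chan-Chia (許展嘉); Chen, Alvin Cheng-Hsien (陳正賢)
Oral defense date: 2023/03/22
Degree: Master
Department: Department of English
Publication year: 2023
Academic year of graduation: 111 (ROC calendar)
Language: English
Number of pages: 102
Chinese keywords: 諷刺特徵 (sarcasm cues), 諷刺偵測 (sarcasm detection), 機器學習 (machine learning), 網路論壇 (online forums), 計算語用學 (computational pragmatics)
English keywords: linguistically motivated sarcasm cues, sarcasm detection, machine learning, online forums, computational pragmatics
Research method: machine learning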
DOI URL: http://doi.org/10.6345/NTNU202300388
Thesis type: Academic thesis
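  • Chinese abstract (translated): This study seeks sarcasm cues that are both effective for automatic sarcasm detection and grounded in linguistic theory. Using a corpus of COVID-19-themed online forum posts, we analyze comment content to examine how forum users perceive that a remark is sarcastic. If a particular linguistic expression leads forum users to perceive that a commenter is being sarcastic toward others, we call that expression a "sarcasm cue." These cues fall into comment-level cues and contextual-level cues, and they can be further grouped by the strategy used to signal sarcasm. We use the cues as the basis for classification in machine learning experiments. The results show that the cues are effective for building sarcasm detection models and improve the performance of bag-of-words models. They also show that comment-level and contextual-level cues are equally important for recognizing sarcastic remarks, which suggests that the cues have two properties. First, some cues are specific expressions closely associated with sarcasm. Second, some cues span the preceding content and infer the speaker's/writer's sarcastic intent from the prior context. We further identify effective sarcasm cues and their strategies. Finally, in a sentiment analysis of sarcastic comments, we find that the polarity contrast between a comment and its context (such as the post or earlier comments) helps identify sarcasm, and that sarcasm is not necessarily a positive utterance in a negative context but may be a negative or neutral utterance in a positive context. This result indicates that sarcasm identification should attend more closely to the different ways sarcasm is expressed. The study provides effective sarcasm cues, a sarcasm annotation procedure, and a basis for feature selection, and its cues and strategy-based categorization help identify sarcasm in Mandarin.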

    This study investigated linguistically motivated cues for automatically identifying sarcastic utterances, using a COVID-related news corpus. The cues were categorized into comment-level and contextual-level cues, and could be further categorized by how they lead the hearer/reader to sense that sarcasm is present. We used these linguistically motivated sarcasm cues as model inputs in machine learning experiments to test their effectiveness. The experiments showed that the cues were effective in identifying sarcasm and improved the performance of bag-of-words models. Both comment-level and contextual-level cues were important for identifying sarcastic utterances, indicating that the models could identify both utterances strongly associated with sarcasm and utterances whose sarcasm depends on prior context. Effective strategies and cues were identified. Moreover, polarity incongruity was found to be helpful in identifying sarcastic utterances, which were not necessarily positive in a negative context but could be negative or neutral in a positive context. This study contributes to the identification of Mandarin sarcastic utterances by offering effective cues and outlining helpful procedures for sarcasm annotation and cue selection.
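As a rough illustration of the idea in the abstract, the sketch below appends one contextual-level cue (polarity incongruity between a comment and its context) to a bag-of-words vector. It is not the thesis's implementation: the toy lexicon, the function names (`polarity`, `polarity_incongruity`, `bag_of_words`, `featurize`), and the rule that a neutral context never counts as incongruous are all assumptions made for this example, and English tokens stand in for segmented Mandarin tokens.

```python
from collections import Counter

# Toy sentiment lexicon (hypothetical; not the resource used in the thesis).
POSITIVE = {"great", "wonderful", "love"}
NEGATIVE = {"terrible", "awful", "hate"}


def polarity(tokens):
    """Return +1, -1, or 0 based on lexicon hits in the token list."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return (score > 0) - (score < 0)


def polarity_incongruity(comment_tokens, context_tokens):
    """Return 1 if the context is polar and the comment's polarity differs from it.

    Assumption of this sketch: a neutral context is never incongruous, while a
    neutral comment in a polar context is.
    """
    context_pol = polarity(context_tokens)
    return int(context_pol != 0 and polarity(comment_tokens) != context_pol)


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def featurize(comment_tokens, context_tokens, vocabulary):
    """Concatenate bag-of-words counts with the contextual cue feature."""
    return bag_of_words(comment_tokens, vocabulary) + [
        polarity_incongruity(comment_tokens, context_tokens)
    ]


vocabulary = ["great", "lockdown", "again"]
# A positive comment replying to negative context: the cue feature is 1.
print(featurize(["great", "lockdown", "again"], ["terrible", "news"], vocabulary))
```

The resulting vectors could be passed to any standard classifier. Keeping the cue as a separate trailing column makes it straightforward to compare a bag-of-words-only model against one with cues added, which is the kind of comparison the abstract describes.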

    ACKNOWLEDGEMENTS
    CHINESE ABSTRACT
    ENGLISH ABSTRACT
    TABLE OF CONTENTS
    LISTS OF TABLES
    LISTS OF FIGURES
    Chapter 1 Introduction
        1.1 Research background
        1.2 Research motivation and research gaps
        1.3 Significance of the study
        1.4 Organization of the thesis
    Chapter 2 Literature review
        2.1 Defining sarcasm
            2.1.1 Understanding implicatures
            2.1.2 Understanding sarcasm as a form of irony
            2.1.3 Sarcasm in Mandarin
        2.2 Linguistically motivated sarcasm cues
        2.3 Computational methods of sarcasm detection
            2.3.1 English sarcasm detection summaries
            2.3.2 Mandarin sarcasm detection summaries
        2.4 Interim summary
    Chapter 3 Methodology
        3.1 Data collection
        3.2 Data preprocessing
            3.2.1 Data cleaning
            3.2.2 Word segmentation
        3.3 Sarcasm annotation
        3.4 Operational definition of the features
            3.4.1 Comment level cues
            3.4.2 Contextual level cues
        3.5 Experimental design
        3.6 Model training
        3.7 Model evaluation
    Chapter 4 Results
        4.1 The effectiveness of sarcasm cues
        4.2 The effectiveness of sarcasm cues with ngrams
        4.3 Feature importance analysis of sarcasm cues in Model CMT and Model BCMT
        4.4 Interim summary
    Chapter 5 General Discussion
        5.1 The need of linguistically motivated sarcasm cues
        5.2 Linguistically motivated sarcasm cues and ngrams
        5.3 The effectiveness of antiphrasis-using cues
    Chapter 6 Conclusion
        6.1 Summary
        6.2 Limitation and future research
    REFERENCES
    APPENDIX


    Electronic full text: public release delayed until 2028/03/28