
Graduate Student: HUANG, Hsiao-Ting (黃筱婷)
Thesis Title: 改進提示學習訓練架構以偵測社交媒體文本之心理健康面向 (Improving Prompt-based Learning Framework for Mental Health Aspect Detection from Social Media Content)
Advisor: Koh, Jia-Ling (柯佳伶)
Oral Defense Committee: Chen, Arbee L. P. (陳良弼); Wu, Yi-Hung (吳宜鴻); Koh, Jia-Ling (柯佳伶)
Oral Defense Date: 2024/01/26
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of Publication: 2024
Graduation Academic Year: 112
Language: Chinese
Number of Pages: 63
Keywords: 提示學習 (prompt-based learning); 大規模預訓練語言模型 (large-scale pre-trained language model); 心理健康面向 (mental health aspect)
DOI URL: http://doi.org/10.6345/NTNU202400301
Thesis Type: Academic thesis
Abstract: Motivated by the need to analyze posters' mental health from Chinese social media text, this study investigates how to automatically detect a poster's mental health status from post content along three aspects: whether the poster shows a mental illness, whether the poster shows emotional problems, and whether help-seeking behavior appears. This thesis improves the training framework of prompt-based learning by proposing a method that incrementally enlarges and strategically selects the training data used to fine-tune the pre-trained masked language model, called the IS training strategy. In addition, to strengthen the model's ability to distinguish positive-class from negative-class data, a margin difference loss is incorporated as an extra loss term. Experiments were conducted on three mental health aspect datasets collected from a Taiwanese electronic bulletin board. The results show that combining the proposed IS training strategy raises the Precision of the aspect detectors trained by PET and iPET by at least 10%. Even when the number of training samples is halved, applying the IS training strategy still improves the Precision of each aspect detector by at least 5% over training on the full training set without the IS strategy, showing that the IS training strategy effectively selects, from the training set, the data that helps the model learn to classify correctly. On the test set simulating an open environment, the detection performance for each target aspect reaches 0.8 or higher, demonstrating the practicality of the proposed prompt-based learning framework for detecting mental health aspects in social media text.
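To make the approach concrete, the following is a minimal sketch of a PET-style cloze detector for a single mental health aspect, assuming the Hugging Face Transformers library and bert-base-chinese as the pre-trained masked language model. The cloze template, the verbalizer words 有/無, and the hinge form of the margin difference loss are illustrative assumptions, not the settings actually used in the thesis.

```python
# Hypothetical sketch of a PET-style aspect detector: the template,
# verbalizer, and loss form are assumptions for illustration only.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

# Verbalizer: one Chinese token per class (assumed word choices).
# Index 0 = positive class ("has the aspect"), index 1 = negative class.
VERBALIZER = ["有", "無"]

def class_logits(post: str) -> torch.Tensor:
    """Score a post with a cloze prompt; return the two verbalizer logits
    read off the [MASK] position of the masked language model."""
    prompt = f"{post}。這篇發文{tokenizer.mask_token}心理健康方面的問題。"
    enc = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    logits = model(**enc).logits[0, mask_pos]                    # (vocab_size,)
    return logits[tokenizer.convert_tokens_to_ids(VERBALIZER)]  # (2,)

def margin_difference_loss(logits: torch.Tensor, label: int,
                           margin: float = 1.0) -> torch.Tensor:
    """One plausible reading of the margin difference loss: the correct
    class must outscore the wrong class by at least `margin`."""
    gap = logits[label] - logits[1 - label]
    return torch.clamp(margin - gap, min=0.0)

# Example: a post labeled positive (label 0) for the emotional-problem aspect.
scores = class_logits("最近每天都睡不著,覺得活著好累。")
print(scores.tolist(), margin_difference_loss(scores, label=0).item())
```

Under the IS training strategy described in the abstract, confidence scores of this kind would presumably be used to rank candidate samples and gradually fold the selected ones into the next fine-tuning round; the sketch covers only the cloze scoring and the extra loss term.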

Table of Contents:

    Chapter 1 Introduction
        1.1 Research Motivation and Objectives
        1.2 Proposed Approach
        1.3 Thesis Organization
    Chapter 2 Literature Review
        2.1 Evolution of Text Classification Techniques
            2.1.1 Text Classification Methods Using Non-Neural-Network Models
            2.1.2 Text Classification Methods Using Neural Network Models
            2.1.3 Text Classification Tasks Using Large-Scale Pre-trained Language Models
        2.2 Prompt-based Learning
            2.2.1 The PET and iPET Model Training Frameworks
            2.2.2 Other Research Directions in Prompt-based Learning
        2.3 Mental Health Literacy
    Chapter 3 Problem Definition and Data Processing
        3.1 Problem Definition
        3.2 Sentence-Group Paragraph Data Transformation
    Chapter 4 Training Mental Health Aspect Detection Models with Prompt-based Learning
        4.1 Training Environment Settings for Prompt-based Learning
        4.2 Combining the Prompt-based Learning Framework with the IS Training Strategy
            4.2.1 Training in the Initial Fine-tuning Round
            4.2.2 Multi-round Fine-tuning of the MLM
            4.2.3 The Semi-supervised Learning Stage
        4.3 Candidate Post Filtering for Mental Health Aspect Detection
    Chapter 5 Experimental Evaluation and Discussion
        5.1 Datasets
            5.1.1 Labeled Datasets
            5.1.2 Open-Environment Test Dataset
        5.2 Experimental Parameter Settings
        5.3 Evaluation Metrics
        5.4 Experimental Design and Results on the Closed Labeled Datasets
        5.5 Experimental Design and Results on the Open-Environment Test Dataset
    Chapter 6 Conclusion and Future Research Directions
    References
    Appendix

References:

    Cepeda, N.J., Pashler, H., Vul, E., Wixted, J.T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132(3), 354-380.
    Chao, H., Lien, Y., Kao, Y., Tasi, I., Lin, H., & Lien, Y. (2020). Mental Health Literacy in Healthcare Students: An Expansion of the Mental Health Literacy Scale. International Journal of Environmental Research and Public Health, 17.
    Cui, G., Hu, S., Ding, N., Huang, L., & Liu, Z. (2022). Prototypical Verbalizer for Prompt-based Few-shot Tuning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, pages 7014–7024.
    Devlin, J., Chang, M., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4171–4186.
    Gao, J., Pantel, P., Gamon, M., He, X., & Deng, L. (2014). Modeling interestingness with deep neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pages 2-13.
    Gao, T., Fisch, A., & Chen, D. (2021). Making Pre-trained Language Models Better Few-shot Learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pages 3816–3830.
    Goldberg, Y. (2016). A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research, 57, 345-420.
    Hambardzumyan, K., Khachatrian, H., & May, J. (2021). WARP: Word-level Adversarial ReProgramming. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pages 4921–4933.
    Harris, Z.S. (1954). Distributional Structure. WORD, 10(2-3), 146-162.
    Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9, 1735-1780.
    Jorm, A.F., Korten, A.E., Jacomb, P.A., Christensen, H., Rodgers, B., & Pollitt, P.A. (1997). "Mental health literacy": a survey of the public's ability to recognise mental disorders and their beliefs about the effectiveness of treatment. Medical Journal of Australia, 166, 182-186.
    Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pages 655–665.
    Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP'14).
    Kluger, A.N., & Denisi, A.S. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119, 254-284.
    Kutcher, S., Wei, Y., & Coniglio, C. (2016). Mental health literacy: Past, present, and future. In The Canadian Journal of Psychiatry / La Revue canadienne de psychiatrie, 61(3), pages 154–158.
    LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
    Liu, C., Sheng, Y., Wei, Z., & Yang, Y. (2018). Research of Text Classification Based on Improved TF-IDF Algorithm. In 2018 IEEE International Conference of Intelligent Robotic and Control Engineering (IRCE), pages 218-222.
    Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2021). Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Computing Surveys, 55, 1-35.
    Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., & Tang, J. (2021). GPT Understands, Too. ArXiv, abs/2103.10385.
    Mojtabai, R., Evans-Lacko, S., Schomerus, G., & Thornicroft, G. (2016). Attitudes Toward Mental Health Help Seeking as Predictors of Future Help-Seeking Behavior and Use of Mental Health Treatments. Psychiatric Services, 67(6), 650-657.
    O’Connor, M., & Casey, L.M. (2015). The Mental Health Literacy Scale (MHLS): A new scale-based measure of mental health literacy. Psychiatry Research, 229, 511-516.
    Salton, G., & Buckley, C. (1988). Term-Weighting Approaches in Automatic Text Retrieval. Information Processing & Management, 24(5):513–523.
    Scao, T.L., & Rush, A.M. (2021). How many data points is a prompt worth? In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2627–2636.
    Schick, T., Schmid, H., & Schütze, H. (2020). Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification. In Proceedings of the 28th International Conference on Computational Linguistics (COLING'20), pages 5569–5578.
    Schick, T., & Schütze, H. (2021). Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (EACL'21), pages 255–269.
    Shin, T., Razeghi, Y., Logan IV, R.L., Wallace, E., & Singh, S. (2020). Eliciting Knowledge from Language Models Using Automatically Generated Prompts. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4222–4235.
    Tai, K.S., Socher, R., & Manning, C.D. (2015). Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pages 1556–1566.
    Turcan, E., & McKeown, K. (2019). Dreaddit: A Reddit Dataset for Stress Analysis in Social Media. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019), pages 97–107.
    Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention is All you Need. In Advances in Neural Information Processing Systems, pages 5998–6008.
    Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S.R. (2018). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 353–355, Brussels, Belgium. Association for Computational Linguistics.
    Zhu, X., Sobhani, P., & Guo, H. (2015). Long Short-Term Memory Over Recursive Structures. In Proceedings of the International Conference on Machine Learning, pages 1604–1612.
