
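Author: Yeh, Yi-Ting (葉伊婷)
Thesis title: An Evaluation Study on Using an Automatic Speech Recognition System to Identify EFL Students' Pronunciation Problems
Original Chinese title: 以語音辨識系統診斷高中生發音困難之評估研究
Advisor: Chen, Hao-Jan (陳浩然)
Degree: Master
Department: Department of English
Year of publication: 2019
Academic year of graduation: 107 (ROC academic year, 2018-2019)
Language: English
Number of pages: 122
Keywords (translated from Chinese): speech recognition, learners of English as a foreign language, speaking, difficult pronunciation, comparison of machine and human ratings, perceptions
English keywords: Automatic speech recognition, EFL learners, Speaking, Difficult pronunciation, ASR and human ratings, Perceptions toward ASR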
DOI URL: http://doi.org/10.6345/NTNU201900721
Thesis type: Academic thesis
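  • For learners of English as a foreign language (EFL), speaking ability has always been an important skill. Because of many constraints, however, the cultivation of speaking ability is often neglected and seldom taught. With the development of automatic speech recognition (ASR) technology, teachers now have more opportunities to train students' oral skills. This study explores how learners interact with LearnMode, a free website equipped with ASR technology. The website lets teachers design their own exercises and helps students gradually become accustomed to speaking English in a low-anxiety environment. While learners practice with the ASR system, the system also diagnoses their problematic pronunciation and provides immediate corrective feedback through color highlighting or comments. A total of 66 senior high school students took part in the study and completed 20 speaking tasks designed around easily confused vowels and consonants. After completing the tasks, the participants filled out a questionnaire, and 10 of them also took part in one-on-one interviews that explored in depth their perceptions of and attitudes toward speech recognition technology.
      The results show a high degree of agreement in error detection between the ASR tool and two teachers. In four of the five units randomly sampled from the twenty, the ASR tool and the teachers showed over 80 percent similarity; that is, 85 percent of the mispronounced words identified separately by the ASR tool and by the teachers were the same. The questionnaire and interview results also indicate that students enjoyed having the ASR system's assistance when practicing speaking and were more willing to practice aloud in an environment free of pressure from peers and teachers. However, they still wished to have a teacher at their side to help them correct problematic pronunciation promptly. Regarding the system's immediate feedback mechanism, participants found the color highlighting helpful, but they also hoped for further guidance on how to improve. It is hoped that these findings will make a modest contribution to teachers who wish to help students improve their speaking ability and to researchers who plan to develop speech recognition technology for language learning.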

    English speaking ability is widely recognized as an important skill for EFL learners. Because of many constraints, however, the cultivation of speaking skills is often neglected, and speaking is seldom taught at school. With the development of automatic speech recognition (ASR) technologies, teachers can provide students with more opportunities to train their oral abilities. The current study investigates how learners interact with LearnMode, a website enhanced by automatic speech recognition technology that allows teachers to create their own speaking exercises and helps EFL learners become accustomed to speaking English in a low-anxiety environment. At the same time, the ASR system diagnoses learners' problematic pronunciation and offers immediate corrective feedback in the form of color highlighting and comments. A total of 66 senior high school students were invited to complete 20 tasks targeting difficult pairs of vowels and consonants. A questionnaire and one-on-one interviews were also administered to probe the learners' perceptions of and attitudes toward the speech technologies.
       The results indicate a high degree of agreement in error detection between the automatic speech recognition system and human raters. In four of five units randomly selected from the twenty, over eighty-five percent of the mispronounced words located by the ASR technology and by the teachers were the same. The results also show that learners enjoyed the ASR assistance and were more willing to speak English, but they still wanted teachers to help them refine their problematic sounds. With regard to the immediate feedback mechanism, participants considered the color highlighting helpful, but they wanted further instructions on how to make adjustments. These findings can serve as useful information for teachers who would like to incorporate speaking enhancement into their teaching and for researchers who intend to develop better ASR technologies for language teaching and learning.
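    The agreement figure reported above can be illustrated with a small sketch. The abstract does not state how the percentage was calculated, so the formula here (shared words divided by all words flagged by either side) is an assumption, and the function name `overlap_report` and the word lists are hypothetical examples, not data from the thesis.

```python
def overlap_report(asr_flagged, teacher_flagged):
    """Compare two lists of words flagged as mispronounced.

    Assumption: agreement = shared words / all words flagged by either
    rater. The thesis abstract does not specify its formula.
    """
    asr = {word.lower() for word in asr_flagged}
    teacher = {word.lower() for word in teacher_flagged}
    shared = asr & teacher   # flagged by both the ASR system and the teacher
    union = asr | teacher    # flagged by at least one of them
    return {
        "shared": sorted(shared),
        "asr_only": sorted(asr - teacher),
        "teacher_only": sorted(teacher - asr),
        # If nobody flagged anything, treat the two raters as fully agreeing.
        "agreement": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical word lists for one practice unit (not data from the thesis).
asr_words = ["think", "ship", "very", "bed", "rice"]
teacher_words = ["think", "ship", "very", "bed", "light"]

report = overlap_report(asr_words, teacher_words)
print(f"Shared words: {report['shared']}")
print(f"Agreement: {report['agreement']:.0%}")
```

    Other definitions are equally plausible, for example dividing the shared words by the number of words the teachers flagged, and they would yield different percentages for the same data.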

    Table of Contents
    CHAPTER ONE: Introduction
      1.1 Research Background
      1.2 Problem Statement
      1.3 Purpose of Study
      1.4 Significance of Study
      1.5 Definition of Terms
    CHAPTER TWO: Literature Review
      2.1 Automatic Speech Recognition Technology
        2.1.1 A Revolutionary Character in CAPT
        2.1.2 The Discrepancy Issue between ASR Systems and Human Raters
      2.2 Automatic Speech Recognition for Pronunciation Instruction
        2.2.1 Limitations in the Past
        2.2.2 An Affectionate and Lovely Beginning
        2.2.3 Deeper Investigation into the Principles for ASR-based CAPT
        2.2.4 Diving into Procedures in ASR-based CAPT
      2.3 ASR-based English Learning Programs and Its Exercise Design
      2.4 Different Types of Feedback in ASR Technologies
        2.4.1 Summary of the Feedback Types in ASR Programs
      2.5 Automatic Speech Scoring and Its Accuracy
      2.6 Learners’ Perceptions of Using Automatic Speech Technologies
      2.7 The Summary of Literature Review
      2.8 Research Questions
    CHAPTER THREE: Methodology
      3.1 Participants
      3.2 Procedure of Study
        3.3.1 The Oral Practice Task: 10 pairs of difficult pronunciation to EFL learners
        3.3.2 LearnMode, a free learning website benefiting both teacher & students
        3.3.3 Feedback form, Questionnaire and Interview
        3.3.4 Human Raters and the Scoring
      3.4 Data analysis procedure
        3.4.1 Similarities and differences between the result of ASR and human raters
        3.4.2 The feedback form and responses to the questionnaire and interviews
    CHAPTER FOUR: Results and Discussion
      4.1 Results Evaluated by Human Raters and the ASR Technology
      4.2 The Problematic Words Ranked within 100th Detected by the ASR System
        4.2.1 The 100th Problematic Words vs. the Basic 2000 Vocabulary of Junior High
      4.3 The Relation between Participants’ Performance and the Text Nature
      4.4 The participants’ responses to the questionnaires
        4.4.1 Background information
        4.4.2 Participants’ perceptions of four English skills
        4.4.3 Participants’ emotions/feelings while dealing with English speaking
        4.4.4 Participants’ perceptions of feedback types and overall comments
        4.4.5 Participants’ perceptions of the ASR system (open-ended question)
        4.4.6 One-on-One Interviews
    CHAPTER FIVE: Conclusion
      5.1 Brief Summary and Pedagogical Implications
      5.2 The Limitations and Future Research Directions
    REFERENCES
    APPENDICES
      Appendix A. The Oral Practice Materials
      Appendix B. Feedback Form
      Appendix C. Questionnaire
      Appendix D. One-on-one Open-ended Interview Questions
      Appendix E. Mispronounced Wordlist and Its Error Rate

