
Author: 陳國泰
Kuotai Chen
Thesis title: 語音辨識軟體回饋對發音學習成效之影響研究
An Investigation on the Impact of ASR Software Feedback on EFL College Students' Pronunciation Learning
Advisor: 陳浩然
Chen, Hao-Jan
Degree: Master
Department: Department of English
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: English
Number of pages: 104
Keywords: pronunciation, pronunciation learning, ASR software, ASR software feedback, intonation, learning strategies
Thesis type: Academic thesis
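    This study investigates the effect of feedback from automatic speech recognition (ASR) software on college students' pronunciation learning. Twenty-three students from Freshman English classes at National Taiwan Normal University took part as the experimental group, and another twenty students from the same Freshman English classes served as the control group. Both groups took a pre-test of twenty sentences containing easily confused vowels, and their read-aloud productions were recorded. The experimental group then completed five weeks of pronunciation practice with the software MyET, while the control group received no pronunciation training during those five weeks. Feedback forms, questionnaires, and interviews were also used. In addition, the students' practice sessions with the software were captured with the screen recording software Camtasia Studio, which recorded the audio of their practice at the same time, for later observation. After the training, both groups took a post-test and a new generalization test. Finally, three raters scored all the recordings to determine whether the two groups performed differently on the tests.
    The results show that, after five weeks of pronunciation practice, the experimental group improved significantly in both segmental production and intonation on the post-test and the generalization test. The evaluation of the feedback types showed that the students' confidence in each type changed as practice time increased. By the end of the study, the students regarded feedback on pronunciation, pronunciation diagnosis, scores, feedback on intonation, and pitch contours as the most effective for pronunciation learning, whereas spectrograms, feedback on rhythm, and feedback on volume were rated ineffective. The results therefore indicate that students can indeed improve their pronunciation or intonation through certain types of feedback. More importantly, the learners became aware of their own pronunciation problems during practice and developed strategies to correct them.
    Finally, several suggestions are offered for future research. First, learners should be taught how to interpret the different feedback types before pronunciation practice begins. In addition, the practice sentences should be as simple as possible so that learners do not run into reading difficulties. Future studies are advised to recruit more participants, extend the training period, and use the same number of sentences in each test. A follow-up test could also be added to measure how long the training effect lasts. If another speech program with similar feedback were included for comparison, the effectiveness of the different feedback types could be supported by firmer evidence.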

    The current study investigates the impact of ASR software feedback on EFL college students’ pronunciation learning. Twenty-three students from Freshman English classes at NTNU participated as the experimental group and twenty as the control group. Both groups took a pre-test containing twenty assigned sentences with vowels that Chinese speakers tend to confuse, and their speech productions were recorded. Following the pre-test, the experimental group received five weeks of pronunciation training with the software MyET, whereas the control group did not. The experimental group also completed feedback forms, questionnaires and interviews. In addition, the training sessions were observed using the screen recording software Camtasia Studio. After the training, both groups took a post-test and a generalization test. Three raters then assessed the collected recordings and gave each a segmental score and a prosodic score.
    The findings revealed that the five-week pronunciation training led to significant improvement in the experimental group’s segmental accuracy and prosodic production in both the post-test and the generalization test. In the evaluation of the different feedback types, practice time was found to affect the participants’ confidence in them. At the end of the study, the participants considered feedback on pronunciation, pronunciation diagnosis, scores, feedback on intonation, and pitch contours more effective for pronunciation learning than spectrograms, feedback on rhythm, and feedback on volume. Learners therefore benefited from some of these feedback types in improving either their segmental accuracy or their prosodic production. Most importantly, learners could perceive their pronunciation errors and had developed strategies to correct them by the end of the training.
    It is therefore suggested that learners be given extra instruction on how to interpret the feedback before the pronunciation training starts. In addition, the pronunciation task should be as simple as possible so that learners have no difficulty reading the sentences. For future studies, a larger number of participants, a longer training period, and an equal number of sentences with proximate sounds in the pre-test, post-test, and generalization test are recommended. A follow-up study examining how long the training effect lasts would also be worthwhile. Moreover, another ASR program with a similar design could be employed in the same study to verify whether feedback of the same types indeed works for language learners even when different software is used.

    TABLE OF CONTENTS
    CHAPTER ONE INTRODUCTION
        Background and Motivation
        Purposes of the Study
        Significance of the Study
        Definition of Terms
        Abbreviation of Terms
    CHAPTER TWO LITERATURE REVIEW
        The Significance of Pronunciation Instruction and Acquisition
        Teaching and Learning of Pronunciation
        Foreign Accent, Comprehensibility and Intelligibility
        Focus on Suprasegmental Features
        ASR Software and Pronunciation Learning
        The Pedagogical Design of ASR Software
        ASR Software Feedback to Language Learners
        The Effectiveness of ASR Software Feedback in Pronunciation Learning
        Learners’ Perception and Incorporation of ASR Software Feedback
        Learners’ Generalization to Improvement in New Speech Production
        Summary
        Research Questions
    CHAPTER THREE METHOD
        Participants
        Procedure of the Study
        Pilot Study
        Pre-test and Post-test: Read-aloud Task
        ASR Software Training Session
        Generalization Test
        Instruments
        The Reading Tasks in the Pre-test, Post-test, and the Generalization Test
        ASR Software: MyET (My English Teacher)
        Feedback Forms
        Training Observations
        Questionnaires
        Interviews
        Raters and the Scoring
        Data Analysis
        The Scoring of Pre-test, Post-test, and the Generalization Test
        The Feedback Forms and the Screen Recordings
        The Participants’ Responses to Questionnaires and Interviews
        Summary of the Method
    CHAPTER FOUR RESULTS
        Results of the Pre-test, Post-test and the Generalization Test
        The Participants’ Mean Segmental Scores
        The Participants’ Mean Prosodic Scores
        The Participants’ Responses to Feedback Forms and the Questionnaires
        Participants’ Perception Concerning Pronunciation Learning
        Participants’ Evaluation of Different Feedback Types
        Participants’ General Comments on the Training with MyET
        Interviews
        Training Observations
        Recordings of the First-Week Training
        Recordings of the Last-Week Training
        Summary of the Results
    CHAPTER FIVE DISCUSSION
        The Participants’ Segmental Improvement between the Pre-test and the Post-test
        The Participants’ Segmental Improvement between the Pre-test and the Generalization Test
        The Participants’ Prosodic Improvement between the Pre-test and the Post-test
        Comparison of the Participants’ Prosodic Improvement between the Pre-test and the Generalization Test
        The Participants’ Perception and Interpretation of the Different Types of ASR Software Feedback
        The Participants’ Incorporation of the Different Types of ASR Software Feedback into Pronunciation Learning
        Summary of Discussion
    CHAPTER SIX CONCLUSION
        Summary of Findings
        Pedagogical Implications
        Limitations of the Study and Suggestions for Future Research
    REFERENCES
    Appendix A Sentences in the Pre-test, Post-test and the Generalization test
    Appendix B Feedback Forms
    Appendix C Questionnaire
    Appendix D Interview Questions
    Appendix E The Scoring Rubrics of The Pre-test, Post-test and the Generalization Test

    LIST OF TABLES
        Table 1 Correlation of the Raters’ Segmental Scores
        Table 2 Correlation of the Raters’ Prosodic Scores
        Table 3 Mean Scores of the Participants’ Segmental Production
        Table 4 Paired-Samples T-test Results of the Participants’ Segmental Scores
        Table 5 Independent-Samples T-test Results of the Participants’ Segmental Scores
        Table 6 Mean Scores of the Participants’ Prosodic Production
        Table 7 Paired-Samples T-test Results of the Participants’ Prosodic Production
        Table 8 Independent-Samples T-test Results of the Participants’ Prosodic Scores
        Table 9 The Participants’ Difficulty in Speaking English
        Table 10 The Participants’ Methods of Practicing Speaking English
        Table 11 The Participants’ Rankings of Abilities Concerning English Speaking
        Table 12 Chi-Square Result of the Main Concern of Speaking Ability
        Table 13 The Participant’s First-Time Evaluation of Different Feedback Types
        Table 14 The Participant’s Second-Time Evaluation of Different Feedback Types
        Table 15 Chi-Square Result of the Participant’s First-Time Evaluation of Different Feedback Types
        Table 16 Chi-Square Result of the Participant’s Second-Time Evaluation of Different Feedback Types
        Table 17 The Participants’ General Comments on the Training Process

    LIST OF FIGURES
        Figure 1 Feedback Display in the Software MyET
        Figure 2 Flowchart of the Procedure
        Figure 3 The Scores and Spectrographic Display
        Figure 4 Feedback on Pronunciation and Pronunciation Diagnosis
        Figure 5 Feedback on Intonation
        Figure 6 Feedback on Rhythm
        Figure 7 Feedback on Volume
        Figure 8 Learning Records
        Figure 9 Feedback on Pronunciation and Pronunciation Diagnosis
        Figure 10 Deterioration in the Rhythm Score
        Figure 11 Feedback on Intonation and the Pitch Contour
        Figure 12 Feedback on Rhythm
        Figure 13 Feedback on Volume
        Figure 14 The Mean Segmental Scores
        Figure 15 The Mean Prosodic Scores
        Figure 16 Animated Demonstration of Pronunciation Diagnosis

