
Student: 游兆賢 (Yu, Chao-Hsien)
Thesis Title: 視聽落差對學生口譯員的影響:以自動語音辨識輔助之同步口譯英譯中為例
English Title: The Effects of Audio-visual Input Discrepancies on Student Interpreters in ASR-assisted Simultaneous Interpreting from English into Chinese
Advisor: 陳子瑋 (Chen, Tze-Wei)
Committee Members: 陳子瑋 (Chen, Tze-Wei), 汝明麗 (Ju, Ming-Li), 張梵 (Chang, Albert L.)
Oral Defense Date: 2024/07/23
Degree: Master's
Department: 翻譯研究所 (Graduate Institute of Translation and Interpretation)
Publication Year: 2024
Academic Year of Graduation: 112
Language: English
Number of Pages: 97
Keywords: automatic speech recognition, simultaneous interpreting, inaccurate ASR captions, Colavita visual dominance effect
Research Methods: experimental design, semi-structured interviews
DOI URL: http://doi.org/10.6345/NTNU202401106
Thesis Type: Academic thesis
Access Count: 177 views; 5 downloads
    Abstract (translated from the Chinese): This study adopted a mixed-methods design to examine how inaccurate captions affect student interpreters' performance in automatic speech recognition (ASR)-assisted simultaneous interpreting. The participants were 14 interpreting students enrolled in a graduate institute of translation and interpretation in Taiwan. Each participant performed two English-to-Chinese simultaneous interpreting tasks, both accompanied by ASR captions: in one task the captions were fully accurate, while in the other they contained errors. Ten checkpoints were set in each speech and used to score interpreting performance for quantitative analysis. A stimulated recall interview was conducted immediately after each task to examine mistranslations in the participants' output caused by inaccurate captions and to explore how participants interacted with the ASR captions. The interview responses were analyzed through inductive coding to investigate the reasons behind caption-induced mistranslations, including strategic decisions and the Colavita visual dominance effect.
    The results show that inaccurate captions had a negative effect on the student interpreters' performance, increasing the frequency of mistranslations. The interview data further reveal that caption-induced mistranslations did not stem from a single factor, but from the interaction between strategic decisions and the Colavita visual dominance effect. These findings have practical value for both interpreters and interpreting instructors as a reference for the application and teaching of ASR-assisted simultaneous interpreting.

    Abstract (English): This study investigated how inaccurate captions generated by automatic speech recognition (ASR) technology affect student interpreters' performance in simultaneous interpreting (SI) from English into Chinese. A mixed-methods approach was employed, involving fourteen graduate-level student interpreters. Each participant completed two SI tasks: one with accurate captions and another with mistranscribed ones. Following each task, a stimulated recall interview explored the reasons behind errors caused by inaccurate captions and how participants interacted with the ASR tool. The interpreting tasks included ten checkpoints against which participants' output was scored for quantitative analysis. Interview responses were then coded inductively, focusing on potential reasons for errors such as strategic choices and the Colavita visual dominance effect.
    The results indicated that inaccurate captions had a negative impact on interpreting performance, leading to imported errors. Interview responses suggested a more nuanced picture: participants' strategic choices and the Colavita visual dominance effect appeared to be causally linked in producing the errors in question, rather than operating as independent factors. These findings offer valuable insights for interpreters and instructors alike, informing the application and pedagogy of ASR-assisted simultaneous interpreting.
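    The quantitative design described above (ten scored checkpoints per speech, with each participant interpreting once with accurate captions and once with mistranscribed ones) lends itself to a paired, within-subject comparison. The sketch below only illustrates that analysis shape and is not the thesis's actual scoring script: the simulated checkpoint scores, the 0/1 scoring rule, and the choice of a Wilcoxon signed-rank test are assumptions made for the example.

    # Minimal sketch of a paired, within-subject comparison of checkpoint scores.
    # All data and the scoring rule below are illustrative assumptions; the thesis's
    # actual rubric and statistical test may differ.
    import numpy as np
    from scipy.stats import wilcoxon

    rng = np.random.default_rng(0)

    N_PARTICIPANTS = 14
    N_CHECKPOINTS = 10  # ten checkpoints per speech, scored here as 1 (acceptable) or 0 (error)

    # Hypothetical per-checkpoint scores: rows = participants, columns = checkpoints.
    accurate_captions = rng.binomial(1, 0.85, size=(N_PARTICIPANTS, N_CHECKPOINTS))
    inaccurate_captions = rng.binomial(1, 0.65, size=(N_PARTICIPANTS, N_CHECKPOINTS))

    # Collapse to one score per participant per condition (number of correct checkpoints).
    acc_totals = accurate_captions.sum(axis=1)
    inacc_totals = inaccurate_captions.sum(axis=1)

    # Paired non-parametric test, suitable for a small sample (n = 14) of
    # within-subject differences whose distribution is unknown.
    stat, p_value = wilcoxon(acc_totals, inacc_totals)

    print(f"Mean score with accurate captions:   {acc_totals.mean():.2f} / {N_CHECKPOINTS}")
    print(f"Mean score with inaccurate captions: {inacc_totals.mean():.2f} / {N_CHECKPOINTS}")
    print(f"Wilcoxon signed-rank: W = {stat:.1f}, p = {p_value:.4f}")

    In the actual study, checkpoint-level scores would come from raters applying the scoring criteria (Section 4.1.2) rather than from simulated data; the paired structure, with each participant serving as their own control, is what such a test exploits.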

    Table of Contents:
    Acknowledgements i
    摘要 (Chinese Abstract) ii
    Abstract iii
    Chapter 1 Introduction 1
    Chapter 2 Literature Review 6
      2.1 Simultaneous Interpreting 6
        2.1.1 Simultaneous Interpreting as a Cognitive Task 6
        2.1.2 Problem Triggers 9
      2.2 Multi-modal Processing 11
        2.2.1 Benefits and Drawbacks 12
      2.3 Computer-assisted Interpreting (CAI) 13
        2.3.1 Automatic Speech Recognition in SI 14
        2.3.2 Audio-visual Discrepancies 17
    Chapter 3 Methods 21
      3.1 Participants 21
      3.2 Stimuli 22
        3.2.1 Designed Mistranscriptions 24
        3.2.2 Software 26
      3.3 Pilot Study 27
      3.4 Procedure and Data Collection 28
        3.4.1 Experiment Setup & Instruction 28
        3.4.2 Interpreting Tasks 29
        3.4.3 Stimulated Recall Interview 29
      3.5 Data Analysis 32
    Chapter 4 Results 34
      4.1 Quantitative Analysis 34
        4.1.1 ASR-generated Captions 35
        4.1.2 Scoring Criteria 36
        4.1.4 Inter-group and Intra-group Analyses 43
      4.2 Qualitative Analysis 47
        4.2.1 Interview Responses 47
        4.2.2 Advantages and Disadvantages of ASR-assisted SI 48
        4.2.3 Interference of Audiovisual Discrepancies 57
        4.2.4 Reasons behind Imported Errors in ASR-assisted SI 61
      4.3 Summary 72
    Chapter 5 Discussion and Conclusion 73
      5.1 The Causality Between Strategic Choices and the Visual Dominance Effect 74
      5.2 The Effectiveness of ASR-captions as a Visual Aid in SI 76
      5.3 Findings 79
      5.4 Limitations 80
      5.5 Future Research Directions and Contribution 81
    References 84
    Appendix A: Speech Transcript 89
    Appendix B: Speech Summary & Glossary 94
    Appendix C: Experiment and Interview Guide 96
    Appendix D: Informed Consent Form 97

    Chang, C.-C., & Schallert, D. (2007). The impact of directionality on Chinese/English simultaneous interpreting. Interpreting, 9, 137-176. https://doi.org/10.1075/intp.9.2.02cha
    Cheung, A., & Li, T. (2023). Machine-aided interpreting: An experiment of automatic speech recognition in simultaneous interpreting. 104, 1-20.
    Chmiel, A., Janikowski, P., & Lijewska, A. (2020). Multimodal processing in simultaneous interpreting with text. Target. International Journal of Translation Studies, 32(1), 37-58. https://doi.org/10.1075/target.18157.chm
    Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16(2), 409-412. https://doi.org/10.3758/BF03203962
    Collard, C., & Defrancq, B. (2018). Predictors of Ear-Voice Span, a corpus-based study with special reference to sex. Perspectives, 27, 1-24. https://doi.org/10.1080/0907676X.2018.1553199
    Defrancq, B., & Fantinuoli, C. (2020). Automatic speech recognition in the booth: Assessment of system performance, interpreters' performances and interactions in the context of numbers. Target, 33. https://doi.org/10.1075/target.19166.def
    Desmet, B., Vandierendonck, M., & Defrancq, B. (2018). Simultaneous interpretation of numbers and the impact of technological support. https://doi.org/10.5281/zenodo.1493281
    Errattahi, R., El Hannani, A., & Ouahmane, H. (2018). Automatic speech recognition errors detection and correction: A review. Procedia Computer Science, 128, 32-37. https://doi.org/10.1016/j.procs.2018.03.005
    Fantinuoli, C. (2017). Speech recognition in the interpreter workstation.
    Fantinuoli, C. (2021). Conference interpreting and new technologies. In The Routledge handbook of conference interpreting (pp. 508-522). https://doi.org/10.4324/9780429297878-44
    Fantinuoli, C., & Montecchio, M. (2022). Defining maximum acceptable latency of AI-enhanced CAI tools.
    Gile, D. (1997). Conference interpreting as a cognitive management problem. In J. Danks, G. Shreve, S. Fountain, & M. McBeath (Eds.), Cognitive processes in translation and interpreting (pp. 196-214). Sage.
    Gile, D. (2009). Basic concepts and models for interpreter and translator training: Revised edition. https://doi.org/10.1075/btl.8
    Gile, D. (2017). Testing the Effort Models’ tightrope hypothesis in simultaneous interpreting - A Contribution. Hermes, 23, 153-172. https://doi.org/10.7146/hjlcb.v12i23.25553
    Gile, D. (2020). 2020 update of the Effort Models and Gravitational Model. https://doi.org/10.13140/RG.2.2.24895.94889
    Gile, D. (2023). The Effort Models and Gravitational Model: Clarifications and update. https://doi.org/10.13140/RG.2.2.20178.43209
    Guo, M., Han, L., & Anacleto, M. (2022). Computer-assisted interpreting tools: Status quo and future trends. Theory and Practice in Language Studies, 13, 89-99. https://doi.org/10.17507/tpls.1301.11
    Lamberger-Felber, H. (2001). Text-oriented research into interpreting - Examples from a case-study. HERMES - Journal of Language and Communication in Business, 14(26), 39-64. https://doi.org/10.7146/hjlcb.v14i26.25638
    Lambert, S. (2004). Shared attention during sight translation, sight interpretation and simultaneous interpretation. Meta, 49(2), 294-306. https://doi.org/10.7202/009352ar
    Liu, M., Schallert, D., & Carroll, P. (2004). Working memory and expertise in simultaneous interpreting. Interpreting, 6, 19-42. https://doi.org/10.1075/intp.6.1.04liu
    Loomans, N. D. P. (2021). Error analysis in automatic speech recognition and machine translation [Master's thesis, Universidade de Lisboa].
    Ma, X., & Cheung, A. (2020). Language interference in English-Chinese simultaneous interpreting with and without text. Babel. Revue internationale de la traduction / International Journal of Translation, 66, 434-456. https://doi.org/10.1075/babel.00168.che
    Meuleman, C., & Van Besien, F. (2009). Coping with extreme speech conditions in simultaneous interpreting. Interpreting, 11, 20-34. https://doi.org/10.1075/intp.11.1.03meu
    Moser-Mercer, B. (2000). Simultaneous interpreting: Cognitive potential and limitations. Interpreting, 5, 83-94. https://doi.org/10.1075/intp.5.2.03mos
    Pisani, E., & Fantinuoli, C. (2021). Measuring the impact of automatic speech recognition on number rendition in simultaneous interpreting. In (pp. 181-197). https://doi.org/10.4324/9781003017400-14
    Prandi, B. (2023). Computer-assisted simultaneous interpreting: A cognitive-experimental study on terminology. Language Science Press.
    Schmid, C., Büchel, C., & Rose, M. (2011). The neural basis of visual dominance in the context of audio-visual object processing. NeuroImage, 55, 304-311. https://doi.org/10.1016/j.neuroimage.2010.11.051
    Seeber, K. (2011). Cognitive load in simultaneous interpreting: Existing theories - New models. Interpreting, 13, 176-204. https://doi.org/10.1075/intp.13.2.02see
    Seeber, K. (2017). Multimodal processing in simultaneous interpreting. In (pp. 461-475). https://doi.org/10.1002/9781119241485.ch25
    Seeber, K. G., Keller, L., & Hervais-Adelman, A. (2020). When the ear leads the eye–the use of text during simultaneous interpretation. Language, Cognition and Neuroscience, 35(10), 1480-1494.
    Soergel, D. (2005). Typology of errors in ASR transcription of oral history interviews. https://doi.org/10.13140/RG.2.1.3674.4562
    Wallinheimo, A. S., Evans, S. L., & Davitti, E. (2023). Training in new forms of human-AI interaction improves complex working memory and switching skills of language professionals. Frontiers in Artificial Intelligence, 6, 1253940. https://doi.org/10.3389/frai.2023.1253940
    Yuan, L., & Wang, B. (2023). Cognitive processing of the extra visual layer of live captioning in simultaneous interpreting. Triangulation of eye-tracked process and performance data. Ampersand, 11. https://doi.org/10.1016/j.amper.2023.100131
