Graduate student: | 游兆賢 Yu, Chao-Hsien |
---|---|
Thesis title: | 視聽落差對學生口譯員的影響:以自動語音辨識輔助之同步口譯英譯中為例 The Effects of Audio-visual Input Discrepancies on Student Interpreters in ASR-assisted Simultaneous Interpreting from English into Chinese |
Advisor: | 陳子瑋 Chen, Tze-Wei |
Oral examination committee: | 陳子瑋 Chen, Tze-Wei; 汝明麗 Ju, Ming-Li; 張梵 Chang, Albert L. |
Oral defense date: | 2024/07/23 |
Degree: | 碩士 Master |
Department: | 翻譯研究所 Graduate Institute of Translation and Interpretation |
Year of publication: | 2024 |
Academic year of graduation: | 112 |
Language: | English |
Number of pages: | 97 |
Keywords (Chinese): | 自動語音辨識、同步口譯、錯誤 ASR 字幕、Colavita 視覺主導效應 |
Keywords (English): | automatic speech recognition, simultaneous interpreting, inaccurate ASR captions, Colavita visual dominance effect |
Research methods: | Experimental design, semi-structured interviews |
DOI URL: | http://doi.org/10.6345/NTNU202401106 |
Thesis type: | Academic thesis |
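This study adopts a mixed-methods approach to examine how inaccurate captions affect student interpreters' performance in simultaneous interpreting assisted by automatic speech recognition (ASR). The participants were 14 interpreting students enrolled in graduate institutes of translation and interpretation in Taiwan. Each participant interpreted two speeches simultaneously from English into Chinese, both accompanied by ASR captions: the captions for one speech were accurate, while those for the other contained errors. Each speech included ten checkpoints that served as the basis for scoring interpreting performance in the quantitative analysis. A stimulated recall interview followed immediately after each task, examining the mistranslations caused by inaccurate captions and how participants interacted with the ASR captions. The interview data were coded inductively to explore the reasons behind these mistranslations, including strategic decisions and the Colavita visual dominance effect.
The results show that inaccurate captions negatively affected student interpreters' performance and increased the frequency of mistranslations. The interview data further revealed that these mistranslations did not stem from a single factor but from the interaction between strategic decisions and the Colavita visual dominance effect. The findings have practical value for both interpreters and interpreting instructors and can inform the application and teaching of ASR-assisted simultaneous interpreting.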
This study investigated how inaccurate captions generated by Automatic Speech Recognition (ASR) technology affect student interpreters’ performance in simultaneous interpreting (SI) from English into Chinese. A mixed-methods approach was employed, involving fourteen graduate-level interpreters. Each participant completed two SI tasks: one with accurate captions and another with mistranscribed ones. Following each task, a stimulated recall interview explored the reasons behind errors caused by inaccurate captions and how participants interacted with the ASR tool. The interpreting tasks included 10 checkpoints against which participants’ output was scored for quantitative analysis. Interview responses were then coded inductively to analyze qualitative data, focusing on potential reasons for errors such as strategic choices and the Colavita visual dominance effect.
The results indicated a negative impact of inaccurate captions on interpreting performance, leading to imported errors. Interview responses suggested a more nuanced picture: participants’ strategic choices and the Colavita visual dominance effect appeared to have a causal relationship in contributing to the errors in question, rather than being mutually independent. These findings offer valuable insights for both interpreters and instructors, informing the application and pedagogy of ASR-assisted simultaneous interpreting.