
Graduate Student: 游兆賢 (Yu, Chao-Hsien)
Thesis Title: 視聽落差對學生口譯員的影響:以自動語音辨識輔助之同步口譯英譯中為例
The Effects of Audio-visual Input Discrepancies on Student Interpreters in ASR-assisted Simultaneous Interpreting from English into Chinese
Advisor: 陳子瑋 (Chen, Tze-Wei)
Oral Defense Committee: 陳子瑋 (Chen, Tze-Wei); 汝明麗 (Ju, Ming-Li); 張梵 (Chang, Albert L.)
Oral Defense Date: 2024/07/23
Degree: Master
Department: Graduate Institute of Translation and Interpretation (翻譯研究所)
Year of Publication: 2024
Graduation Academic Year: 112 (2023-2024)
Language: English
Number of Pages: 97
Chinese Keywords: 自動語音辨識、同步口譯、錯誤 ASR 字幕、Colavita 視覺主導效應
English Keywords: automatic speech recognition, simultaneous interpreting, inaccurate ASR captions, Colavita visual dominance effect
Research Methods: experimental design, semi-structured interviews
DOI URL: http://doi.org/10.6345/NTNU202401106
Thesis Type: Academic thesis
    This study adopted a mixed-methods approach to examine how inaccurate captions affect student interpreters’ performance in automatic speech recognition (ASR)-assisted simultaneous interpreting. The participants were 14 interpreting students enrolled in graduate institutes of translation and interpretation in Taiwan. Each participant performed two English-to-Chinese simultaneous interpreting tasks, both supported by ASR captions: in one task the captions were entirely accurate, while in the other they contained errors. Each speech included ten checkpoints against which interpreting performance was scored for quantitative analysis. Immediately after each interpreting task, a stimulated recall interview was conducted to examine the errors in participants’ renditions caused by inaccurate captions and the ways participants interacted with the ASR captions. The interview responses were coded inductively to explore the reasons behind these caption-induced errors, including strategic choices and the Colavita visual dominance effect.
    The results showed that inaccurate captions negatively affected student interpreters’ performance, leading to a higher frequency of errors. The interview data further revealed that the errors induced by inaccurate captions did not stem from a single factor but from the interplay of strategic choices and the Colavita visual dominance effect. These findings have practical value for both interpreters and interpreting instructors and can serve as a reference for the application and teaching of ASR-assisted simultaneous interpreting.

    This study investigated how inaccurate captions generated by automatic speech recognition (ASR) technology affect student interpreters’ performance in simultaneous interpreting (SI) from English into Chinese. A mixed-methods approach was employed, involving fourteen graduate-level student interpreters. Each participant completed two SI tasks: one with accurate captions and the other with mistranscribed ones. Following each task, a stimulated recall interview explored the reasons behind errors caused by inaccurate captions and how participants interacted with the ASR tool. The interpreting tasks included 10 checkpoints against which participants’ output was scored for quantitative analysis. Interview responses were then coded inductively, focusing on potential reasons for the errors, such as strategic choices and the Colavita visual dominance effect.
    The results indicated that inaccurate captions had a negative impact on interpreting performance, leading to an increase in imported errors. Interview responses suggested a more nuanced picture: rather than acting as independent factors, participants’ strategic choices and the Colavita visual dominance effect appeared to be causally linked in producing the errors in question. These findings offer valuable insights for both interpreters and instructors, informing the application and pedagogy of ASR-assisted simultaneous interpreting.
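    The quantitative design summarized above (scoring each rendition against ten checkpoints per condition, then comparing the accurate-caption and inaccurate-caption conditions) can be illustrated with a small sketch. The thesis record does not publish raw scores or name its statistical test, so everything below is assumed for demonstration: the per-participant scores are invented, and the paired Wilcoxon signed-rank test is only one plausible way to run the comparison.

# Illustrative sketch only: the scores are hypothetical and the choice of a
# paired Wilcoxon signed-rank test is an assumption, not the procedure used in the thesis.
from scipy.stats import wilcoxon

# Hypothetical number of correctly rendered checkpoints (out of 10) for each of
# the 14 participants, one value per caption condition.
accurate_caption_scores = [9, 8, 10, 7, 9, 8, 9, 10, 8, 7, 9, 8, 9, 8]
inaccurate_caption_scores = [7, 6, 8, 5, 7, 6, 8, 9, 6, 5, 7, 6, 7, 6]

# Paired, non-parametric comparison across participants.
statistic, p_value = wilcoxon(accurate_caption_scores, inaccurate_caption_scores)
print(f"Wilcoxon W = {statistic:.1f}, p = {p_value:.4f}")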

    Acknowledgements i
    Chinese Abstract ii
    Abstract iii
    Chapter 1 Introduction 1
    Chapter 2 Literature Review 6
    2.1 Simultaneous Interpreting 6
    2.1.1 Simultaneous Interpreting as a Cognitive Task 6
    2.1.2 Problem Triggers 9
    2.2 Multi-modal Processing 11
    2.2.1 Benefits and Drawbacks 12
    2.3 Computer-assisted Interpreting (CAI) 13
    2.3.1 Automatic Speech Recognition in SI 14
    2.3.2 Audio-visual Discrepancies 17
    Chapter 3 Methods 21
    3.1 Participants 21
    3.2 Stimuli 22
    3.2.1 Designed Mistranscriptions 24
    3.2.2 Software 26
    3.3 Pilot Study 27
    3.4 Procedure and Data Collection 28
    3.4.1 Experiment Setup & Instruction 28
    3.4.2 Interpreting Tasks 29
    3.4.3 Stimulated Recall Interview 29
    3.5 Data Analysis 32
    Chapter 4 Results 34
    4.1 Quantitative Analysis 34
    4.1.1 ASR-generated Captions 35
    4.1.2 Scoring Criteria 36
    4.1.4 Inter-group and Intra-group Analyses 43
    4.2 Qualitative Analysis 47
    4.2.1 Interview Responses 47
    4.2.2 Advantages and Disadvantages of ASR-assisted SI 48
    4.2.3 Interference of Audiovisual Discrepancies 57
    4.2.4 Reasons behind Imported Errors in ASR-assisted SI 61
    4.3 Summary 72
    Chapter 5 Discussion and Conclusion 73
    5.1 The Causality Between Strategic Choices and the Visual Dominance Effect 74
    5.2 The Effectiveness of ASR-captions as a Visual Aid in SI 76
    5.3 Findings 79
    5.4 Limitations 80
    5.5 Future Research Directions and Contribution 81
    References 84
    Appendix A: Speech Transcript 89
    Appendix B: Speech Summary & Glossary 94
    Appendix C: Experiment and Interview Guide 96
    Appendix D: Informed Consent Form 97

    Chang, C.-C., & Schallert, D. (2007). The impact of directionality on Chinese/English simultaneous interpreting. Interpreting, 9, 137-176. https://doi.org/10.1075/intp.9.2.02cha
    Cheung, A., & Li, T. (2023). Machine-aided interpreting: An experiment of automatic speech recognition in simultaneous interpreting. 104, 1-20.
    Chmiel, A., Janikowski, P., & Lijewska, A. (2020). Multimodal processing in simultaneous interpreting with text. Target. International Journal of Translation Studies, 32(1), 37-58. https://doi.org/10.1075/target.18157.chm
    Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16(2), 409-412. https://doi.org/10.3758/BF03203962
    Collard, C., & Defrancq, B. (2018). Predictors of Ear-Voice Span, a corpus-based study with special reference to sex. Perspectives, 27, 1-24. https://doi.org/10.1080/0907676X.2018.1553199
    Defrancq, B., & Fantinuoli, C. (2020). Automatic speech recognition in the booth: Assessment of system performance, interpreters' performances and interactions in the context of numbers. Target, 33. https://doi.org/10.1075/target.19166.def
    Desmet, B., Vandierendonck, M., & Defrancq, B. (2018). Simultaneous interpretation of numbers and the impact of technological support. https://doi.org/10.5281/zenodo.1493281
    Errattahi, R., El Hannani, A., & Ouahmane, H. (2018). Automatic speech recognition errors detection and correction: A review. Procedia Computer Science, 128, 32-37. https://doi.org/10.1016/j.procs.2018.03.005
    Fantinuoli, C. (2017). Speech recognition in the interpreter workstation.
    Fantinuoli, C. (2021). Conference interpreting and new technologies. In The Routledge handbook of conference interpreting (pp. 508-522). https://doi.org/10.4324/9780429297878-44
    Fantinuoli, C., & Montecchio, M. (2022). Defining maximum acceptable latency of AI-enhanced CAI tools.
    Gile, D. (1997). Conference interpreting as a cognitive management problem. In J. Danks, G. Shreve, S. Fountain, & M. McBeath (Eds.), Cognitive processes in translation and interpreting (pp. 196-214). Sage.
    Gile, D. (2009). Basic concepts and models for interpreter and translator training: Revised edition. https://doi.org/10.1075/btl.8
    Gile, D. (2017). Testing the Effort Models’ tightrope hypothesis in simultaneous interpreting - A Contribution. Hermes, 23, 153-172. https://doi.org/10.7146/hjlcb.v12i23.25553
    Gile, D. (2020). 2020 update of the Effort Models and Gravitational Model. https://doi.org/10.13140/RG.2.2.24895.94889
    Gile, D. (2023). The Effort Models and Gravitational Model: Clarifications and update. https://doi.org/10.13140/RG.2.2.20178.43209
    Guo, M., Han, L., & Anacleto, M. (2022). Computer-assisted interpreting tools: Status quo and future trends. Theory and Practice in Language Studies, 13, 89-99. https://doi.org/10.17507/tpls.1301.11
    Lamberger-Felber, H. (2001). Text-oriented research into interpreting - Examples from a case-study. HERMES - Journal of Language and Communication in Business, 14(26), 39-64. https://doi.org/10.7146/hjlcb.v14i26.25638
    Lambert, S. (2004). Shared attention during sight translation, sight interpretation and simultaneous interpretation. Meta, 49(2), 294-306. https://doi.org/10.7202/009352ar
    Liu, M., Schallert, D., & Carroll, P. (2004). Working memory and expertise in simultaneous interpreting. Interpreting, 6, 19-42. https://doi.org/10.1075/intp.6.1.04liu
    Loomans, N. D. P. (2021). Error analysis in automatic speech recognition and machine translation [Master's thesis, Universidade de Lisboa].
    Ma, X., & Cheung, A. (2020). Language interference in English-Chinese simultaneous interpreting with and without text. Babel. Revue internationale de la traduction / International Journal of Translation, 66, 434-456. https://doi.org/10.1075/babel.00168.che
    Meuleman, C., & Van Besien, F. (2009). Coping with extreme speech conditions in simultaneous interpreting. Interpreting, 11, 20-34. https://doi.org/10.1075/intp.11.1.03meu
    Moser-Mercer, B. (2000). Simultaneous interpreting: Cognitive potential and limitations. Interpreting, 5, 83-94. https://doi.org/10.1075/intp.5.2.03mos
    Pisani, E., & Fantinuoli, C. (2021). Measuring the impact of automatic speech recognition on number rendition in simultaneous interpreting. In (pp. 181-197). https://doi.org/10.4324/9781003017400-14
    Prandi, B. (2023). Computer-assisted simultaneous interpreting: A cognitive-experimental study on terminology. Language Science Press.
    Schmid, C., Büchel, C., & Rose, M. (2011). The neural basis of visual dominance in the context of audio-visual object processing. NeuroImage, 55, 304-311. https://doi.org/10.1016/j.neuroimage.2010.11.051
    Seeber, K. (2011). Cognitive load in simultaneous interpreting: Existing theories - New models. Interpreting, 13, 176-204. https://doi.org/10.1075/intp.13.2.02see
    Seeber, K. (2017). Multimodal processing in simultaneous interpreting. In (pp. 461-475). https://doi.org/10.1002/9781119241485.ch25
    Seeber, K. G., Keller, L., & Hervais-Adelman, A. (2020). When the ear leads the eye–the use of text during simultaneous interpretation. Language, Cognition and Neuroscience, 35(10), 1480-1494.
    Soergel, D. (2005). Typology of errors in ASR transcription of oral history interviews. https://doi.org/10.13140/RG.2.1.3674.4562
    Wallinheimo, A. S., Evans, S. L., & Davitti, E. (2023). Training in new forms of human-AI interaction improves complex working memory and switching skills of language professionals. Frontiers in Artificial Intelligence, 6, 1253940. https://doi.org/10.3389/frai.2023.1253940
    Yuan, L., & Wang, B. (2023). Cognitive processing of the extra visual layer of live captioning in simultaneous interpreting: Triangulation of eye-tracked process and performance data. Ampersand, 11. https://doi.org/10.1016/j.amper.2023.100131
