Graduate student: 黃鈺真 (Yu-Chen Huang)
Thesis title: 臺灣與大陸英語學習者語料庫之介詞錯誤研究
A Study on Prepositional Errors in Taiwanese and Chinese Learners' English Corpora
Advisor: 陳浩然 (Chen, Hao-Jan)
Degree: Master's
Department: 英語學系 (Department of English)
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: English
Number of pages: 97
Chinese keywords: 介詞, 錯誤, 學習者語料庫, 半自動資料抽取
English keywords: preposition, error, learner corpus, semi-automatic data extraction
Thesis type: Academic thesis
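    This study aims to identify common prepositional errors in Taiwanese and Chinese learners' English corpora, to compare the two groups' errors for similarities, and to classify the errors. The researcher focused on prepositional errors occurring in three combinations (verb + preposition, preposition + noun, and adjective + preposition) and used a semi-automatic method to extract them. Two learner corpora were studied: a Taiwanese learner corpus of over 1.8 million words and a Chinese learner corpus of over 3.4 million words. The native-speaker corpora used as the baseline were the British National Corpus (BNC) and the New York Times corpus (NYT). The researcher first listed all possible patterns of the three target combinations and used Monoconc Pro to retrieve them from the learner corpora and the reference corpora. A Perl program then compared the two sets of extracted data to filter out suspicious learner combinations, which the researcher finally inspected manually.
    The results show 282 common misuses and 162 common extraneous uses among Taiwanese learners, and 1,070 common misuses and 139 common extraneous uses among Chinese learners. The errors of the two groups were very similar overall: more than half of the misuses and extraneous uses made by Taiwanese learners also appeared in the Chinese learner corpus, and four of the top five misuses were the same on both sides, namely *in campus, *in (the) Internet, *by…way and *in the other hand. After classifying all the misuses, the researcher found that three of the five most error-prone categories were shared by both groups: Space, Abstract Space and Manner. The researcher suggests that the errors can largely be attributed to differences between the learners' L1 and the target language and to unfamiliarity with the rules of the target language.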

    This study investigates common English prepositional errors in a Taiwanese learners’ corpus and a Chinese learners’ corpus. Few previous error-analysis studies have discussed learners’ common prepositional errors comprehensively and in detail. Moreover, most researchers identified errors manually, which is laborious and time-consuming. To address these limitations, this study adopted a semi-automatic method to extract prepositional errors. The researcher focused on prepositional errors occurring in three combinations: V + Prep., Prep. + N and Adj. + Prep. The errors found in the two learner corpora were compared for similarities. The researcher also classified all the errors to see which aspects of preposition use learners find difficult.
    The research material comprised a 1.8-million-word Taiwanese learner corpus and a 3.4-million-word Chinese learner corpus. Two native-speaker corpora, the British National Corpus (BNC) and the New York Times corpus (NYT), served as reference corpora. The researcher first listed all possible patterns of the three target combinations and entered them into the concordancing program Monoconc Pro to retrieve prepositional instances from the learner corpora and the reference corpora. A Perl program was then used to compare the extracted data and capture instances that appeared only in the learner corpora. Finally, the researcher manually inspected these suspicious instances and judged whether they contained real prepositional errors.
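    The comparison step can be illustrated with a minimal sketch. It is written in Python rather than the Perl the thesis used and is not the author's program. The function name `suspicious_combinations`, the `min_freq` threshold and the toy word lists are all hypothetical; the record does not say how "common" was defined or what the extracted data looked like.

```python
from collections import Counter


def suspicious_combinations(learner_hits, reference_hits, min_freq=1):
    """Return learner combinations that never occur in the reference data.

    learner_hits / reference_hits: iterables of extracted strings such as
    "in campus" (hypothetical format; one string per concordance hit).
    min_freq: hypothetical cut-off for how often a combination must occur
    in the learner data before it is passed on for manual inspection.
    """
    # Combinations attested in the native-speaker reference corpora.
    attested = {hit.lower() for hit in reference_hits}
    # Frequency of each combination in the learner corpus.
    counts = Counter(hit.lower() for hit in learner_hits)
    # Keep only combinations absent from the reference data.
    candidates = [
        (combo, freq)
        for combo, freq in counts.items()
        if combo not in attested and freq >= min_freq
    ]
    # Most frequent first, ties broken alphabetically.
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Toy data invented for illustration only.
learner = ["in campus", "on campus", "in campus",
           "discuss about", "on the other hand", "in the other hand"]
reference = ["on campus", "on the other hand", "in the house"]

for combo, freq in suspicious_combinations(learner, reference):
    print(f"{freq}\t{combo}")
```

    The output is only a candidate list. As in the study, each candidate still needs manual inspection, because a combination missing from the reference corpora is not necessarily an error.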
    In total, 282 tokens of common misuses and 162 tokens of common extraneous uses were found in the Taiwanese learner corpus (TW corpus), and 1,070 tokens of common misuses and 139 tokens of common extraneous uses were found in the Chinese learner corpus (CN corpus). The two groups of learners shared many similarities in their errors. Over half of the misuses and extraneous uses found in the TW corpus also appeared in the CN corpus. Four of the top five misuses were shared by the two learner corpora, namely *in campus, *in (the) Internet, *by…way and *in the other hand. When all the misuses were classified, three of the top five misuse categories in each learner corpus proved to be the same: space, abstract space and manner. The researcher suggests that the discovered errors can largely be attributed to interlingual factors and ignorance of rule restrictions.
    The prepositional errors discovered in this study can be used to improve automatic writing-error detection systems, particularly their checking of prepositional errors. It is also hoped that these errors will give English teachers insight into the common prepositional errors made by learners whose L1 is Chinese, and serve as a useful reference for both teachers and learners.

    CHINESE ABSTRACT
    ABSTRACT
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER ONE INTRODUCTION
        1.1 General Background
        1.2 Problems of Previous Research
        1.3 Purpose of the Study
        1.4 Research Questions
        1.5 Definition of Key Terms
        1.6 Organization of the Thesis
    CHAPTER TWO LITERATURE REVIEW
        2.1 Previous Studies on Learners’ Use of Prepositions
            2.1.1 The Underuse and Overuse of Prepositions
            2.1.2 The Misuse of Prepositions
        2.2 Corpora and Corpus-based Research
    CHAPTER THREE METHODOLOGY
        3.1 The Corpora Used in this Study
        3.2 Tools
        3.3 Data Extraction
            3.3.1 The Patterns of V-Pp, Pp-N and Adj-Pp
            3.3.2 The Semi-automatic Data Extraction
        3.4 Data Analysis
    CHAPTER FOUR RESULTS & DISCUSSION
        4.1 Results
            4.1.1 The Common Prepositional Errors Found in Learner Corpora
                The common prepositional errors found in the TW corpus
                The common prepositional errors found in the CN corpus
            4.1.2 Similarities between Prepositional Errors Made by TW and CN Learners
            4.1.3 Classification of Errors Found in Two Learner Corpora
        4.2 Discussion
            4.2.1 The Common Prepositional Errors Found in Learner Corpora
            4.2.2 Similarities between Prepositional Errors Made by TW and CN Learners
            4.2.3 Classification of Errors Found in Two Learner Corpora
    CHAPTER FIVE CONCLUSION
        5.1 Conclusion
        5.2 Implications
        5.3 Limitations of the Study
    REFERENCES
    APPENDIXES
        Appendix A The Penn-Treebank Tagset
        Appendix B Preposition Confusion in the TW corpus
        Appendix C Preposition Confusion in the CN corpus
        Appendix D The Possible Causes of Errors
        Appendix E A Digest of Errors with Original Context

