
Author: 陳麗如 (Li-Ju Chen)
Thesis title: An Application of Item Response Theory to Developing and Validating a Vocabulary Levels Test (應用試題反應理論發展與驗證一單字階層測驗)
Advisor: 曾文鐽 (Tseng, Wen-Ta)
Degree: Master
Department: Department of English
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar, i.e. 2010-2011)
Language: English
Number of pages: 136
Keywords: Item Response Theory, English vocabulary size test, word frequency, word difficulty
Thesis type: Academic thesis
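  • A large vocabulary is essential to mastering a second language. Because vocabulary size plays an important role in language teaching, vocabulary size tests are needed to monitor and assess learners’ progress and achievement in vocabulary learning. Although English vocabulary size tests such as the Vocabulary Levels Test and the Checklist Test are widely used, they are not well suited to English learners in Taiwan, because the word lists they draw on were compiled without regard to the learning environment and cultural background of Taiwanese students. This study therefore aims to develop an appropriate English vocabulary size test, in the form of a levels test, based on the 6,480-word reference list published by the College Entrance Examination Center, and to analyze and validate the quality of the test with the three-parameter logistic Item Response Theory model. Following construction and validation of the test, the study further explores the relationship between word frequency and latent difficulty, using the level latent difficulties estimated by the three-parameter model. Participants were 1,060 students from six senior high schools in Taipei, Taoyuan, and Changhua County, with roughly equal numbers of first-, second-, and third-year students. The results show that the reliability of the test is supported by a high empirical reliability (0.9882) and that its validity is reflected in good construct validity and hierarchical validity. In addition, fit analysis indicates that item quality is good overall. As for level latent difficulty, the six levels show a non-linear relationship: difficulty rises sharply from Level 1 to Level 3 and then increases gradually from Level 4 to Level 6.
    The findings indicate that the English vocabulary size test developed in this study can be regarded as an adequate test, and that word frequency and word difficulty do not develop in step. The study also indicates that the words in Levels 1 to 3 form the basic word list that teachers should urge students to acquire first, because Level 3 is a turning point and a key threshold on the way to the ultimate English vocabulary size goal. It is hoped that, with the establishment of the test and the investigation of level latent difficulty, educators and learners can gain insight into the vocabulary learning process and so clarify how to promote the achievement of vocabulary learning goals.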

    A substantial vocabulary is central to mastering a second language. Because vocabulary plays a vital role in language teaching, vocabulary size tests are needed to monitor learners’ progress and to assess their achievement in vocabulary learning. Although English vocabulary size tests such as the Vocabulary Levels Test and the Checklist Test are widely used, they may not be well suited to English learners in Taiwan, because their vocabulary is drawn from word frequency lists compiled without regard to the learning environment and cultural background in Taiwan. This study therefore aims to develop an appropriate measure of the English vocabulary size of learners in Taiwan, in the form of a levels test based on the 6,480-word reference list published by the College Entrance Examination Center, and to validate it with empirical evidence obtained from the three-parameter logistic Item Response Theory model (3PL IRT model). Following construction and validation of the test, the study further explores the relationship between word frequency and latent difficulty, using the level latent difficulties estimated by the 3PL IRT model. Participants were 1,060 students from six senior high schools in Taipei, Taoyuan, and Changhua County, with roughly equal numbers of first-, second-, and third-year students. Results showed that the reliability of the test was supported by its high empirical reliability (0.9892) and that its validity was reflected in good construct validity and hierarchical validity. In addition, fit analysis indicated that the test items were of good quality overall. The level latent difficulty forms a non-linear continuum across the six levels, with a sharp rise from Level 1 to Level 3 and a gradual increase from Level 3 to Level 6.
    The findings suggest that the English vocabulary size test developed in this study can be considered an adequate measure. Word frequency does not move in step with latent difficulty. Furthermore, the study shows that the words in Levels 1 to 3 form the fundamental, high-priority word list that teachers should urge senior high school students to acquire, because Level 3 is a pivotal point on the way to students’ ultimate English vocabulary size goals. With the development of this vocabulary size measure and the investigation of level latent difficulty, it is hoped that both educators and learners can gain insight into the vocabulary learning process and thus see more clearly how to reach their vocabulary learning goals.
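
    For readers unfamiliar with the 3PL IRT model named in the abstract, the following is a minimal sketch of its item response function in its standard textbook form. It is an illustration only, not the estimation procedure used in the thesis: the function name, the item parameter values, and the omission of any scaling constant are all assumptions made here.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination
    b: item difficulty
    c: pseudo-guessing parameter (lower asymptote)
    No scaling constant (such as D = 1.7) is applied; that is an assumption.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, not taken from the thesis.
item = {"a": 1.2, "b": 0.5, "c": 0.25}

for theta in (-2.0, 0.5, 3.0):
    print(theta, round(p_correct_3pl(theta, **item), 3))
```

    When ability equals the item difficulty (theta = b), the probability is c + (1 - c) / 2, halfway between the guessing floor and 1. A higher b shifts the curve to the right, which is the sense in which an item, or a level of items, is more difficult.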

    CHINESE ABSTRACT
    ABSTRACT
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER ONE INTRODUCTION
        Background and Motivation
        Research Questions
        Significance of the Study
        Definition of Vocabulary
        Organization of the Thesis
    CHAPTER TWO LITERATURE REVIEW
        Vocabulary Knowledge
        Vocabulary Size
        Vocabulary Measurement
        Measurement Theory
    CHAPTER THREE METHOD
        The CEEC Vocabulary Levels Test
        Test Administration
    CHAPTER FOUR RESULTS
        Item Analysis of the CEEC Vocabulary Levels Test
        Other Aspects of Test Quality
        Difficulty Estimates of the Six Levels
    CHAPTER FIVE DISCUSSION
        Overview of the Study
        The Test Quality
        Level Difficulty
    CHAPTER SIX CONCLUSION
        Summary of Major Findings
        Implications
        Limitations of the Study
        Suggestions for Future Research
    REFERENCES
    APPENDIX A The CEEC Vocabulary Levels Test
    APPENDIX B The Item Characteristic Curves for the 180 Items
    APPENDIX C Item Parameter Estimates and Fit Statistics of the 180 Items

