Graduate student: | 陳建宇 Chen, Chien-Yu |
---|---|
Thesis title: | 英文詞彙測驗允許留白的試題反應模型之建構與檢測 Psychometric Models with Leaving Blanks in Vocabulary Levels Test |
Advisor: | 程代勒 Cheng, Tai-Le |
Degree: | Master |
Department: | Department of Mathematics |
Year of publication: | 2018 |
Academic year of graduation: | 106 |
Language: | English |
Number of pages: | 38 |
Keywords (Chinese): | 英文詞彙測驗、空白、有限訊息適配度檢定法 |
Keywords (English): | Vocabulary levels test, Leaving blank, limited-information goodness-of-fit test |
DOI URL: | http://doi.org/10.6345/THE.NTNU.DM.002.2018.B01 |
Thesis type: | Academic thesis |
Usage: | Views: 197, Downloads: 8 |
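The Vocabulary Levels Test (VLT) is a test with 10 clusters, each consisting of 3 items and 6 options. Recently, an item response model that can handle the dependence within a cluster (the VLT-sequence model) was proposed; its idea is that a student's responses within a cluster are affected by the order in which the items are answered. For blanks, however, this model treats a blank as an incorrect response and randomly assigns it an incorrect option. Under the model's assumptions, this reduces the number of options available to the items answered later in the cluster, which is clearly unreasonable. This study therefore develops a model that can handle this issue by assuming a blank parameter for each item. Simulations confirm the validity of maximum likelihood estimation and of the limited-information goodness-of-fit test, and show that the original model does not perform badly when fitted to data generated from the new model. In the analysis of real data (the 3000-word level), the new model fails the goodness-of-fit test, but the problem may lie in dividing the non-correct responses into two kinds, choosing a wrong option and leaving a blank. The two parts of the new model were therefore merged and compared with the original model and with the two-parameter item response model, which ignores dependence; the original model still performed better. In addition, some examinees with an overall correct rate above 70% left at least one cluster blank, which suggests that assuming a single parameter for each item is not sufficient to handle such complex blank behavior.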
The Vocabulary Levels Test (VLT) is an English test with ten clusters, each containing three items and six options, designed to check students' vocabulary level. A VLT-sequence model (VSM) has been proposed to handle the dependence among items within each cluster by considering the order in which examinees respond within the cluster. However, the VSM treats blanks the same as incorrect responses, ignoring the fact that, unlike incorrect responses, blanks do not change the set of options remaining within a cluster. In this thesis, we distinguish leaving blanks from giving incorrect responses and propose the VLT-sequence model with leaving blanks (VSMB), which adds a blank parameter to each item. Through simulations, we verify the validity of marginal maximum likelihood estimation for the proposed VSMB and of the limited-information method for checking model fit. The simulation results suggest that the VSM fits data generated from the VSMB reasonably well, with acceptable estimation errors and low rejection proportions. In analyzing the 3000-level VLT data, we also find that the VSM appears to provide the most parsimonious fit compared with the proposed VSMB and the two-parameter item response model. Moreover, the empirical analysis reveals that the single blank parameter in the VSMB does not seem to capture the mechanism for leaving blanks well: some examinees with more than 70% correct responses on the test leave at least one whole cluster of easier items blank. That is, other factors such as motivation might play a role too.
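The difference between treating a blank as an incorrect response and treating it as a blank can be illustrated by counting the options still open to each item in a cluster. The sketch below is only an illustration under stated assumptions, not the thesis's model: it assumes that each chosen option is removed from the options available to items answered later, and the function name `options_available` and the flag `blank_uses_option` are hypothetical.

```python
from typing import List, Optional

N_OPTIONS = 6  # options per cluster, as stated in the abstract


def options_available(responses: List[Optional[int]],
                      blank_uses_option: bool) -> List[int]:
    """Count the options open to each item, in the order the items are answered.

    responses: chosen option index per item in response order; None marks a blank.
    blank_uses_option: if True, a blank is handled like an incorrect response
        that was assigned some option (the treatment the abstract criticizes);
        if False, a blank leaves the remaining options untouched.
    Assumption for illustration: every chosen option is removed from the
    options available to later items in the same cluster.
    """
    available = N_OPTIONS
    counts = []
    for choice in responses:
        counts.append(available)
        if choice is not None or blank_uses_option:
            available -= 1
    return counts


# One cluster of three items: the first-answered item is left blank.
responses = [None, 2, 5]
print(options_available(responses, blank_uses_option=True))   # [6, 5, 4]
print(options_available(responses, blank_uses_option=False))  # [6, 6, 5]
```

With the blank counted as an assigned incorrect option, the two later items see five and four options; with the blank left as a blank, they see six and five. This gap in the number of remaining options is what the blank parameter is meant to address.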