Researcher: 廖文偉 (Wen-Wei Liao)
Thesis Title: 應用虛擬題庫理論-電腦化方塊計數測驗之實作 (Designing a Computer Cube Enumeration Test System Based on Virtual Item Bank Theory)
Advisor: 何榮桂 (Ho, Rong-Guey)
Degree: Doctoral
Department: Graduate Institute of Information and Computer Education
Publication Year: 2013
Graduating Academic Year: 101
Language: English
Pages: 115
Chinese Keywords: AIG, CBT, CTT, VIB, spatial ability (空間能力), cube enumeration (方塊計數)
English Keywords: Spatial Ability, Cube Enumeration
Thesis Type: Academic thesis
Usage: 170 views, 5 downloads
    The cube is the basic unit of volume and is therefore commonly used to introduce the concept of volume. In assessment, cube enumeration tests are often used to measure or promote the ability to form and manipulate mental images of three-dimensional geometric objects. Such tests are usually kept confidential and unpublished, so the general public's understanding of them is limited to the exercise books sold in ordinary bookstores; the main reason for withholding them is to maintain test security. This study combined Classical Test Theory (CTT), computer-based testing (CBT), Automatic Item Generation (AIG), Virtual Item Bank (VIB) theory, and the literature on cube enumeration to build a virtual item bank for cube enumeration and a cube enumeration learning system. The virtual item bank contains no actual items; it holds only objects and a method for generating items. During a test, the items administered to an examinee are generated directly from those objects in accordance with the test theory, so the bank has no item exposure problem (a minimal sketch of this idea follows the findings below). The subjects of this study were 267 sixth-grade elementary school students. The study's findings were as follows:

    1. Allowing items to be rotated lowers item difficulty, but rotatable items are helpful for teaching spatial ability.
    2. Both the integrity of the cube arrangement and the number of hidden cubes affect item difficulty.
    3. The test results show no significant difference in spatial visualization ability between preadolescent boys and girls.
    4. The computerized, VIB-based cube enumeration test can likewise measure examinees' spatial visualization ability.
    5. Learning on the cube enumeration platform also helps improve examinees' spatial ability.
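
    The abstract above notes that the virtual item bank stores no items, only objects and a generation method, so each examinee's items are produced at test time. Below is a minimal Python sketch of that idea, assuming a simple "height map" representation of a cube arrangement; every name here is hypothetical and not the thesis's actual implementation:

        import random

        def generate_item(width=3, depth=3, max_height=3, rng=random):
            """Draw a fresh cube arrangement: each cell of a depth x width
            base grid holds a stack of 0..max_height unit cubes."""
            return [[rng.randint(0, max_height) for _ in range(width)]
                    for _ in range(depth)]

        def answer_key(heights):
            """The keyed answer for a cube enumeration item: total cube count."""
            return sum(h for row in heights for h in row)

        if __name__ == "__main__":
            # Every administration draws a new arrangement, so no fixed
            # item exists that could be exposed.
            item = generate_item()
            print(item, "->", answer_key(item))

    Under this scheme the "bank" is the generator itself, which is why item exposure rates are not a concern.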

    The cube is the basic unit of volume and hence is often used to introduce the concept of volume. In testing, cube enumeration is often used to measure or promote the formation and manipulation of mental images of 3-D geometric objects. Cube enumeration tests are usually not open to the public, whose understanding of them is limited to the exercise books found in general bookstores; the main reason for this limitation is to maintain test security. This study combined Classical Test Theory (CTT), Computer-Based Testing (CBT), Automatic Item Generation (AIG), Virtual Item Bank (VIB) theory, and theories related to cube enumeration to construct a VIB-based cube enumeration test and a learning system. The item bank of the VIB-based test was a virtual one: it contained only the basic element, the cube. From these cubes, items were generated directly at test time, so the item bank had no item exposure problem. A total of 267 sixth graders from a New Taipei City elementary school participated in the experiment; their mid-term mathematics grades were used as the external criterion for the test.
    This study found that:
    1. Both the number of invisible cubes and the integrity of the cube arrangement were positively correlated with item difficulty (see the sketch after this list).
    2. In addition to these two factors, whether test items could be rotated also influenced item difficulty.
    3. Based on the results of both tests, there was no significant difference in spatial visualization ability between male and female preadolescent students.
    4. The scores on the computerized cube enumeration test indicated that the system can measure examinees' spatial visualization ability.
    5. Using the cube enumeration learning system did improve users' spatial ability.
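
    Findings 1 and 2 tie item difficulty to the number of invisible cubes and the integrity of the arrangement. Continuing the hypothetical height-map sketch above, a simplified count of cubes hidden from a standard corner view (top, front, and right faces shown) could serve as a difficulty proxy; the visibility rule below is an illustrative assumption, not the thesis's model:

        def count_hidden_cubes(heights):
            """Count cubes with no face visible from a corner view in which
            the top, front, and right faces of the arrangement are shown."""
            depth, width = len(heights), len(heights[0])
            hidden = 0
            for r in range(depth):          # r grows toward the viewer (front)
                for c in range(width):      # c grows to the right
                    for level in range(1, heights[r][c] + 1):
                        covered_top = level < heights[r][c]
                        blocked_front = r + 1 < depth and heights[r + 1][c] >= level
                        blocked_right = c + 1 < width and heights[r][c + 1] >= level
                        if covered_top and blocked_front and blocked_right:
                            hidden += 1
            return hidden

        # The two buried cubes in the back-left stack are the hidden ones.
        print(count_hidden_cubes([[3, 3], [3, 1]]))  # -> 2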

    Chapter 1. Introduction
        1.1. Background and Motivation
        1.2. Purposes
        1.3. Research Questions
    Chapter 2. Literature Review
        2.1. Classical Test Theory (CTT)
            2.1.1. Item Difficulty
            2.1.2. Item Discrimination
        2.2. Item Response Theory
        2.3. Computer-Based Testing
        2.4. Automatic Item Generation (AIG)
        2.5. Virtual Item Bank Theory
        2.6. Cube Enumeration
    Chapter 3. Methods
        3.1. Definition
        3.2. Experimental Design
        3.3. Procedures
        3.4. Subjects
        3.5. Research Hypotheses
        3.6. Research Tools
    Chapter 4. Results
        4.1. Results of Stage 1: Constructing the CAT and analyzing the results
        4.2. Results of Stage 2: Constructing the CBT and analyzing the results
        4.3. Results of Stage 3: The performance of the learning system
        4.4. Analyzing the security of the VIB
        4.5. Discussion
    Chapter 5. Conclusion and Suggestion
        5.1. Conclusion
        5.2. Suggestion
        5.3. Limitation

