Graduate student: Fan-ping Tseng (曾繁萍)
Thesis title: Exploring Teachers' Test-constructing Processes and Students' Test-taking Processes (教師命題過程與學生答題過程研究)
Advisor: Yuh-Show Cheng (程玉秀)
Degree: Doctoral
Department: Department of English
Year of publication: 2014
Graduation academic year: 102
Language: English
Number of pages: 274
Keywords: test construction, test-constructing process, test-taking process, strategy use, vocabulary test, cloze test, reading comprehension test, think-aloud
Thesis type: Academic thesis
This study addresses three research questions. First, how do senior high school English teachers construct a mock test for the Scholastic Ability English Test (SAET), and how do experienced and novice teachers differ in their test-constructing considerations? Second, how do senior high school students answer the items on such a mock test, and how do higher- and lower-proficiency students differ in their test-taking strategies? Third, are students' considerations in answering the items consistent with teachers' considerations in constructing them?
Four senior high school English teachers and forty-eight senior high school students participated in the study. The teachers' task was to construct a mock SAET consisting of twenty-eight multiple-choice items on vocabulary, cloze, and reading comprehension; the students' task was to answer the items the teachers had constructed. All participants thought aloud while performing their tasks, and these think-aloud protocols served as the primary data for analysis.
The major findings are as follows. First, the experienced and novice teachers differed somewhat in their test-constructing considerations: the experienced teachers' considerations were more student-centered, whereas the novice teachers' considerations conformed more closely to test-construction principles. Moreover, the items constructed by the experienced teachers were not superior to those constructed by the novice teachers; among the items produced by all four teachers, a considerable number were judged by experts to be flawed or inappropriate and in need of revision.
Second, the students generally adopted different strategies for different item types, but they used elimination across all three item types, making it the most frequently used test-taking strategy in this study. In addition, higher-proficiency students drew on vocabulary and grammar knowledge and deductive reasoning more often than lower-proficiency students, whereas lower-proficiency students resorted to guessing more often than higher-proficiency students across all item types.
The study also found that students' test-taking considerations differed considerably from teachers' test-constructing considerations, with a consistency rate of only 33%. Students' thinking was more congruent with that of the novice teachers than with that of the experienced teachers. Higher-proficiency students' considerations diverged from the teachers' most on the cloze items, whereas lower-proficiency students' considerations diverged most on the reading comprehension items.
This study investigated three research questions. First, how did experienced and novice teachers construct mock tests for the Scholastic Ability English Test (SAET)? Second, how did higher- and lower-proficiency students take those mock tests? Third, were students' considerations in answering the tests consistent with the teachers' test-constructing considerations?
Four senior high school teachers and forty-eight senior high school students participated in this study. The teachers were asked to construct a mock test of twenty-eight multiple-choice items on vocabulary, cloze, and reading comprehension; the students were asked to answer the questions the teachers had constructed. All participants thought aloud while performing their tasks.
Major findings of this study are summarized as follows. First, the experienced and novice teachers appeared to weigh different types of considerations in constructing their tests: the experienced teachers took more student-oriented factors into account, whereas the novice teachers attended more to test-construction principles. Despite these differences in their test-constructing processes, the two experienced teachers did not appear to produce better test items than the two novice teachers; all four teachers constructed some items that expert reviewers deemed poor, problematic, or inappropriate.
Second, the students generally used different strategies when answering different types of questions, yet they used the strategy of elimination frequently across all three item types. As for proficiency, higher-proficiency students tended to draw on vocabulary knowledge, grammar knowledge, and deductive reasoning more often than lower-proficiency students in answering the items, whereas lower-proficiency students tended to resort to guessing more often than higher-proficiency students across all three item types.
Third, students' considerations in answering test items diverged considerably from teachers' test-constructing considerations; the overall consistency rate between them was only about 33% in this study. Moreover, students generally thought in a way more congruent with the novice teachers than with the experienced teachers. Higher-proficiency students' considerations diverged from the teachers' most on the cloze items, whereas lower-proficiency students' considerations diverged most on the reading comprehension questions.
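The abstract does not spell out how the 33% consistency rate was computed. As a minimal illustrative sketch only (not the thesis's documented procedure), one plausible reading is item-level agreement between the coded consideration category a teacher reported when constructing an item and the category a student reported when answering it; the function name, category labels, and data below are all hypothetical.

```python
# Minimal sketch of one way a student-teacher consistency rate could be
# computed. This is an assumption for illustration, not the thesis's
# actual coding scheme; all labels and data below are hypothetical.

def consistency_rate(teacher_considerations, student_considerations):
    """Share of items where the student's main consideration matches the
    teacher's test-constructing consideration for that same item."""
    assert len(teacher_considerations) == len(student_considerations)
    matches = sum(
        t == s for t, s in zip(teacher_considerations, student_considerations)
    )
    return matches / len(teacher_considerations)

# Hypothetical coded considerations for six items.
teacher = ["collocation", "grammar", "discourse",   "inference", "vocabulary", "grammar"]
student = ["guessing",    "grammar", "elimination", "inference", "guessing",   "elimination"]

print(f"Consistency rate: {consistency_rate(teacher, student):.0%}")  # prints 33%
```

Under this reading, a rate of about 33% would mean that on roughly one item in three, the student's reasoning touched the same consideration the teacher had in mind when writing the item.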