A Generalizability Coefficient Comparison of Thai Essay Test Scores for Phathomsuksa 3 Students Using Different Numbers of Raters, Scoring Methods and Scoring Patterns : A Case Study of Kamphaeng Phet Primary Education Service Area Office 2
Abstract
The purposes of this research were 1) to examine the magnitude of the variance components and compare the generalizability coefficients of the scores, and 2) to compare the error variance of Thai language essay test scores of grade 3 students across different numbers of raters, scoring methods, and scoring patterns. The sample, selected by multi-stage sampling, consisted of 60 grade 3 students in the 2020 academic year from schools under the Kamphaeng Phet Primary Educational Service Area Office 2. The research instrument was a 6-item Thai essay writing test. The data were analyzed using generalizability theory with the GENOVA program. The results were as follows:
1) Under all conditions, the variance component for test takers was the largest, followed by those for the test items and the raters, respectively.
2) The generalizability coefficients of the scores were compared across numbers of raters, scoring methods, and scoring patterns. 2.1) With different numbers of raters but the same scoring pattern and method, the generalizability coefficients differed with statistical significance at the 0.01 level, except under the pattern in which raters scored some items for all test takers. 2.2) With different scoring methods but the same number of raters and scoring pattern, the generalizability coefficients differed with statistical significance at the 0.01 level, except under the pattern in which raters scored some items for all test takers. 2.3) With different scoring patterns but the same number of raters, the generalizability coefficients under the analytic (discrete-element) scoring method differed with statistical significance at the 0.01 level, except when there was one rater; under the holistic (overall) scoring method, they differed with statistical significance at the 0.01 level, except when there were three raters.
3) Across numbers of raters, scoring methods, and scoring patterns, the error variance of the test scores decreased as the number of raters increased, except under the pattern in which raters scored some items for all test takers.
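To illustrate why the error variance in result 3 tends to fall as raters are added, the following is a minimal sketch of the standard relative-error and generalizability-coefficient formulas for a fully crossed persons × items × raters design. The crossed design, the function names, and the variance-component values are assumptions made for illustration only; they are not the designs or the estimates reported in this study, which were obtained with GENOVA.

```python
def relative_error_variance(var_pi, var_pr, var_pir_e, n_items, n_raters):
    """Relative error variance for a fully crossed p x i x r design.

    Each interaction component is divided by the number of conditions
    it is averaged over in the decision study.
    """
    return (var_pi / n_items
            + var_pr / n_raters
            + var_pir_e / (n_items * n_raters))


def g_coefficient(var_p, var_pi, var_pr, var_pir_e, n_items, n_raters):
    """Generalizability coefficient: person variance over person variance
    plus relative error variance."""
    error = relative_error_variance(var_pi, var_pr, var_pir_e,
                                    n_items, n_raters)
    return var_p / (var_p + error)


if __name__ == "__main__":
    # Hypothetical variance components (NOT taken from the study).
    components = dict(var_p=4.0, var_pi=1.0, var_pr=0.5, var_pir_e=1.5)
    for n_raters in (1, 2, 3):
        err = relative_error_variance(components["var_pi"],
                                      components["var_pr"],
                                      components["var_pir_e"],
                                      n_items=6, n_raters=n_raters)
        g = g_coefficient(n_items=6, n_raters=n_raters, **components)
        print(f"raters={n_raters}  error variance={err:.3f}  G={g:.3f}")
```

In this sketch every term involving raters is divided by the number of raters, so adding raters lowers the error variance and raises the coefficient. Under a pattern in which raters score only some of the items, the design is no longer fully crossed and these formulas would not apply as written.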
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Published articles are the copyright of Sakthong: Journal of Humanities and Social Sciences, Research and Development Institute, Kamphaeng Phet Rajabhat University.
Any opinions appearing in the journal are solely those of the authors; Kamphaeng Phet Rajabhat University and the editors do not necessarily agree with them.