Language Assessment at a Thai University: A CEFR-Based Test of English Proficiency Development

Budi Waluyo
Ali Zahabi
Luksika Ruangsung

Abstract

The increasing popularity of the Common European Framework of Reference (CEFR) in non-native English-speaking countries has generated demand for concrete examples of how to create CEFR-based tests that assess the four main English skills. In response, this study provides insight into the development and validation of a CEFR-based test designed to evaluate undergraduate students’ English proficiency for placement and exit purposes. The CEFR served as the framework for item development, while Classical Test Theory (CTT) informed the test evaluation process. A sample of 2,248 first-year students participated in Testing 1, and 3,655 first- and second-year students took part in Testing 2. The analysis of the multiple-choice listening and reading tests indicated favorable item difficulty and discrimination indices, as well as high reliability coefficients obtained from Cronbach’s alpha, Kuder-Richardson, and split-half estimates. Correlation and regression analyses revealed close relationships between the subtests and between each subtest and the total score, supporting the test’s criterion validity. The test also demonstrated significant predictive validity for TOEIC scores. The findings offer implications for the development of university-level English proficiency tests that integrate CEFR levels with CTT analysis.
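To make the evaluation procedure concrete, the sketch below illustrates the CTT statistics named in the abstract: item difficulty, item discrimination, Kuder-Richardson reliability (equivalent to Cronbach’s alpha for dichotomous items), and split-half reliability. It is a minimal Python illustration using simulated dichotomous item scores; the function names and data are assumptions for demonstration only, not the authors’ analysis code or dataset.

```python
# Minimal sketch of the Classical Test Theory indices reported in the abstract.
# All data are simulated; real analyses would use examinees' 0/1 item scores
# from the multiple-choice listening and reading subtests.

import numpy as np


def item_difficulty(scores: np.ndarray) -> np.ndarray:
    """Proportion of examinees answering each item correctly (p-value)."""
    return scores.mean(axis=0)


def item_discrimination(scores: np.ndarray) -> np.ndarray:
    """Point-biserial correlation between each item and the rest of the test."""
    total = scores.sum(axis=1)
    disc = []
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]  # total score excluding the item itself
        disc.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return np.array(disc)


def kr20(scores: np.ndarray) -> float:
    """Kuder-Richardson 20; equals Cronbach's alpha for dichotomous items."""
    k = scores.shape[1]
    p = scores.mean(axis=0)
    var_total = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - (p * (1 - p)).sum() / var_total)


def split_half(scores: np.ndarray) -> float:
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    odd = scores[:, 0::2].sum(axis=1)
    even = scores[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = (rng.random((200, 40)) < 0.6).astype(int)  # 200 examinees, 40 items
    print("difficulty:", item_difficulty(data)[:5])
    print("discrimination:", item_discrimination(data)[:5])
    print("KR-20:", round(kr20(data), 3))
    print("split-half:", round(split_half(data), 3))
```

With real responses, these indices would be computed separately for the listening and reading subtests before examining the correlations among subtests and with the total score reported in the study.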

Article Details

How to Cite
Waluyo, B., Zahabi, A., & Ruangsung, L. (2024). Language Assessment at a Thai University: A CEFR-Based Test of English Proficiency Development. REFLections, 31(1), 25–47. https://doi.org/10.61508/refl.v31i1.270418
Section: Research articles
