A Rasch-based Validation of the Reading Section of the Test of English for Thai Engineers and Technologists (TETET)
Abstract
This study examined the validity of the reading section of an English for Specific Purposes (ESP) test, the Test of English for Thai Engineers and Technologists (TETET), using the Rasch model in relation to Messick’s (1989, 1995) six aspects of validity. Data were collected from 179 participants at a science and technology university in Bangkok, Thailand, and analyzed with WINSTEPS Rasch software version 5.4.1. The results indicated that the reading section generally met Messick’s criteria, with strong evidence for the content, substantive, and structural aspects and preliminary evidence for the external, generalizability, and consequential aspects. To distinguish the test from general English tests, however, some items, particularly in the survival and internet reading sections, required revision to strengthen the content aspect (representativeness and technical quality) and the substantive aspect. These findings not only provide practical guidance for refining TETET and similar ESP assessments but also highlight both the strengths and the limits of Rasch analysis in addressing a comprehensive validity framework. Future research should therefore complement psychometric findings with qualitative evidence to build a more complete validity argument.
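To make the abstract’s psychometric terms concrete, the sketch below illustrates the dichotomous Rasch model and the infit/outfit mean-square fit statistics that underlie item-level evidence of the kind WINSTEPS reports. It is a minimal Python illustration on simulated data, not the authors’ analysis or the WINSTEPS estimation algorithm: the numbers of persons and items, the joint maximum likelihood routine, and all variable names are assumptions made purely for demonstration.

```python
# Minimal sketch (assumed setup, not the study's procedure): dichotomous Rasch
# model estimated by joint maximum likelihood (JMLE), followed by infit/outfit
# mean-square statistics. The response matrix is simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 179 persons x 30 dichotomous reading items (assumed sizes).
true_theta = rng.normal(0, 1, 179)           # person abilities (logits)
true_delta = np.linspace(-2, 2, 30)          # item difficulties (logits)
prob = 1 / (1 + np.exp(-(true_theta[:, None] - true_delta[None, :])))
X = (rng.random(prob.shape) < prob).astype(int)

# Drop persons with extreme scores (all right / all wrong); JMLE cannot
# give them finite measures.
keep = (X.sum(1) > 0) & (X.sum(1) < X.shape[1])
X = X[keep]

theta = np.zeros(X.shape[0])                 # person ability estimates
delta = np.zeros(X.shape[1])                 # item difficulty estimates

for _ in range(100):                         # damped Newton-Raphson iterations
    P = 1 / (1 + np.exp(-(theta[:, None] - delta[None, :])))
    W = P * (1 - P)                          # binomial variances
    theta += np.clip((X.sum(1) - P.sum(1)) / W.sum(1), -1.0, 1.0)
    delta -= np.clip((X.sum(0) - P.sum(0)) / W.sum(0), -1.0, 1.0)
    delta -= delta.mean()                    # anchor: mean item difficulty = 0

# Mean-square fit statistics per item from squared standardized residuals.
P = 1 / (1 + np.exp(-(theta[:, None] - delta[None, :])))
W = P * (1 - P)
outfit = ((X - P) ** 2 / W).mean(axis=0)               # unweighted mean square
infit = ((X - P) ** 2).sum(axis=0) / W.sum(axis=0)     # information-weighted

for i, (d, ms_in, ms_out) in enumerate(zip(delta, infit, outfit), start=1):
    print(f"Item {i:2d}: difficulty = {d:+.2f} logits, "
          f"infit MNSQ = {ms_in:.2f}, outfit MNSQ = {ms_out:.2f}")
```

In Rasch practice, mean-square values close to 1.0 indicate that an item behaves as the model expects; conventional acceptance ranges for infit and outfit are discussed in Linacre (2002) and the WINSTEPS documentation cited below.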
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
References
Alderson, J. C., & Wall, D. (1993). Does language testing work? A study of the validity of language tests. Language Testing, 10(1), 1–24.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (AERA, APA & NCME). (2014). Standards for educational and psychological testing. https://www.testingstandards.net/open-access-files.html
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford University Press.
Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford University Press.
Baghaei, P., & Amrahi, N. (2011). The effects of the number of options on the psychometric characteristics of multiple-choice items. Journal of Language Teaching and Research, 2(5), 1052–1060.
Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). Routledge.
Boone, W. J. (2016). Rasch analysis for instrument development: Why, when, and how? CBE Life Sciences Education, 15(4), 1–7. https://doi.org/10.1187/cbe.16-04-0148
Chapelle, C. A. (2012). Validity argument in language testing: Case studies of validation research. Cambridge University Press.
Chapelle, C. A., Enright, M. K., & Jamieson, J. (2010). Building a validity argument for the Test of English as a Foreign Language. Routledge.
Chapelle, C. A., & Voss, E. (2021). Validity argument in language testing: Case studies of validation research. Cambridge University Press.
Chong, J., Mokshein, S. E., & Mustapha, R. (2022). Instrument validation based on the six aspects of Messick validity framework using the Rasch rating scale model. Jurnal Penyelidikan Pendidikan, 23(1), 117–138.
Dhakal, K. R., Watson Todd, R., & Jaturapitakkul, N. (2023). Unpacking the nature of critical thinking for educational purposes. Educational Research and Evaluation, 28(4–6), 130–151.
Dhakal, K. R., Watson Todd, R., & Jaturapitakkul, N. (2024). Input as a key element in test design: A narrative of designing an innovative critical thinking assessment. rEFLections, 31(2), 516–542.
Engelhard, G., Jr., & Wind, S. A. (2017). Invariant measurement with raters and rating scales: Rasch models for rater-mediated assessments. Routledge.
ETS. (2019). Understanding TOEIC test scores. https://www.etsglobal.org/fr/en/content/understanding-toeictests-scores
ETS. (2023). TOEIC 2023 report on test takers worldwide. https://www.ets.org/pdfs/toeic/toeic-listening-readingreport-test-takers-worldwide.pdf
Ha, H. T. (2021). A Rasch-based validation of the Vietnamese version of the listening vocabulary levels test. Language Testing in Asia, 11, Article 16. https://doi.org/10.1186/s40468-021-00132-7
Jaturapitakkul, N. (2007). The effects of language ability and engineering background knowledge on ESP reading ability of Thai graduate students, their test taking strategies and attitudes towards the test [Unpublished doctoral dissertation]. Chulalongkorn University.
Jaturapitakkul, N., & Watson Todd, R. (2018). TETET technical report. https://sola.kmutt.ac.th/tetet/doc/TETET_Technical_report_NJ_RT_Complete.pdf
Kane, M. T. (2006). Validation. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). American Council on Education and Praeger.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.
Kangwanrattanakul, K. (2025). Validation of the Thai World Health Organization quality of life-OLD (WHOQOL-OLD) among Thai older adults: Rasch analysis. Scientific Reports, 15, Article 12978. https://doi.org/10.1038/s41598-025-97824-4
Lane, S., Raymond, M. R., & Haladyna, T. M. (Eds.). (2016). Handbook of test development (2nd ed.). Routledge.
Linacre, J. (n.d.). Help for Facets Rasch measurement and Rasch analysis software. Winsteps. www.winsteps.com
Linacre, J. (1994). Sample size and item calibration stability. Rasch Measurement Transactions, 7(4), 328.
Linacre, J. (2002). What do infit, outfit, mean-square, and standardization mean? Archives of Rasch Measurement, 16, 871–882.
Linacre, J. M. (2005). A user’s guide to Winsteps/Ministeps Rasch model programs. MESA Press.
Linacre, J. M. (2022). Winsteps® Rasch measurement computer program user’s guide. Version 5.2.3. Winsteps.
Maneekhao, K., Jaturapitakkul, N., Watson Todd, R., & Tepsuriwong, S. (2006). Developing an innovative computer-based test. Prospect-Adelaide, 21(2), 34–46.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). American Council on Education and Macmillan.
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23.
Messick, S. (1995). Standards of validity and the validity of standards in performance assessment. Educational Measurement: Issues and Practice, 14(4), 5–8.
Noroozi, S., & Karami, H. (2024). A Rasch-based validation of the University of Tehran English Proficiency Test (UTEPT). Language Testing in Asia, 14, 1–18.
Patarapichayatham, C., Kamata, A., & Kanjanawasee, S. (2009). Cross-level two-way differential item functioning analysis by multilevel Rasch modeling. Journal of Research Methodology & Cognitive Science, 7(1), 15–21.
Raîche, G. (2005). Critical eigenvalue sizes in standardized residual principal components analysis. Rasch Measurement Transactions, 19(1), Article 1012.
Ravand, H., & Firoozi, T. (2016). Examining construct validity of the Master’s UEE using the Rasch model and the six aspects of Messick’s framework. International Journal of Language Testing, 6(1), 1–23.
Shaw, S., & Crisp, V. (2011). Tracing the evolution of validity in educational measurement: Past issues and contemporary challenges. Assuring in Assessment, 11, 14–19.
Shohamy, E. G., Or, I. G., & May, S. (Eds.). (2017). Language testing and assessment (3rd ed.). Springer. https://doi.org/10.1007/978-3-319-02261-1
Thepsathit, P., & Tangdhanakanond, K. (2024). The development of formative assessment rubrics for enhancing students’ performance on Thai percussion instruments. International Journal of Music Education, 42(4), 674–690. https://doi.org/10.1177/02557614231192189
Weir, C. (2005). Language testing and validation: An evidence-based approach. Palgrave Macmillan.
Wolfe, E. W., & Smith, E. V. (2007). Instrument development tools and activities for measure validation using Rasch models. Journal of Applied Measurement, 8(3), 297–318.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis: Rasch measurement. MESA Press.