Input as a Key Element in Test Design: A Narrative of Designing an Innovative Critical Thinking Assessment
Abstract
Test input has often been taken as a given in test design practice. Nearly all guides for test designers provide extensive coverage of how to design test items but pay little attention to test input. This paper presents the case that test input plays a crucial role in designing tests of soft skills that have rarely been assessed in existing tests. In the process of designing a test of critical thinking, several attempts following existing test design guides resulted in poor tests that did not truly assess the intended objectives. These initial attempts followed the norm of using short passages as test input. Following these failures, we switched to using real-world input, such as tweets, numerical tables, and spam emails. In doing so, we found that particular input types favored particular sub-skills of critical thinking and particular item types. For example, using tweets as input enabled the assessment of the Perspective Taking sub-skill of critical thinking. This paper concludes that in designing skill tests, integrating appropriate input is at least as important as item design, and it calls for reevaluating the functions of test input as a distinct and dynamic element.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.