Evaluating the Performance of ChatGPT on a Management Information Systems Exam: A Comparative Analysis of ChatGPT-3.5 and 4.0
Abstract
This study assesses the accuracy of ChatGPT by evaluating its performance on various types of questions from a Management Information Systems (MIS) exam, specifically comparing two models: ChatGPT-3.5 and ChatGPT-4.0. The researchers retrieved 200 questions from a test bank covering five chapters of an MIS textbook, encompassing multiple-choice, true/false, and essay questions. Forty questions were chosen from each chapter, and each question was assigned a difficulty level. The questions were submitted to ChatGPT versions 3.5 and 4.0 via the OpenAI website in two query formats: the original question without a prompt, and a modified question preceded by a prompt. The answers were assessed quantitatively; for essay answers, the ROUGE-L score, a measure based on the Longest Common Subsequence (LCS), was used as the accuracy score. The findings reveal that ChatGPT-4.0 outperforms ChatGPT-3.5 on MIS questions, with a success rate of 70.58% versus 63.42%, in line with previous studies reporting high scores on a variety of exams. Notably, however, ChatGPT-4.0 excels at higher difficulty levels while performing less effectively at low difficulty levels. The study highlights the potential applications and challenges of using large language models in educational environments. As Generative AI advances, this study serves as a foundation for future research into the responsible and efficient integration of AI models across domains. It also examines the potential impact of Generative AI on the credibility of MIS exams and emphasizes the need for further investigation.
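The two query formats can be illustrated with a short sketch. The study itself queried the models through the OpenAI web interface, so everything below is an illustrative assumption rather than the authors' protocol: the use of the OpenAI Python SDK, the model identifier, the prompt wording, and the helper function ask are all hypothetical.

    # Illustrative sketch only: the study used the OpenAI website, not the API.
    # The prompt wording below is a hypothetical example of a prompt format;
    # the paper's exact wording is not given in the abstract.
    from openai import OpenAI

    client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

    PROMPT_PREFIX = (
        "You are a student taking a Management Information Systems exam. "
        "Answer the following question concisely.\n\n"
    )

    def ask(question: str, model: str = "gpt-4", with_prompt: bool = False) -> str:
        """Submit a question as-is (format 1) or with a prompt prepended (format 2)."""
        content = PROMPT_PREFIX + question if with_prompt else question
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": content}],
        )
        return response.choices[0].message.content

    # Format 1: the original question without a prompt.
    plain = ask("What is a transaction processing system?")
    # Format 2: the modified question with a prompt.
    prompted = ask("What is a transaction processing system?", with_prompt=True)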
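For the essay questions, the abstract names ROUGE-L as the scoring measure. The sketch below is a minimal implementation of the standard LCS-based definition of ROUGE-L; the beta weighting of 1.2 is a common open-source default and is an assumption here, since the paper's parameterization is not stated in the abstract.

    def lcs_length(a: list[str], b: list[str]) -> int:
        """Length of the Longest Common Subsequence, via dynamic programming."""
        m, n = len(a), len(b)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if a[i - 1] == b[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1] + 1
                else:
                    dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
        return dp[m][n]

    def rouge_l(reference: str, candidate: str, beta: float = 1.2) -> float:
        """ROUGE-L F-score between a reference answer and a model answer."""
        ref, cand = reference.split(), candidate.split()
        lcs = lcs_length(ref, cand)
        if lcs == 0:
            return 0.0
        recall = lcs / len(ref)      # fraction of the reference recovered
        precision = lcs / len(cand)  # fraction of the candidate that matches
        return ((1 + beta**2) * precision * recall) / (recall + beta**2 * precision)

    # Example: a model's essay answer scored against a test-bank reference answer.
    score = rouge_l("A TPS records daily routine transactions",
                    "A transaction processing system records routine daily transactions")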