A Study on Factors and Development of Prediction System by Using Machine Learning for Learning Analytics in Computer Programming of Junior High-school Students


Wudhijaya Philuek
Nontachai Samngamjan

Abstract

The objectives of this research were 1) to study the factors that affect computer programming learning among middle school students, 2) to study machine learning techniques for analyzing factors in learning computer programming, and 3) to develop a system for predicting computer programming learning. Data were collected in two rounds: the first from 411 students, used for exploratory factor analysis, and the second from a survey of 1,225 students, used to test the machine learning techniques. Candidate factors were identified by studying and analyzing documents, screened in one round by experts, and then examined with Exploratory Factor Analysis (EFA). The factors were analyzed with 9 machine learning techniques, including Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), and Artificial Neural Network (ANN), and the results of that analysis were used to design a prediction system for learning computer programming. The statistics used in the analysis were the mean, standard deviation, correlation coefficients, and machine learning algorithms.


The research results were as follows:


1. Exploratory factor analysis (EFA) and correlation coefficients showed 4 components: 1) teacher factors, 2) learning environment factors, 3) learning management media factors, and 4) learner factors.


2. The 9 machine learning techniques were used to predict programming learning results from the 4 factors. The Random Forest (RF) technique gave an accuracy of 96.00% and a macro F1 of 96.00%.


3. The design of the computer programming learning prediction system followed a process that included a Use Case diagram, Data Flow Diagram, ER Diagram, and Class Diagram, and the system used the Random Forest (RF) technique for prediction. The programs used included Visual Studio Code and XAMPP, and the system provided menus for 4 user groups.
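
The two metrics reported in result 2 can be illustrated with a minimal sketch. It is not the authors' code: the labels "pass" and "fail" and the six sample predictions are hypothetical, and the functions only show how accuracy and macro F1 are conventionally computed, with macro F1 taken as the unweighted mean of the per-class F1 scores.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[t] += 1   # class t was missed
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: 4 students who passed, 2 who failed.
y_true = ["pass", "pass", "pass", "pass", "fail", "fail"]
y_pred = ["pass", "pass", "pass", "pass", "pass", "fail"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

On this imbalanced toy sample, accuracy is 5/6 while macro F1 is 7/9, because the single error falls on the smaller class and macro averaging weights both classes equally. When the two metrics coincide, as in the reported 96.00% figures, errors are spread roughly evenly across the classes.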

Article Details

How to Cite
Philuek, W., & Samngamjan, N. (2024). A Study on Factors and Development of Prediction System by Using Machine Learning for Learning Analytics in Computer Programming of Junior High-school Students. SOCIAL SCIENCES RESEARCH AND ACADEMIC JOURNAL, 19(2), 43–58. Retrieved from https://so05.tci-thaijo.org/index.php/JSSRA/article/view/270534
Section
Research Articles
