Online Scam and Social Engineering Comments on YouTube: BERTopic-Based Topic Modeling


Wasinee Noonpakdee

Abstract

This research aimed to analyze high-frequency keywords in YouTube user comments related to online scams and social engineering in Thailand, identify thematic clusters, and explain the structural patterns of these issues as reflected in digital discourse. Employing a data science approach with text mining techniques, the study analyzed a corpus of 6,244 comments collected from Thai-language YouTube videos published between 2020 and 2025. Data were harvested via the YouTube Data API and processed using Python. The methodology integrated the PyThaiNLP library for Thai word segmentation, the Word2Vec model for semantic enrichment, and the BERTopic framework for advanced topic modeling. The findings indicate a high prevalence of direct experience with or awareness of financial fraud, particularly unauthorized money transfers. Frequently occurring terms centered on financial loss, legal reporting, news monitoring, and negative emotional responses. Furthermore, the results suggest a strong public perception that vulnerable groups, such as older adults, are primary targets of scammers. Topic modeling categorized user comments into five distinct themes: victims’ experiences, systemic and legal criticism, news awareness, emotional engagement, and behavioral reactions toward victims. Overall, online discussions regarding fraud extend beyond mere incident reporting to encompass broader social critique and collective learning among digital audiences.
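The keyword-frequency stage of the pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names (`top_keywords`, `thai_tokenize`, `fit_topics`), the stopword set, and the English sample comments are all hypothetical placeholders, and the Word2Vec enrichment step is omitted because the abstract does not say how it was combined with BERTopic. The two helper functions that rely on PyThaiNLP and BERTopic import those libraries lazily and are not called in the example, so the sketch runs with the standard library alone.

```python
from collections import Counter
from typing import Callable, Iterable


def whitespace_tokenize(text: str) -> list[str]:
    """Stand-in tokenizer for text that is already separated by spaces."""
    return text.split()


def top_keywords(
    comments: Iterable[str],
    tokenize: Callable[[str], list[str]] = whitespace_tokenize,
    stopwords: frozenset = frozenset(),
    n: int = 10,
) -> list[tuple[str, int]]:
    """Count tokens across all comments and return the n most frequent."""
    counts: Counter = Counter()
    for comment in comments:
        for token in tokenize(comment):
            token = token.strip().lower()
            if token and token not in stopwords:
                counts[token] += 1
    return counts.most_common(n)


def thai_tokenize(text: str) -> list[str]:
    """Assumed Thai segmentation step; requires the pythainlp package."""
    from pythainlp.tokenize import word_tokenize

    # Drop whitespace-only tokens that the segmenter can emit.
    return [t for t in word_tokenize(text, engine="newmm") if t.strip()]


def fit_topics(segmented_docs: list[str]):
    """Assumed topic-modeling step; requires the bertopic package.

    Each document is expected to be a string of space-joined tokens.
    """
    from bertopic import BERTopic

    model = BERTopic(language="multilingual")
    topics, _ = model.fit_transform(segmented_docs)
    return model, topics


# Hypothetical English placeholder comments, not data from the study.
sample = [
    "scammer transferred money from my account",
    "report the scammer to the police",
    "my mother lost money to a call center scammer",
]
keywords = top_keywords(
    sample, stopwords=frozenset({"the", "to", "my", "a", "from"}), n=2
)
print(keywords)  # [('scammer', 3), ('money', 2)]
```

For Thai-language comments, `thai_tokenize` would be passed as the `tokenize` argument, and the resulting tokens could be joined with spaces before being handed to `fit_topics`, since Thai text has no whitespace between words.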

Article Details

How to Cite
Noonpakdee, W. (2026). Online Scam and Social Engineering Comments on YouTube: BERTopic-Based Topic Modeling. Rajapark Journal, 20(66), 1–20. Retrieved from https://so05.tci-thaijo.org/index.php/RJPJ/article/view/286882
Section
Research Article
