Online Scam and Social Engineering Comments on YouTube: BERTopic-Based Topic Modeling
Abstract
This research aimed to analyze high-frequency keywords in YouTube user comments related to online scams and social engineering in Thailand, identify thematic clusters, and explain the structural patterns of these issues as reflected in digital discourse. Employing a data science approach with text mining techniques, the study analyzed a corpus of 6,244 comments collected from Thai-language YouTube videos published between 2020 and 2025. Data were harvested via the YouTube Data API and processed using Python. The methodology integrated the PyThaiNLP library for Thai word segmentation, the Word2Vec model for semantic enrichment, and the BERTopic framework for advanced topic modeling. The findings indicate a high prevalence of direct experience with or awareness of financial fraud, particularly unauthorized money transfers. Frequently occurring terms centered on financial loss, legal reporting, news monitoring, and negative emotional responses. Furthermore, the results suggest a strong public perception that vulnerable groups, such as older adults, are primary targets of scammers. Topic modeling categorized user comments into five distinct themes: victims’ experiences, systemic and legal criticism, news awareness, emotional engagement, and behavioral reactions toward victims. Overall, online discussions regarding fraud extend beyond mere incident reporting to encompass broader social critique and collective learning among digital audiences.
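The high-frequency keyword step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `whitespace_tokenize` and `top_keywords`, the toy English comments, and the stopword handling are all assumptions made for the example. Thai text is not delimited by spaces, so in a setting like the study's the placeholder tokenizer would be swapped for a Thai word segmenter (the abstract names PyThaiNLP) passed in through the `tokenize` parameter.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def whitespace_tokenize(text: str) -> List[str]:
    """Placeholder tokenizer (assumption): lowercases and splits on whitespace.

    Thai has no spaces between words, so a real pipeline would replace this
    with a Thai word segmenter.
    """
    return text.lower().split()


def top_keywords(
    comments: Iterable[str],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    stopwords: frozenset = frozenset(),
    k: int = 10,
) -> List[Tuple[str, int]]:
    """Return the k most frequent tokens across all comments."""
    counts: Counter = Counter()
    for comment in comments:
        # Count every token that is not in the stopword set.
        counts.update(tok for tok in tokenize(comment) if tok not in stopwords)
    return counts.most_common(k)


# Toy, pre-segmented English stand-ins for comments (hypothetical data).
comments = [
    "lost money transfer scam",
    "report scam police",
    "scam news money",
]

top = top_keywords(comments, k=2)
print(top)
```

Keeping the tokenizer as a parameter means the counting logic stays the same whichever segmenter is used; only the callable changes.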
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Views and opinions appearing in the Journal are the responsibility of the author of the article and do not constitute the views or responsibility of the editorial team.