Classifying Adverse Drug Reactions from Electronic Medical Records using Machine Learning and Natural Language Processing

Sasiwimon Kobua
Akkapon Wongkoblap
Thara Angskun
Jitimon Angskun

Abstract

Background and Objectives: Patient drug safety is a priority in the World Health Organization (WHO) action plan. In Thailand, psychiatric medications are among the top five drug classes linked to adverse events. However, analyzing adverse drug reactions (ADRs) is time-consuming, and assessments can vary with the expertise of the medical personnel involved. Classifying ADRs from the bilingual (Thai-English) electronic medical records of psychiatric patients is further complicated by mixed linguistic structures, transliterations, and non-standard spellings, which can reduce the accuracy of existing classification models. This study aims to develop a model for classifying ADRs associated with psychiatric medications in bilingual text, comparing machine learning and deep learning techniques to enhance medication safety surveillance within Thailand's healthcare context.


Methodology: Adverse drug event (ADE) data were collected from psychiatric patients between October 2019 and March 2023. The data were processed with natural language processing and used to build ADR classification models with five traditional machine learning techniques (Gaussian Naive Bayes, K-Nearest Neighbors, Linear Support Vector Machine, Logistic Regression, and Random Forest) and two deep learning techniques (Convolutional Neural Networks and Bidirectional Long Short-Term Memory networks). Each model's performance was evaluated using 70% of the data for training and 30% for testing.
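
The n-gram feature extraction and 70/30 hold-out split mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the whitespace tokenizer, the helper names (`ngram_features`, `split_70_30`), the fixed seed, and the example note are all assumptions made for illustration. Thai text is not whitespace-delimited, so a real pipeline would need a Thai word segmenter before this step.

```python
import random
from collections import Counter


def ngram_features(text, n_values=(1, 2)):
    """Count word n-grams (1-grams and 2-grams by default) in one note."""
    # Assumption: whitespace tokenization; real Thai text needs a word segmenter.
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def split_70_30(records, seed=42):
    """Shuffle records and split them into 70% training and 30% testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical, pre-segmented bilingual note; not taken from the study data.
    note = "ผู้ป่วย มี rash หลัง ได้ haloperidol"
    print(ngram_features(note))
    train, test = split_70_30(range(10))
    print(len(train), len(test))
```

The resulting count dictionaries would then be vectorized and passed to any of the classifiers listed above.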


Main Results: The Random Forest model, combined with feature extraction using both 1-grams and 2-grams, achieved the highest classification performance. When tested on a dataset of 384 samples, it achieved an accuracy, precision, recall, and F1-score of 84.6%, 82.6%, 84.6%, and 83.2%, respectively. These results indicate that Random Forest is an effective method for classifying adverse drug reactions in this medical setting. In terms of computational efficiency, the Linear Support Vector Machine and Logistic Regression models had the fastest processing speeds, making them noteworthy options for scenarios requiring rapid decision-making.
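
The reported accuracy and recall are identical (84.6%). That pattern is what support-weighted averaging produces in single-label classification, where weighted recall always equals accuracy. The abstract does not state which averaging was used, so this is an inference rather than a confirmed detail. The sketch below computes the four metrics with support weighting; the function name and the labels are hypothetical and do not come from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total         # class share of the test set
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up labels for illustration only.
    y_true = ["a", "a", "a", "b", "b", "c"]
    y_pred = ["a", "a", "b", "b", "a", "c"]
    print(weighted_metrics(y_true, y_pred))
```

Weighted recall sums each class's correct predictions divided by the total sample count, which is the definition of accuracy, so the two values coincide regardless of the data.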


Discussions: The findings are consistent with several previous studies, although results may differ because of the characteristics of the data and the complexity of bilingual (Thai and English) content in electronic medical records. Deep learning methods are resource-intensive and time-consuming; thus, selecting the appropriate technique requires balancing accuracy and speed. Real-time systems may favor Linear Support Vector Machines, even though they are slightly less accurate than Random Forests. Future research should incorporate data balancing techniques and advanced feature extraction, expand the data sources, and experiment with language models developed specifically for the Thai language and medical contexts.
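
Data balancing, named above as future work, can take several forms. One simple option is random oversampling of minority classes, sketched below. The abstract does not say which balancing technique the authors have in mind, so the choice of method, the helper name `random_oversample`, and the seed are assumptions.

```python
import random
from collections import defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw extra copies with replacement to reach the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    # Placeholder records and labels for illustration only.
    x, y = random_oversample(["r1", "r2", "r3", "r4"], ["a", "a", "a", "b"])
    print(list(zip(x, y)))
```

Oversampling of this kind would normally be applied to the training split only, so that duplicated records do not leak into the test set.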


Conclusions: This study demonstrates the potential of utilizing machine learning and deep learning techniques to detect adverse drug reactions, highlighting important implications for developing an efficient surveillance system. Such a system can enhance the speed and accuracy of identifying events that might otherwise be missed, while reducing the workload for healthcare personnel, allowing them to concentrate more on patient care. The proposed approach aligns with the WHO’s guidance and Thailand’s digital health policies.

Section
Research Articles
