Selection of the Best Machine Learning Model to Predict Poverty Conditions: A Study on North-Eastern Wetland Region of Bangladesh

Authors

  • Rashidul Hasan Department of Statistics, Shahjalal University of Science & Technology, Bangladesh
  • Zakir Hossain Department of Statistics, Shahjalal University of Science & Technology, Bangladesh

Keywords:

Poverty, Determinants, Prediction, Machine Learning, Bangladesh

Abstract

Machine learning (ML) algorithms can predict households’ poverty conditions so that poor households can be reached by poverty alleviation programs. The primary objective of this study is to identify the determinants of poverty and to select the best ML model for predicting poverty conditions in the north-eastern wetland region of Bangladesh. The study used data from 2340 households, collected through a household survey by a research project sponsored by the GARE Program, Ministry of Education, GoB. A multiple logistic regression (MLR) model was employed to identify the factors associated with household poverty. Six ML algorithms (support vector machine, Naïve Bayes, logistic regression, K-nearest neighbor, decision tree, and random forest) were applied to predict poverty conditions, and their performance was measured using accuracy, precision, recall, F1-score, and AUROC. The findings show that district, micro-credit status, household size, age, NGO membership, marital status, per capita income, cultivable land, electricity connection, and livestock ownership are significant determinants of poverty among wetland people. The findings also show that the support vector machine is the best model for predicting poverty level LPL, with an accuracy of 82%, an F1-score of 59%, and an AUROC of 72%, and that logistic regression is the best model for predicting poverty level UPL, with an accuracy of 81%, an F1-score of 84%, and an AUROC of 80%. The proposed algorithms may help alleviate poverty by accurately identifying target poor groups. The determinants may be useful in developing policies to reduce poverty in the wetland region of Bangladesh.
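The five performance measures named in the abstract can be illustrated with a minimal Python sketch. The labels and scores below are hypothetical toy values for ten households, not the study's data; the function names, the 0.5 decision threshold, and the coding 1 = poor, 0 = non-poor are assumptions made only for this illustration.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (assumed 1 = poor, 0 = non-poor)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # poor, predicted poor
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-poor, predicted non-poor
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-poor, predicted poor
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # poor, predicted non-poor

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 poor and 7 non-poor households.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]  # predicted probability of "poor"
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auroc"] = auroc(y_true, scores)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data the accuracy is 0.80 while the F1-score is about 0.67, which shows how the two measures can diverge when poor households are the minority class.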


Published

2025-05-07

How to Cite

Hasan, R., & Hossain, Z. (2025). Selection of the Best Machine Learning Model to Predict Poverty Conditions: A Study on North-Eastern Wetland Region of Bangladesh. Thailand and The World Economy, 43(2), 138–162. Retrieved from https://so05.tci-thaijo.org/index.php/TER/article/view/268287