Effectiveness of Machine Learning for COVID-19 Patient Mortality Prediction Using WEKA

Husnul Khuluq, Prasandhya Astagiri Yusuf, Dyah Aryani Perwitasari

Abstract


Timely detection of patients at high risk of death from coronavirus disease 2019 (COVID-19) can substantially improve triage and bed allocation, reduce delays, and potentially improve outcomes. One potential solution is to use machine learning (ML) algorithms to predict mortality in hospitalized COVID-19 patients. This study aimed to develop and validate individual mortality risk assessments with ML, using anonymized demographic, clinical, and laboratory findings at admission. Data on 2,313 patients were collected in a standardized format from the electronic medical records of two Muhammadiyah hospitals from January 2020 to July 2022. Twenty-four demographic, clinical, and laboratory variables were studied, covering each patient's clinical manifestations and laboratory parameters at admission. Six algorithms were analyzed in WEKA version 3.8.6: AdaBoost, logistic regression, random forest, support vector machine, naïve Bayes, and decision tree. Random forest outperformed the other algorithms, with precision, sensitivity, area under the receiver operating characteristic (ROC) curve, and accuracy of 78.6%, 78.7%, 85%, and 78.65%, respectively. The top three predictors were septic shock (OR=21.518, 95% CI=4.933–93.853), respiratory failure (OR=15.503, 95% CI=8.507–28.254), and D-dimer (OR=3.288, 95% CI=2.510–4.306). ML-based predictive models, especially the random forest algorithm, may make it easier to identify patients at high risk of death and guide physicians toward appropriate interventions.
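The reported odds ratios can be checked for internal consistency. The sketch below assumes the 95% confidence intervals are Wald intervals computed on the log-odds scale, as is usual for logistic regression output; the abstract does not state how they were obtained, so this is an assumption. Under that assumption, each OR should equal the geometric mean of its interval bounds, and the interval width gives the standard error of the log OR. The function names are hypothetical and chosen for illustration only.

```python
import math

# Odds ratios and 95% CI bounds exactly as reported in the abstract:
# predictor -> (OR, lower bound, upper bound)
REPORTED = {
    "septic shock": (21.518, 4.933, 93.853),
    "respiratory failure": (15.503, 8.507, 28.254),
    "D-dimer": (3.288, 2.510, 4.306),
}

Z_95 = 1.96  # assumption: two-sided 95% Wald interval on the log scale


def log_scale_summary(lower, upper, z=Z_95):
    """Return (implied OR, implied SE of log OR) from CI bounds.

    For a Wald interval, log(lower) and log(upper) sit symmetrically
    around log(OR), each z standard errors away.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lower + log_upper) / 2)
    implied_se = (log_upper - log_lower) / (2 * z)
    return implied_or, implied_se


def check_reported(reported=REPORTED):
    """Map each predictor to (reported OR, implied OR, implied SE)."""
    rows = {}
    for name, (odds_ratio, lower, upper) in reported.items():
        implied_or, implied_se = log_scale_summary(lower, upper)
        rows[name] = (odds_ratio, implied_or, implied_se)
    return rows


if __name__ == "__main__":
    for name, (odds_ratio, implied_or, implied_se) in check_reported().items():
        print(
            f"{name}: reported OR={odds_ratio:.3f}, "
            f"implied OR={implied_or:.3f}, SE(log OR)={implied_se:.3f}"
        )
```

All three reported ORs agree with the geometric mean of their bounds to within rounding, which is consistent with the Wald assumption. The implied standard error for septic shock is by far the largest, so that estimate is the least precise of the three even though its point estimate is the highest.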


Keywords


Data mining; inpatient mortality; machine learning algorithm; prediction model



DOI: https://doi.org/10.29313/gmhc.v11i3.12119

pISSN 2301-9123 | eISSN 2460-5441




Global Medical and Health Communication is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.