In Revista de investigacion clinica; organo del Hospital de Enfermedades de la Nutricion
BACKGROUND : Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus and has been responsible for nearly 6 million deaths worldwide in the past 2 years. Machine learning (ML) models could help physicians identify high-risk individuals.
OBJECTIVES : To study the use of ML models for predicting COVID-19 outcomes from clinical data alone and from a combination of clinical and metabolic data, the latter measured in a metabolomics facility at a public university.
METHODS : A total of 154 patients were included in the study. The "basic profile" comprised 33 clinical and demographic variables, whereas the "extended profile" also included metabolomic and immunological variables (156 variables). Feature selection was performed for each profile with a genetic algorithm (GA), and random forest models were trained and tested to predict each stage of COVID-19.
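The feature-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the GA operators or hyperparameters, so the truncation selection, one-point crossover, bit-flip mutation, population size, and all function names (`ga_select`, `holdout_accuracy`, `make_data`) are assumptions made for illustration. To keep the example dependency-free, a nearest-centroid classifier stands in for the random forest as the fitness function, and the data are synthetic.

```python
import random

def holdout_accuracy(mask, train, valid):
    """Fitness: validation accuracy of a nearest-centroid classifier
    (a stand-in for the random forest) on the selected columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot predict anything
    centroids = {}
    for label in {lab for _, lab in train}:
        rows = [x for x, lab in train if lab == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    def predict(x):
        # Assign the class whose centroid is closest in squared distance.
        return min(centroids, key=lambda lab: sum(
            (x[j] - c) ** 2 for j, c in zip(cols, centroids[lab])))

    return sum(predict(x) == lab for x, lab in valid) / len(valid)

def ga_select(n_features, fitness, pop_size=20, generations=15,
              p_mut=0.1, rng=None):
    """Evolve boolean feature masks; returns (best mask, its fitness, history)."""
    rng = rng or random.Random(0)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(scored[0]))
        elite = scored[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: toggle each gene with probability p_mut.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best), history

def make_data(n, n_features, informative, rng):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            x[j] += 2.0 * label  # shift informative columns for class 1
        data.append((x, label))
    return data

rng = random.Random(42)
data = make_data(80, 10, informative=(0, 3), rng=rng)
train, valid = data[:60], data[60:]
best_mask, best_fit, history = ga_select(
    10, lambda m: holdout_accuracy(m, train, valid), rng=rng)
print("selected columns:", [j for j, keep in enumerate(best_mask) if keep])
print("validation accuracy:", best_fit)
```

Because the GA is scored on the validation split, the reported accuracy is optimistic for the selected mask; an unbiased estimate would need a further held-out test set or nested cross-validation. Swapping in a random forest would only require replacing the fitness function passed to `ga_select`.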
RESULTS : The model based on the extended profile was more useful in the early stages of the disease. Models based on clinical data were preferred for predicting severe and critical illness and death. The ML models identified trimethylamine N-oxide, lipid mediators, and the neutrophil/lymphocyte ratio as important variables.
CONCLUSIONS : ML and GAs provided adequate models for predicting COVID-19 outcomes in patients with different grades of severity.
Villagrana-Bañuelos Karen E, Maeda-Gutiérrez Valeria, Alcalá-Rmz Vanessa, Oropeza-Valdez Juan J, Herrera-Van Oostdam Ana S, Castañeda-Delgado Julio E, López Jesús Adrián, Borrego Moreno Juan C, Galván-Tejada Carlos E, Galván-Tejeda Jorge I, Gamboa-Rosales Hamurabi, Luna-García Huizilopoztli, Celaya-Padilla José M, López-Hernández Yamilé
Biomarker, COVID-19, Genetic algorithm, LC-MS, Machine learning, Metabolomics, Random forest