In Journal of Medical Internet Research; h5-index 88.0
BACKGROUND : Coronavirus disease 2019 (COVID-19) has infected millions of patients worldwide and has been responsible for several hundred thousand fatalities. This has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods for achieving this are lacking.
OBJECTIVE : We analyze electronic health records from hospitalized COVID-19-positive patients admitted to the Mount Sinai Health System in New York City (NYC). We present machine learning models that predict the hospital course over clinically meaningful time horizons based on patient characteristics at admission. We assess the performance of these models at multiple hospitals and time points.
METHODS : We used XGBoost and baseline comparator models to predict in-hospital mortality and critical events within time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized electronic health record (EHR) data from five hospitals in NYC for 4,098 COVID-19+ patients admitted from March 15, 2020 to May 22, 2020. Models were first trained on patients from a single hospital (N=1514) admitted on or before May 1, externally validated on patients from four other hospitals (N=2201) admitted on or before May 1, and prospectively validated on all patients admitted after May 1 (N=383). Finally, we established model interpretability to identify and rank the variables that drive model predictions.
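The cohort split and time-window outcomes described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the hospital identifier `hospital_A`, the function names, and the choice to treat "within N days" as inclusive are all assumptions, and patients without a recorded event are simply labeled 0 (censoring is ignored). Only the May 1, 2020 cutoff and the 3, 5, 7, and 10 day horizons come from the abstract.

```python
from datetime import date

CUTOFF = date(2020, 5, 1)        # cutoff date stated in the abstract
HORIZONS_DAYS = (3, 5, 7, 10)    # prediction windows stated in the abstract


def assign_cohort(hospital, admitted, train_hospital="hospital_A"):
    """Return 'train', 'external', or 'prospective' for one admission.

    `train_hospital` is a hypothetical identifier for the single
    hospital used for model development.
    """
    if admitted > CUTOFF:
        # All hospitals contribute to prospective validation after the cutoff.
        return "prospective"
    return "train" if hospital == train_hospital else "external"


def horizon_labels(admitted, event_date):
    """Binary label per horizon: 1 if the event fell within N days of admission.

    Assumption: the window is inclusive (day N counts), and a missing
    event date means the event did not occur.
    """
    if event_date is None:
        return {h: 0 for h in HORIZONS_DAYS}
    days = (event_date - admitted).days
    return {h: int(0 <= days <= h) for h in HORIZONS_DAYS}
```

For example, a patient admitted on April 1 with an event on April 5 would be labeled negative at the 3-day horizon and positive at the 5, 7, and 10 day horizons under these assumptions.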
RESULTS : On cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days; XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, for critical event prediction, XGBoost achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction.
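For readers unfamiliar with the metric reported above, AUC-ROC can be read as the probability that a randomly chosen patient who had the event receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The sketch below computes it directly from that pairwise definition. The function name and the toy inputs are illustrative assumptions, and this is not the evaluation code used in the study.

```python
def auc_roc(y_true, y_score):
    """AUC-ROC from its pairwise definition (ties count as 0.5).

    y_true:  iterable of 0/1 outcome labels
    y_score: iterable of predicted risks, same length as y_true
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # tie contributes half a win
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large cohorts.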
CONCLUSIONS : We trained and validated (both externally and prospectively) machine learning models for predicting mortality and critical events at different time horizons. These models identify at-risk patients and uncover underlying relationships that predict outcomes.
Vaid Akhil, Somani Sulaiman, Russak Adam J, De Freitas Jessica K, Chaudhry Fayzan F, Paranjpe Ishan, Johnson Kipp W, Lee Samuel J, Miotto Riccardo, Richter Felix, Zhao Shan, Beckmann Noam D, Naik Nidhi, Kia Arash, Timsina Prem, Lala Anuradha, Paranjpe Manish, Golden Eddye, Danieletto Matteo, Singh Manbir, Meyer Dara, O’Reilly Paul F, Huckins Laura, Kovatch Patricia, Finkelstein Joseph, Freeman Robert M, Argulian Edgar, Kasarskis Andrew, Percha Bethany, Aberg Judith A, Bagiella Emilia, Horowitz Carol R, Murphy Barbara, Nestler Eric J, Schadt Eric E, Cho Judy H, Cordon-Cardo Carlos, Fuster Valentin, Charney Dennis S, Reich David L, Bottinger Erwin P, Levin Matthew A, Narula Jagat, Fayad Zahi A, Just Allan C, Charney Alexander W, Nadkarni Girish N, Glicksberg Benjamin