In JMIR Medical Informatics; h5-index 23.0
BACKGROUND : Coronavirus disease 2019 (COVID-19) has overwhelmed health systems worldwide. It is important to identify severe cases as early as possible, so that resources can be mobilized and treatment can be escalated.
OBJECTIVE : This study aims to develop a machine learning approach for automated severity assessment of COVID-19 patients based on clinical and imaging data.
METHODS : Clinical data (demographics, signs, symptoms, comorbidities, and blood test results) and chest computed tomography (CT) scans of 346 patients from two hospitals in Hubei province, China, were used to develop machine learning models for automated severity assessment of diagnosed COVID-19 cases. We compared the predictive power of clinical and imaging data by testing multiple machine learning models, and further explored the use of four oversampling methods to address the imbalanced class distribution. Features with the highest predictive power were identified using the SHapley Additive exPlanations (SHAP) framework.
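To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea (synthesizing minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours). It is an illustration only and not the authors' implementation: the function name `smote_sketch`, the parameters `k` and `seed`, and the toy feature vectors are all assumptions introduced here, and the abstract does not state which library or settings were used.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples (hypothetical helper).

    minority: list of feature vectors (lists of floats) from the minority class.
    k:        number of nearest minority neighbours to choose from (assumed value).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class (e.g. "severe") feature vectors, invented for illustration.
severe = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_sketch(severe, n_new=5)
```

Because each synthetic point lies on a line segment between two real minority samples, the new samples stay inside the region already occupied by that class. Oversampling of this kind is normally applied to the training split only, so that evaluation data contain no synthetic cases.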
RESULTS : Imaging features had the strongest impact on the model output, while a combination of clinical and imaging features yielded the best performance overall. The identified predictive features were consistent with findings from previous studies. Oversampling yielded mixed results, although it achieved the best model performance in our study. Targeting differentiation between mild and severe cases, logistic regression models achieved the best performance on clinical features (area under the curve [AUC]: 0.848, sensitivity: 0.455, specificity: 0.906), imaging features (AUC: 0.926, sensitivity: 0.818, specificity: 0.901), and the combined features (AUC: 0.950, sensitivity: 0.764, specificity: 0.919). The SMOTE oversampling method further improved the performance of the combined features to an AUC of 0.960 (sensitivity: 0.845, specificity: 0.929).
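For readers less familiar with the three reported metrics, the sketch below shows how sensitivity, specificity, and AUC can be computed from binary labels (1 = severe, 0 = mild) and model scores. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration; they do not reproduce the study's data or evaluation code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of severe cases flagged; specificity: fraction of mild cases cleared.
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

AUC is threshold-independent, whereas sensitivity and specificity depend on the chosen cut-off, which is one reason a model can have a high AUC alongside a modest sensitivity, as with the clinical-features model above.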
CONCLUSIONS : This study indicates that clinical and imaging features can be used for automated severity assessment of COVID-19 patients and have the potential to assist with triaging COVID-19 patients and prioritizing care for those at higher risk of severe disease.
Quiroz Juan Carlos, Feng You-Zhen, Cheng Zhong-Yuan, Rezazadegan Dana, Chen Ping-Kang, Lin Qi-Ting, Qian Long, Liu Xiao-Fang, Berkovsky Shlomo, Coiera Enrico, Song Lei, Qiu Xiao-Ming, Liu Sidong, Cai Xiang-Ran