In Computers in biology and medicine
AIM : COVID-19 has revealed the need for fast and reliable methods to assist clinicians in diagnosing the disease. This article presents a model that applies explainable artificial intelligence (XAI) methods based on machine learning techniques on COVID-19 metagenomic next-generation sequencing (mNGS) samples.
METHODS : In the data set used in the study, there are 15,979 gene expressions of 234 patients with COVID-19 negative 141 (60.3%) and COVID-19 positive 93 (39.7%). The least absolute shrinkage and selection operator (LASSO) method was applied to select genes associated with COVID-19. Support Vector Machine - Synthetic Minority Oversampling Technique (SVM-SMOTE) method was used to handle the class imbalance problem. Logistics regression (LR), SVM, random forest (RF), and extreme gradient boosting (XGBoost) methods were constructed to predict COVID-19. An explainable approach based on local interpretable model-agnostic explanations (LIME) and SHAPley Additive exPlanations (SHAP) methods was applied to determine COVID-19- associated biomarker candidate genes and improve the final model's interpretability.
RESULTS : For the diagnosis of COVID-19, the XGBoost (accuracy: 0.930) model outperformed the RF (accuracy: 0.912), SVM (accuracy: 0.877), and LR (accuracy: 0.912) models. As a result of the SHAP, the three most important genes associated with COVID-19 were IFI27, LGR6, and FAM83A. The results of LIME showed that especially the high level of IFI27 gene expression contributed to increasing the probability of positive class.
CONCLUSIONS : The proposed model (XGBoost) was able to predict COVID-19 successfully. The results show that machine learning combined with LIME and SHAP can explain the biomarker prediction for COVID-19 and provide clinicians with an intuitive understanding and interpretability of the impact of risk factors in the model.
Yagin Fatma Hilal, Cicek İpek Balikci, Alkhateeb Abedalrhman, Yagin Burak, Colak Cemil, Azzeh Mohammad, Akbulut Sami
COVID-19, Explainable artificial intelligence, LIME, SHAP, XGBoost