In Medical & biological engineering & computing ; h5-index 32.0
Cardiovascular diseases (CVDs) are among the leading causes of mortality worldwide, with more than 23 million related deaths per year projected by 2030, according to the World Heart Federation. Although most of these diseases may be prevented, population awareness strategies are still ineffective. In this context, we propose the CML-Cardio tool, a machine learning application that automates the classification of the risk of developing CVDs. For this, researchers in our group collected data on diabetes, blood pressure, and other risk factors at a private company. Our final model is a cascade system designed to handle highly imbalanced data. In the first stage, a binary model predicts whether a patient has a low risk of developing CVDs or a risk that needs attention. In this step, we evaluate six algorithms: logistic regression, SVM, random forest, XGBoost, CatBoost, and multilayer perceptron. The best results presented an average accuracy of 0.86 ± 0.03 and an F-score of 0.85 ± 0.04. We interpret each feature's impact on the models' output and validate the subsystem for the next step. In the second stage, we use an anomaly detection model to learn the intermediate-risk patterns present in the instances that need attention. The cascade model presented an average accuracy of 0.80 ± 0.07 and an F-score of 0.70 ± 0.07. Finally, we develop the CML-Cardio prototype of an actual application as a primary prevention strategy. Graphical abstract: In this work, we propose the CML-Cardio tool, a cascade machine learning method to classify cardiovascular disease risk.
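The two-stage cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier stands in for the six first-stage algorithms, the z-score detector stands in for the unspecified anomaly detection model, and the two features (systolic pressure, fasting glucose), all data values, and the 3.0 threshold are hypothetical. It also assumes that an instance flagged as anomalous with respect to the intermediate-risk pattern is reported as high risk, which the abstract does not state explicitly.

```python
from statistics import mean, pstdev


class CentroidClassifier:
    """Stage 1 stand-in: binary nearest-centroid model.
    Label 0 = low risk, label 1 = risk that needs attention."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            # Per-feature mean of all rows carrying this label.
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lbl: dist2(self.centroids[lbl]))


class ZScoreDetector:
    """Stage 2 stand-in: learns the intermediate-risk pattern and
    flags instances that deviate strongly from it."""

    def __init__(self, threshold=3.0):  # hypothetical threshold
        self.threshold = threshold

    def fit(self, X):
        cols = list(zip(*X))
        self.mu = [mean(c) for c in cols]
        self.sigma = [pstdev(c) or 1.0 for c in cols]  # guard against zero spread
        return self

    def is_anomaly(self, x):
        return any(abs((a - m) / s) > self.threshold
                   for a, m, s in zip(x, self.mu, self.sigma))


def cascade_predict(x, stage1, stage2):
    """Stage 1 filters out low-risk cases; stage 2 only sees the rest."""
    if stage1.predict_one(x) == 0:
        return "low"
    # Assumption: anomalies relative to the intermediate pattern are high risk.
    return "high" if stage2.is_anomaly(x) else "intermediate"


# Hypothetical toy data: [systolic pressure, fasting glucose]
low = [[110, 85], [115, 90], [120, 88], [112, 92]]
intermediate = [[140, 120], [145, 125], [138, 118], [142, 122]]
high = [[180, 200], [190, 220]]

stage1 = CentroidClassifier().fit(
    low + intermediate + high,
    [0] * len(low) + [1] * (len(intermediate) + len(high)),
)
stage2 = ZScoreDetector().fit(intermediate)  # trained on intermediate cases only

for patient in ([112, 87], [141, 121], [185, 210]):
    print(patient, cascade_predict(patient, stage1, stage2))
```

Training the second stage only on intermediate-risk instances is what lets the cascade cope with a scarce high-risk class: the detector never needs many high-risk examples, only a description of what "intermediate" looks like.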
Oliveira Bruno Alberto Soares, Castro Giulia Zanon, Ferreira Giovanna Luiza Medina, Guimarães Frederico Gadelha
2023-Jan-31
Anomaly detection, Cardiovascular diseases, Explainable artificial intelligence, Healthcare, Machine learning, Primary prevention