In Computer methods and programs in biomedicine
BACKGROUND AND OBJECTIVE : Coronary artery disease (CAD) is one of the most prominent health problems and a leading cause of mortality worldwide. Early diagnosis and prediction of CAD are therefore essential for treating patients appropriately. The objective of this study is to develop a machine learning algorithm that supports accurate diagnosis of CAD.
METHODS : In this paper, we propose a novel heterogeneous ensemble method that combines three base classifiers, namely K-Nearest Neighbour, Random Forest, and Support Vector Machine (SVM), for effective diagnosis of CAD. The outputs of the base classifiers are combined using three ensemble voting techniques: average voting (AVEn), majority voting (MVEn), and weighted-average voting (WAVEn). The random forest-based Boruta wrapper feature selection algorithm and the feature importance of the SVM are used to select relevant features according to attribute importance and rank.
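The three voting schemes named above can be illustrated with a minimal sketch. It assumes each base classifier outputs a probability for the CAD class and that a threshold of 0.5 separates the two classes; the function names, the example probabilities, and the weights are hypothetical, since the abstract does not state how the weights are chosen.

```python
from typing import Sequence


def average_vote(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Average voting: threshold the mean CAD probability of the base classifiers."""
    return int(sum(probs) / len(probs) >= threshold)


def majority_vote(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Majority voting: each classifier casts a hard vote; more than half must say CAD."""
    votes = [int(p >= threshold) for p in probs]
    return int(sum(votes) > len(votes) / 2)


def weighted_average_vote(
    probs: Sequence[float], weights: Sequence[float], threshold: float = 0.5
) -> int:
    """Weighted-average voting: threshold the weight-normalised mean probability."""
    score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return int(score >= threshold)


# Hypothetical CAD probabilities from KNN, Random Forest, and SVM for one patient.
probs = [0.9, 0.4, 0.45]
# Hypothetical weights; the abstract does not specify how they are derived.
weights = [0.1, 0.5, 0.4]

print(average_vote(probs))                    # soft vote on the plain mean
print(majority_vote(probs))                   # hard vote on per-classifier labels
print(weighted_average_vote(probs, weights))  # soft vote on the weighted mean
```

With these inputs the schemes can disagree: one very confident classifier pulls the plain average above the threshold, while the majority vote and the weighted average follow the two less confident classifiers.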
RESULTS : The proposed ensemble algorithm is built on 5 features selected by feature importance, and its performance is evaluated on the Z-Alizadeh Sani dataset. The dataset is additionally balanced using the Synthetic Minority Over-sampling Technique (SMOTE), and performance is evaluated again on the balanced data. The WAVEn algorithm performs best, achieving classification accuracy, sensitivity, specificity and precision of 98.97%, 100%, 96.3% and 98.3%, respectively, on the original dataset. On the balanced dataset, WAVEn achieves 100% accuracy, sensitivity, specificity and precision in diagnosing CAD. To the best of the authors' knowledge, the accuracy achieved by WAVEn is the highest among state-of-the-art algorithms in the literature for both the original and the balanced datasets.
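For reference, the four reported metrics follow the standard confusion-matrix definitions. The sketch below shows how they are computed; the function name and the counts are hypothetical and purely illustrative, not taken from the study, and the code assumes every denominator is non-zero.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: CAD patients correctly flagged      fn: CAD patients missed
    tn: healthy subjects correctly cleared  fp: healthy subjects wrongly flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # recall on the CAD class
        "specificity": tn / (tn + fp),  # recall on the healthy class
        "precision": tp / (tp + fp),    # fraction of CAD predictions that are correct
    }


# Illustrative counts only; these are NOT results from the paper.
metrics = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A sensitivity of 100% corresponds to `fn == 0`, meaning no CAD patient is missed, while specificity below 100% means some healthy subjects are flagged as CAD.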
CONCLUSIONS : The statistical results demonstrate the robustness of the WAVEn algorithm in reliably discriminating CAD patients from healthy subjects with high precision; it can therefore be used to develop a decision support system for diagnosing CAD at an early stage.
Velusamy Durgadevi, Ramasamy Karthikeyan
Cardiovascular disease, Classification, Coronary artery disease, Ensemble methods, Feature selection, Machine learning algorithms