Background and objective: While the potential of machine learning (ML) in healthcare to positively impact human health continues to grow, the risk of inequity in these methods must be assessed. In this study, we aimed to evaluate whether racial bias is present when five of the most common ML algorithms are used to build models with minimal processing to reduce racial bias.
Methods: Using a public CDC database, we constructed models to predict healthcare access (a binary variable). With area under the curve (AUC) as our performance metric, we compared race-specific performance for each ML algorithm. We bootstrapped our entire analysis 20 times to produce confidence intervals for our AUC performance metrics.
Results: With few exceptions, performance for the White group was significantly higher than that for the other racial groups across all ML algorithms. Additionally, the best-performing algorithm in our modeling was Extreme Gradient Boosting (XGBoost), followed by random forest, naive Bayes, support vector machine (SVM), and k-nearest neighbors (KNN).
Conclusion: Our study illustrates the perils of incorporating only minimal racial bias mitigation in ML models, which results in predictive disparities by race. This is particularly concerning given evidence that bias mitigation in healthcare-related ML is limited. More conversation, research, and guidelines are needed on methods for racial bias assessment and mitigation in healthcare-related ML models, both those currently in use and those in development.
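The race-specific AUC comparison described in the Methods can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' code. The function names (`auc`, `auc_by_group`, `bootstrap_auc_by_group`) are hypothetical, and the percentile interval is an assumption, since the abstract does not say how the confidence intervals were derived. The study bootstrapped the entire analysis, including model fitting, whereas this sketch resamples only already-scored predictions.

```python
import random
from collections import defaultdict


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity, with average ranks for ties."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None  # AUC is undefined unless both classes are present
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def auc_by_group(labels, scores, groups):
    """Compute AUC separately within each group (e.g. each racial group)."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}


def bootstrap_auc_by_group(labels, scores, groups, n_boot=20, seed=0):
    """Resample rows with replacement n_boot times; summarize per-group AUC.

    Assumption: a percentile interval. With only 20 replicates the 2.5th and
    97.5th percentiles are effectively the minimum and maximum replicate.
    """
    rng = random.Random(seed)
    n = len(labels)
    replicates = defaultdict(list)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        result = auc_by_group(
            [labels[i] for i in idx],
            [scores[i] for i in idx],
            [groups[i] for i in idx],
        )
        for g, value in result.items():
            if value is not None:  # skip replicates where AUC is undefined
                replicates[g].append(value)
    summary = {}
    for g, values in replicates.items():
        values.sort()
        last = len(values) - 1
        summary[g] = {
            "mean": sum(values) / len(values),
            "lower": values[int(0.025 * last)],
            "upper": values[int(round(0.975 * last))],
        }
    return summary
```

Comparing the per-group intervals returned by `bootstrap_auc_by_group` for one algorithm gives a rough view of whether performance differs by group. Repeating this for each of the five algorithms would mirror the comparison the abstract describes.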
Barton Michael, Hamza Mahmoud, Guevel Borna
data science, health equity, healthcare technology, machine learning, racial bias