ArXiv Preprint
Auditing machine learning (ML)-based healthcare tools for bias is critical to
preventing patient harm, especially in communities that disproportionately face
health inequities. General frameworks are becoming increasingly available to
measure ML fairness gaps between groups. However, ML for health (ML4H) auditing
principles call for a contextual, patient-centered approach to model
assessment. Therefore, ML auditing tools must be (1) better aligned with ML4H
auditing principles and (2) able to illuminate and characterize communities
vulnerable to the most harm. To address this gap, we propose supplementing ML4H
auditing frameworks with SLOGAN (patient Severity-based LOcal Group biAs
detectioN), an automatic tool for capturing local biases in a clinical
prediction task. SLOGAN adapts an existing tool, LOGAN (LOcal Group biAs
detectioN), by contextualizing group bias detection in patient illness severity
and past medical history. We investigate and compare SLOGAN's bias detection
capabilities to LOGAN and other clustering techniques across patient subgroups
in the MIMIC-III dataset. On average, SLOGAN identifies larger fairness
disparities than LOGAN in over 75% of patient groups while maintaining
clustering quality. Furthermore, in a diabetes case study, health disparity
literature corroborates the characterizations of the most biased clusters
identified by SLOGAN. Our results contribute to the broader discussion of how
machine learning biases may perpetuate existing healthcare disparities.
Anaelia Ovalle, Sunipa Dev, Jieyu Zhao, Majid Sarrafzadeh, Kai-Wei Chang
2022-11-16
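
The abstract describes local group bias detection as clustering patients and measuring fairness gaps within each cluster. The snippet below is a minimal sketch of that general idea, not the authors' SLOGAN implementation: the use of KMeans, the synthetic data, and all variable names (features, group, errors) are illustrative assumptions.

```python
# Minimal sketch of clustering-based local group bias detection,
# in the spirit of LOGAN/SLOGAN. The choice of KMeans, the synthetic
# data, and all names are illustrative assumptions, not the paper's code.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical inputs: patient feature vectors (which could also encode
# illness severity or medical history), a binary demographic group label,
# and per-patient model errors on a clinical prediction task.
n = 1000
features = rng.normal(size=(n, 8))
group = rng.integers(0, 2, size=n)      # 0/1 demographic group
errors = rng.random(n) < 0.2            # True where the model erred

# Step 1: cluster patients into local neighborhoods.
k = 10
clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)

# Step 2: within each cluster, compare error rates between groups and
# flag the clusters with the largest local fairness gaps.
local_gaps = []
for c in range(k):
    mask = clusters == c
    g0 = errors[mask & (group == 0)]
    g1 = errors[mask & (group == 1)]
    if len(g0) == 0 or len(g1) == 0:
        continue  # skip clusters missing one group
    local_gaps.append((c, abs(g0.mean() - g1.mean()), int(mask.sum())))

# Report the most biased clusters; in an audit these would then be
# characterized (e.g., by severity or past history) to identify who is harmed.
for c, gap, size in sorted(local_gaps, key=lambda t: -t[1])[:3]:
    print(f"cluster {c}: |error-rate gap| = {gap:.3f} over {size} patients")
```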