Chest radiography has been a recommended procedure for patient triaging and
resource management in intensive care units (ICUs) throughout the COVID-19
pandemic. Machine learning efforts to augment this workflow have long been
hindered by deficiencies in reporting, model evaluation, and failure mode
analysis. To address some of these shortcomings, we model radiological features
with a human-interpretable class hierarchy that aligns with the radiological
decision process. We also propose a data-driven error analysis
methodology to uncover the blind spots of our model, providing further
transparency on its clinical utility. For example, our experiments show that
model failures correlate strongly with ICU imaging conditions and with the
inherent difficulty in distinguishing certain types of radiological features.
Furthermore, our hierarchical interpretation and analysis facilitate comparison
with radiologists' findings and inter-observer variability, which in turn
helps us better assess the clinical applicability of models.

Shruthi Bannur, Ozan Oktay, Melanie Bernhardt, Anton Schwaighofer, Rajesh Jena, Besmira Nushi, Sharan Wadhwani, Aditya Nori, Kal Natarajan, Shazad Ashraf, Javier Alvarez-Valle, Daniel C. Castro