In European Journal of Nuclear Medicine and Molecular Imaging; h5-index 66.0
PURPOSE : Artificial intelligence (AI) has high diagnostic accuracy for coronary artery disease (CAD) from myocardial perfusion imaging (MPI). However, when models are trained on high-risk populations (such as patients with correlating invasive testing), disease probability can be overestimated because of selection bias. We evaluated different strategies for training AI models to improve calibration (accurate estimation of disease probability), and assessed them with external testing.
METHODS : A deep learning model was trained on 828 patients from 3 sites who had MPI and invasive angiography within 6 months. Perfusion was assessed using upright (U-TPD) and supine (S-TPD) total perfusion deficit. AI training without data augmentation (model 1) was compared to training with augmentation (increased sampling) of patients without obstructive CAD (model 2), and of patients without CAD and with TPD < 2% (model 3). All models were tested in an external population of patients with invasive angiography within 6 months (n = 332) or with a low likelihood of CAD (n = 179).
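The abstract describes augmentation only as "increased sampling" of low-risk patients, so the following is a minimal sketch of one way such oversampling could look, not the authors' implementation. The `Patient` record, the `oversample_low_risk` helper, the repetition `factor`, and the use of simple duplication are all assumptions made for illustration; only the "no obstructive CAD and TPD < 2%" criterion (model 3) comes from the text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record; field names are illustrative, not from the study.
    patient_id: int
    obstructive_cad: bool
    tpd_percent: float


def oversample_low_risk(patients, factor=2, tpd_threshold=2.0, seed=0):
    """Return a shuffled training list in which low-risk patients
    (no obstructive CAD and TPD below the threshold) appear `factor` times.

    `factor` is an assumed knob; the abstract does not state how much
    the low-risk group was up-sampled.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    augmented = []
    for p in patients:
        is_low_risk = (not p.obstructive_cad) and p.tpd_percent < tpd_threshold
        # Duplicate low-risk cases; keep every other case exactly once.
        augmented.extend([p] * (factor if is_low_risk else 1))
    random.Random(seed).shuffle(augmented)
    return augmented


if __name__ == "__main__":
    cohort = [
        Patient(1, True, 10.0),   # obstructive CAD: kept once
        Patient(2, False, 1.0),   # no CAD, TPD < 2%: up-sampled
        Patient(3, False, 5.0),   # no CAD but abnormal TPD: kept once
    ]
    train = oversample_low_risk(cohort, factor=3)
    print(len(train), sum(p.patient_id == 2 for p in train))
```

Raising the share of low-risk cases in training lowers the disease prevalence the model sees, which is the mechanism by which this kind of sampling could pull predicted probabilities down toward those of a less selected population.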
RESULTS : Model 3 achieved the best calibration (Brier score 0.104 vs 0.121, p < 0.01). Improvement in calibration was particularly evident in women (Brier score 0.084 vs 0.124, p < 0.01). In external testing (n = 511), the area under the receiver operating characteristic curve (AUC) was higher for model 3 (0.930) than for U-TPD (0.897) or S-TPD (0.900; p < 0.01 for both).
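The two metrics reported here measure different things: the Brier score is the mean squared difference between predicted probability and outcome (lower is better calibrated), while AUC depends only on how cases are ranked. The sketch below uses the standard definitions with made-up toy numbers, not study data, to show that inflating every probability can leave AUC unchanged while worsening the Brier score. The function names are illustrative.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def auc(scores, labels):
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example (not study data): one diseased patient among four.
    labels = [0, 0, 0, 1]
    reasonable = [0.10, 0.20, 0.30, 0.80]
    inflated = [0.50, 0.60, 0.70, 0.95]  # same ranking, overestimated risk

    for name, probs in [("reasonable", reasonable), ("inflated", inflated)]:
        print(name, "AUC:", auc(probs, labels),
              "Brier:", round(brier_score(probs, labels), 4))
```

Both sets of predictions rank the patients identically, so their AUC is the same; only the Brier score exposes the overestimation.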
CONCLUSION : Training with augmentation of low-risk patients can improve the calibration of AI models developed to identify patients with CAD, allowing more accurate assignment of disease probability. This is particularly important in lower-risk populations and in women, where overestimation of disease probability could significantly influence downstream patient management.
Miller Robert J H, Singh Ananya, Otaki Yuka, Tamarappoo Balaji K, Kavanagh Paul, Parekh Tejas, Hu Lien-Hsin, Gransar Heidi, Sharir Tali, Einstein Andrew J, Fish Mathews B, Ruddy Terrence D, Kaufmann Philipp A, Sinusas Albert J, Miller Edward J, Bateman Timothy M, Dorbala Sharmila, Di Carli Marcelo F, Liang Joanna X, Dey Damini, Berman Daniel S, Slomka Piotr J
Calibration, Deep learning, Diagnostic accuracy, Model training, Sex-specific analysis