In European radiology ; h5-index 62.0
OBJECTIVES : To evaluate deep neural networks for automatic rib fracture detection on thoracic CT scans and to compare their performance with that of attending-level radiologists using large datasets from multiple medical institutions.
METHODS : In this retrospective study, an internal dataset of 12,208 emergency room (ER) trauma patients and an external dataset of 1,613 ER trauma patients who underwent chest CT were included. Two cascaded deep neural networks based on an extended U-Net architecture were developed to segment ribs and detect rib fractures, respectively. Model performance was evaluated with 95% confidence intervals (CIs) on both the internal and external datasets and compared with attending-level radiologist readings using a t test.
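The two-stage cascade described in METHODS can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not describe how voxel-level outputs are turned into per-rib results, so taking the maximum fracture probability within each rib is an assumption, and `per_rib_scores`, `toy_rib_model`, and `toy_fracture_model` are hypothetical names that stand in for the trained U-Net-style networks.

```python
from typing import Callable, Dict

import numpy as np


def per_rib_scores(
    ct: np.ndarray,
    rib_model: Callable[[np.ndarray], np.ndarray],
    fracture_model: Callable[[np.ndarray, np.ndarray], np.ndarray],
) -> Dict[int, float]:
    """Run the two stages in sequence and return one score per rib.

    Stage 1 returns an integer label volume (0 = background, k = rib k).
    Stage 2 returns a voxel-wise fracture probability, restricted to ribs.
    Aggregating by the per-rib maximum is an assumption for illustration.
    """
    rib_labels = rib_model(ct)
    fracture_prob = fracture_model(ct, rib_labels > 0)
    scores: Dict[int, float] = {}
    for rib_id in np.unique(rib_labels):
        if rib_id == 0:  # skip background
            continue
        scores[int(rib_id)] = float(fracture_prob[rib_labels == rib_id].max())
    return scores


# Hypothetical stand-ins for the two trained networks (toy logic only).
def toy_rib_model(ct: np.ndarray) -> np.ndarray:
    labels = np.zeros(ct.shape, dtype=int)
    labels[0, :, :] = 1  # pretend slice 0 is rib 1
    labels[2, :, :] = 2  # pretend slice 2 is rib 2
    return labels


def toy_fracture_model(ct: np.ndarray, rib_mask: np.ndarray) -> np.ndarray:
    prob = np.where(ct > 0.5, 0.9, 0.1)  # bright voxels look "fractured"
    return prob * rib_mask  # zero probability outside the ribs


# Synthetic volume with one bright voxel inside rib 2.
ct_volume = np.zeros((3, 4, 4))
ct_volume[2, 1, 1] = 1.0
scores = per_rib_scores(ct_volume, toy_rib_model, toy_fracture_model)
```

Thresholding the per-rib scores would give the per-rib decisions from which sensitivity and specificity are computed, while the continuous scores are what an AUC is computed from.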
RESULTS : On the internal dataset, the AUC of the model for detecting fractures at the per-rib level was 0.970 (95% CI: 0.968, 0.972), with a sensitivity of 93.3% (95% CI: 92.0%, 94.4%) at a specificity of 98.4% (95% CI: 98.3%, 98.5%). On the external dataset, the model obtained an AUC of 0.943 (95% CI: 0.941, 0.945), with a sensitivity of 86.2% (95% CI: 85.0%, 87.3%) at a specificity of 98.8% (95% CI: 98.7%, 98.9%), compared with a sensitivity of 70.5% (95% CI: 69.3%, 71.8%) (p < 0.0001) and a specificity of 98.8% (95% CI: 98.7%, 98.9%) (p = 0.175) for attending radiologists.
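For readers who want to see how per-rib sensitivity and specificity with 95% CIs of this kind can be derived from a confusion matrix, here is a minimal sketch. The abstract does not state which interval method was used, so the Wilson score interval is an assumption, and the counts below are purely illustrative and are not taken from the study.

```python
import math
from typing import Tuple


def wilson_ci(successes: int, total: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, its CI) and (specificity, its CI) from per-rib counts."""
    sens = tp / (tp + fn)  # fractured ribs correctly flagged
    spec = tn / (tn + fp)  # intact ribs correctly cleared
    return (sens, wilson_ci(tp, tp + fn)), (spec, wilson_ci(tn, tn + fp))


# Illustrative counts only (NOT from the paper).
(sens, sens_ci), (spec, spec_ci) = sensitivity_specificity(tp=90, fn=10, tn=950, fp=50)
print(f"sensitivity {sens:.3f}, 95% CI ({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity {spec:.3f}, 95% CI ({spec_ci[0]:.3f}, {spec_ci[1]:.3f})")
```

Because intact ribs far outnumber fractured ribs in a per-rib analysis, the specificity interval is typically much narrower than the sensitivity interval, which matches the pattern in the reported results.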
CONCLUSIONS : The proposed deep learning (DL) model is a feasible approach for identifying rib fractures on chest CT scans, performing at least on par with attending-level radiologists.
KEY POINTS : • Deep learning-based algorithms automatically detected rib fractures with high sensitivity and reasonable specificity on chest CT scans. • The deep learning-based algorithms reached diagnostic measures comparable to those of attending-level radiologists for rib fracture detection on chest CT scans. • The deep learning models, like human readers, were susceptible to the inconspicuity and ambiguity of target lesions. More training data are required for subtle lesions to achieve comparable detection performance.
Wang Shuhao, Wu Dijia, Ye Lifang, Chen Zirong, Zhan Yiqiang, Li Yuehua
Deep learning, Retrospective studies, Rib fractures