In International Journal of Chronic Obstructive Pulmonary Disease; h5-index 50.0
Background : Chronic obstructive pulmonary disease (COPD), the third leading cause of death worldwide, is often underdiagnosed.
Purpose : To develop machine learning methods to predict COPD using chest radiographs and a convolutional neural network (CNN) trained with near-concurrent pulmonary function test (PFT) data. Comparison is made to natural language processing (NLP) of the associated radiologist text reports.
Materials and Methods : This IRB-approved, single-institution retrospective study uses 6749 two-view chest radiograph exams (2012-2017, 4436 unique subjects, 54% female, 46% male), the same-day radiologist text reports associated with them, and PFT exams acquired within 180 days. The Image Model (a ResNet18 CNN pre-trained on ImageNet) is trained using frontal and lateral radiographs and PFTs, with 10% of the subjects held out for validation and 19% for testing. The NLP Model is trained using radiologist text reports and PFTs. The primary metric for model comparison is the area under the receiver operating characteristic curve (AUC).
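Because the 6749 exams come from 4436 subjects, holding out subjects rather than individual exams keeps one person's radiographs from landing in more than one set. The following is a minimal sketch of such a subject-level split, not the authors' code: the function name `split_by_subject`, the random shuffle, and the seed are assumptions for illustration, and only the 10% and 19% fractions come from the abstract.

```python
import random

def split_by_subject(exam_subject_ids, val_frac=0.10, test_frac=0.19, seed=0):
    """Assign each exam to 'train', 'val', or 'test' by its subject ID.

    Hypothetical helper: every exam of a given subject receives the same
    assignment, so no subject is shared between sets.
    """
    subjects = sorted(set(exam_subject_ids))
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    rng.shuffle(subjects)

    n_val = round(len(subjects) * val_frac)
    n_test = round(len(subjects) * test_frac)
    val_subjects = set(subjects[:n_val])
    test_subjects = set(subjects[n_val:n_val + n_test])

    assignment = []
    for sid in exam_subject_ids:
        if sid in val_subjects:
            assignment.append("val")
        elif sid in test_subjects:
            assignment.append("test")
        else:
            assignment.append("train")
    return assignment

# Toy input: 150 exams from 100 made-up subjects (some subjects have two exams).
toy_exam_subjects = [i % 100 for i in range(150)]
toy_assignment = split_by_subject(toy_exam_subjects)
```

Splitting on exams instead would let repeat exams of one subject appear in both training and test data, which tends to inflate test performance.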
Results : For prediction of obstructive lung disease (FEV1/FVC <0.7, where FEV1 = forced expiratory volume in 1 second and FVC = forced vital capacity), the Image Model achieves an AUC of 0.814 from chest radiographs and outperforms the NLP Model, which achieves an AUC of 0.704 from radiologist text reports (p<0.001). For prediction of severe or very severe COPD (FEV1 <0.5), the Image Model also outperforms the NLP Model, with an AUC of 0.837 versus 0.770 (p<0.001).
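To make the two quantities in the results concrete, the sketch below derives the binary obstruction label from the FEV1/FVC <0.7 criterion and computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The function names and all PFT values and scores are invented for illustration and are not study data. The abstract does not say how the AUC or the p-values were computed, so this is only one standard formulation.

```python
def obstruction_label(fev1_liters, fvc_liters, threshold=0.7):
    """Return 1 if FEV1/FVC is below the threshold (obstruction), else 0."""
    return int(fev1_liters / fvc_liters < threshold)

def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up (FEV1, FVC) pairs in liters and made-up model scores.
toy_pfts = [(1.5, 3.0), (3.2, 4.0), (2.0, 3.5), (3.0, 3.6)]
toy_labels = [obstruction_label(fev1, fvc) for fev1, fvc in toy_pfts]
toy_scores = [0.9, 0.4, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of cases, which is fine for a toy example; a rank-based implementation would be the usual choice on a test set of realistic size.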
Conclusion : A CNN Image Model trained on physiologic lung function data (PFTs) can be applied to chest radiographs for quantitative prediction of obstructive lung disease with good accuracy.
Schroeder Joyce D, Bigolin Lanfredi Ricardo, Li Tao, Chan Jessica, Vachet Clement, Paine III Robert, Srikumar Vivek, Tasdizen Tolga
chronic obstructive pulmonary disease, machine learning, natural language processing, quantitative image analysis