In American Journal of Respiratory and Critical Care Medicine ; h5-index 108.0
RATIONALE : Cell-free DNA (cfDNA) analysis holds promise for early detection of lung cancer, which could improve patient survival. However, the detection sensitivity of previous cfDNA-based studies remains too low for clinical use, especially for early-stage tumors.
OBJECTIVES : To establish an accurate and affordable approach for early-stage lung cancer detection by integrating cfDNA fragmentomics and machine learning models.
METHODS : This study included 350 non-cancer and 432 cancer participants. The participants' plasma cfDNA samples were profiled by whole-genome sequencing. Multiple cfDNA features and machine learning models were compared in the training cohort to select an optimal model. Model performance was then evaluated in three validation cohorts.
MEASUREMENTS AND MAIN RESULTS : A stacked ensemble model integrating five cfDNA features and five machine learning algorithms, constructed in the training cohort (cancer: 113, healthy: 113), outperformed all models built on individual feature-algorithm combinations. This integrated model yielded sensitivities of 91.4% at 95.7% specificity for Validation I [area under the curve (AUC): 0.984], 84.7% at 98.6% specificity for Validation II (AUC: 0.987), and 92.5% at 94.2% specificity for Additional Validation (AUC: 0.974). The model's performance remained consistent when sequencing depth was reduced to 0.5× (AUC: 0.966-0.971). Furthermore, the model was sensitive in identifying early pathological features (83.2% sensitivity for stage I and 85.0% sensitivity for tumors <1 cm at the 0.66 cutoff).
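The sensitivity and specificity figures reported at a fixed score cutoff can be illustrated with a minimal sketch. It assumes the model outputs a cancer score between 0 and 1 and that a sample is called cancer when its score is at or above the cutoff; the abstract does not state the comparison direction, so this is an assumption. The function name `sensitivity_specificity` and the toy scores are hypothetical and are not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a score cutoff.

    labels: 1 = cancer, 0 = non-cancer.
    Assumption: a score >= cutoff is called cancer.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_cancer = score >= cutoff
        if label == 1:
            if predicted_cancer:
                tp += 1  # cancer correctly detected
            else:
                fn += 1  # cancer missed
        else:
            if predicted_cancer:
                fp += 1  # non-cancer wrongly flagged
            else:
                tn += 1  # non-cancer correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer and one non-cancer sample")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores for illustration only; not results from the study.
toy_scores = [0.95, 0.80, 0.70, 0.40, 0.10, 0.20, 0.30, 0.68]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=0.66)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cutoff generally trades sensitivity for specificity, which is why each sensitivity above is quoted together with the specificity at which it was measured.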
CONCLUSIONS : We have established a stacked ensemble model using cfDNA fragmentomics features and achieved superior sensitivity for detecting early-stage lung cancer, which could promote early diagnosis and benefit more patients.
Wang Siwei, Meng Fanchen, Li Ming, Bao Hua, Chen Xin, Zhu Meng, Liu Rui, Xu Xiuxiu, Yang Shanshan, Wu Xue, Shao Yang, Xu Lin, Yin Rong
2022-Nov-08
Early detection, Lung cancer, Machine learning, Whole genome sequencing, cell-free DNA