In Briefings in bioinformatics
Detecting cancer signals in cell-free DNA (cfDNA) high-throughput sequencing data is emerging as a novel noninvasive cancer detection method. Due to the high cost of sequencing, it is crucial to make robust and precise predictions with low-depth cfDNA sequencing data. Here we propose a novel approach named DISMIR, which can provide ultrasensitive and robust cancer detection by integrating DNA sequence and methylation information in plasma cfDNA whole-genome bisulfite sequencing (WGBS) data. DISMIR introduces a new feature termed 'switching region' to define cancer-specific differentially methylated regions, which can enrich the cancer-related signal at read resolution. DISMIR applies a deep learning model to predict the source of every single read based on its DNA sequence and methylation state, and then predicts the risk that the plasma donor is suffering from cancer. DISMIR exhibited high accuracy and robustness in detecting hepatocellular carcinoma from plasma cfDNA WGBS data, even at ultralow sequencing depths. Further analysis showed that DISMIR tends to be insensitive to alterations in the methylation states of single CpG sites, which suggests that DISMIR could be resistant to the technical noise of WGBS. Together, these results show that DISMIR has the potential to be a precise and robust method for low-cost early cancer detection.
Li Jiaqi, Wei Lei, Zhang Xianglin, Zhang Wei, Wang Haochen, Zhong Bixi, Xie Zhen, Lv Hairong, Wang Xiaowo
cancer detection, cell-free DNA, deep learning, liquid biopsy, methylation
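
The two-stage idea in the abstract (score every read from its sequence and methylation state, then combine the read scores into a donor-level estimate) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function names `encode_read` and `estimate_tumor_fraction`, the five-channel input layout, the `'0'`/`'1'` methylation string, and the two-component mixture likelihood used for aggregation. The deep learning model itself is omitted; `read_scores` stands in for its per-read outputs.

```python
import numpy as np

BASES = "ACGT"


def encode_read(sequence, methylation):
    """Encode one read as a (length, 5) matrix.

    Channels 0-3 are a one-hot encoding of the base; channel 4 flags a
    methylated cytosine. This layout is a hypothetical choice.
    sequence:    string over A/C/G/T (other symbols stay all-zero)
    methylation: string of the same length, '1' = methylated, '0' = not
    """
    if len(sequence) != len(methylation):
        raise ValueError("sequence and methylation must have equal length")
    encoded = np.zeros((len(sequence), 5), dtype=np.float32)
    for i, (base, meth) in enumerate(zip(sequence, methylation)):
        if base in BASES:
            encoded[i, BASES.index(base)] = 1.0
        if meth == "1":
            encoded[i, 4] = 1.0
    return encoded


def estimate_tumor_fraction(read_scores, grid_size=1001):
    """Grid-search maximum-likelihood estimate of a mixture fraction.

    read_scores holds one value per read in [0, 1], interpreted as the
    model's probability that the read is tumor-derived. Each read is
    assumed to contribute theta * p + (1 - theta) * (1 - p) to the
    likelihood; this aggregation rule is an assumption of the sketch.
    """
    p = np.clip(np.asarray(read_scores, dtype=np.float64), 1e-6, 1 - 1e-6)
    thetas = np.linspace(0.0, 1.0, grid_size)
    # Rows index candidate fractions, columns index reads.
    per_read = thetas[:, None] * p[None, :] + (1 - thetas[:, None]) * (1 - p[None, :])
    log_likelihood = np.log(per_read).sum(axis=1)
    return float(thetas[np.argmax(log_likelihood)])
```

In use, each read in a candidate region would be passed through `encode_read`, scored by a trained classifier, and the resulting scores handed to `estimate_tumor_fraction` to obtain a single value per plasma sample that can be thresholded or ranked.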