In BMC Ecology and Evolution
BACKGROUND : Hepatitis B virus (HBV) is one of the main causes of viral hepatitis and liver cancer. HBV integration into the host genome is a key step in virus-promoted malignant transformation.
RESULTS : An attention-based deep learning model, DeepHBV, was developed to predict HBV integration sites. DeepHBV learns local genomic features automatically and was trained and tested using HBV integration site data from the dsVIS database. Initially, DeepHBV showed an AUROC of 0.6363 and an AUPR of 0.5471 on this dataset. Adding genomic features from repeat peaks and from TCGA Pan-Cancer peaks significantly improved model performance, with AUROCs of 0.8378 and 0.9430 and AUPRs of 0.7535 and 0.9310, respectively. Transcription factor binding sites (TFBS) were significantly enriched near the genomic positions that were considered. The binding sites of the AR-halfsite, Arnt, Atf1, bHLHE40, bHLHE41, BMAL1, CLOCK, c-Myc, COUP-TFII, E2A, EBF1, Erra, and Foxo3 were highlighted by DeepHBV in both the dsVIS and VISDB datasets, revealing a novel integration preference for HBV.
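For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUROC and an AUPR can be computed from binary labels and model scores. It is not the authors' evaluation code. The function names `auroc` and `average_precision` and the toy labels and scores are hypothetical, and average precision is used here as one common estimator of AUPR; the abstract does not state which estimator was used.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (ties count as half). Quadratic in sample size, fine for a demo."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values measured at the rank
    of each positive sample. Assumption: tied scores are ordered arbitrarily."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return precision_sum / hits


# Hypothetical toy data: 1 = integration site, 0 = background sequence.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(f"AUROC: {auroc(toy_labels, toy_scores):.4f}")
print(f"AUPR (average precision): {average_precision(toy_labels, toy_scores):.4f}")
```

An AUROC of 0.5 corresponds to random ranking, whereas the chance level for AUPR equals the fraction of positive samples, so the two values are not directly comparable to each other.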
CONCLUSIONS : DeepHBV is a useful tool for predicting HBV integration sites, revealing novel insights into HBV integration-related carcinogenesis.
Wu Canbiao, Guo Xiaofang, Li Mengyuan, Shen Jingxian, Fu Xiayu, Xie Qingyu, Hou Zeliang, Zhai Manman, Qiu Xiaofan, Cui Zifeng, Xie Hongxian, Qin Pengmin, Weng Xuchu, Hu Zheng, Liang Jiuxing
Bioinformatics, Deep learning, Genomic features, HBV integration sites