In Briefings in bioinformatics
AlphaFold2 achieved a breakthrough in protein structure prediction through the end-to-end deep learning method, which can predict nearly all single-domain proteins at experimental resolution. However, the prediction accuracy of full-chain proteins is generally lower than that of single-domain proteins because of the incorrect interactions between domains. In this work, we develop an inter-domain distance prediction method, named DeepIDDP. In DeepIDDP, we design a neural network with attention mechanisms, where two new inter-domain features are used to enhance the ability to capture the interactions between domains. Furthermore, we propose a data enhancement strategy termed DPMSA, which is employed to deal with the absence of co-evolutionary information on targets. We integrate DeepIDDP into our previously developed domain assembly method SADA, termed SADA-DeepIDDP. Tested on a given multi-domain benchmark dataset, the accuracy of SADA-DeepIDDP inter-domain distance prediction is 11.3% and 21.6% higher than trRosettaX and trRosetta, respectively. The accuracy of the domain assembly model is 2.5% higher than that of SADA. Meanwhile, we reassemble 68 human multi-domain protein models with TM-score ≤ 0.80 from the AlphaFold protein structure database, where the average TM-score is improved by 11.8% after the reassembly by our method. The online server is at http://zhanglab-bioinf.com/DeepIDDP/.
Ge Fengqi, Peng Chunxiang, Cui Xinyue, Xia Yuhao, Zhang Guijun
2023-Mar-15
deep learning, domain assembly, inter-domain distance, multi-domain protein