In PLoS Computational Biology
DNA topoisomerase II-β (TOP2B) is essential for resolving the topological problems associated with DNA metabolism and 3D chromatin architecture, but its cut-and-reseal catalytic mechanism can accidentally generate DNA double-strand breaks (DSBs) that seriously compromise genome integrity. Understanding the factors that determine the genome-wide distribution of TOP2B is therefore essential not only for a complete picture of genome dynamics and organization, but also for clarifying the role of TOP2-induced DSBs in the origin of oncogenic translocations and other types of chromosomal rearrangements. Here, we apply a machine-learning approach to predict TOP2B binding from publicly available sequencing data. We achieve highly accurate predictions, with accessible chromatin and architectural factors being the most informative features. Strikingly, TOP2B binding is sufficiently explained by only three features, DNase I hypersensitivity, CTCF binding and cohesin binding, for which genome-wide data are widely available. Based on this, we develop a predictive model of genome-wide TOP2B binding that can be used across cell lines and species, and generate virtual probability tracks that accurately mirror experimental ChIP-seq data. Our results deepen our understanding of how the accessibility and 3D organization of chromatin determine TOP2B function, and constitute a proof of principle for the in silico prediction of sequence-independent chromatin-binding factors.
Martínez-García Pedro Manuel, García-Torres Miguel, Divina Federico, Terrón-Bautista José, Delgado-Sainz Irene, Gómez-Vela Francisco, Cortés-Ledesma Felipe
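
As an illustration of the three-feature idea described in the abstract, the following is a minimal sketch of a binary classifier that turns DNase I, CTCF and cohesin signals into a per-region binding probability. It is not the authors' implementation: the abstract does not name the learning algorithm, so plain logistic regression is used here as a stand-in, and the feature names, the function names and the training data (randomly generated, not real sequencing signal) are all assumptions made for the example.

```python
import math
import random

# Hypothetical feature names; the abstract lists DNase I hypersensitivity,
# CTCF binding and cohesin binding as the three informative features.
FEATURES = ("dnase", "ctcf", "cohesin")


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_regions(n, seed=0):
    """Generate SYNTHETIC regions (not real data).

    Assumption for illustration only: bound regions have higher
    standardized signal in all three features than unbound regions.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        bound = rng.random() < 0.5
        center = 1.0 if bound else -1.0
        X.append([rng.gauss(center, 1.0) for _ in FEATURES])
        y.append(1 if bound else 0)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Return the predicted binding probability for one region."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


if __name__ == "__main__":
    X_train, y_train = make_synthetic_regions(400, seed=1)
    w, b = fit_logistic(X_train, y_train)
    for name, weight in zip(FEATURES, w):
        print(f"{name}: weight {weight:.2f}")
    # A "virtual probability track" is then one probability per region.
    X_new, _ = make_synthetic_regions(5, seed=2)
    print([round(predict_proba(w, b, x), 3) for x in X_new])
```

Applying `predict_proba` to consecutive genomic bins would yield one value per bin, which is the sense in which a fitted model can produce a probability track. With real data, the inputs would be signal values derived from DNase-seq and ChIP-seq experiments rather than random draws.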