In bioRxiv: the preprint server for biology
Enhancers and promoters are considered to be bound by a small set of transcription factors (TFs) in a sequence-specific manner. This assumption has come under increasing skepticism as ChIP-seq datasets have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with seemingly no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used 1,003 TF ChIP-seq datasets in HepG2, K562, and H1 cells to analyze the patterns of ChIP-seq peak co-occurrence combined with functional genomics datasets. We identified 43,891 HOT loci forming at promoter (53%) and enhancer (47%) regions and determined that HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in extremely tissue-specific processes. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of them being ultraconserved regions. Sequence-based classification of HOT loci using deep learning suggests that their formation is driven by sequence features, which also correlate with the density of ChIP-seq peaks. Grouping TFs by their affinity for binding promoters and enhancers, we detected five distinct clusters of TFs that form the core of the HOT loci. We also observed that HOT loci are enriched in 3D chromatin hubs and disease-causal variants. In a challenge to the classical model of enhancer activity, we report an abundance of HOT loci in the human genome and a commitment of 51% of all ChIP-seq binding events to HOT locus formation. We propose a model of HOT locus formation based on the existence of large transcriptional condensates.
Hudaiberdiev Sanjarbek, Ovcharenko Ivan