In Briefings in Bioinformatics
Artificial intelligence (AI)-based drug design has demonstrated great potential to fundamentally change the pharmaceutical industry. Currently, a key issue in AI-based drug design is the construction of efficient, transferable molecular descriptors or fingerprints. Here, we present a hypergraph-based molecular topological representation, hypergraph-based (weighted) persistent cohomology (HPC/HWPC), and HPC/HWPC-based molecular fingerprints for machine learning models in drug design. Molecular structures and their atomic interactions are highly complicated and pose great challenges for efficient mathematical representation. We develop the first hypergraph-based topological framework to characterize detailed molecular structures and interactions at the atomic level. Inspired by the elegant path complex model, hypergraph-based embedded homology and persistent homology have recently been proposed. Building on these, we construct HPC/HWPC and use them to generate molecular descriptors for learning models in protein-ligand binding affinity prediction, one of the key steps in drug design. Our models are tested on the three most commonly used databases, PDBbind-v2007, PDBbind-v2013 and PDBbind-v2016, and outperform all existing machine learning models with traditional molecular descriptors. Our HPC/HWPC models have demonstrated great potential in AI-based drug design.
Liu Xiang, Wang Xiangjun, Wu Jie, Xia Kelin
drug design, hypergraph-based persistent cohomology, machine learning, molecular descriptor