In Bioinformatics (Oxford, England)

MOTIVATION : Human microbes are closely involved in a wide variety of complex human diseases and have become new drug targets. In silico methods for identifying potential microbe-drug associations provide an effective complement to conventional experimental methods: they can not only aid the screening of candidate compounds for drug development but also facilitate knowledge discovery for understanding microbe-drug interaction mechanisms. Meanwhile, the recent increase in available biomedical data for microbes and drugs provides a great opportunity for machine learning approaches to predict microbe-drug associations. We are thus motivated to integrate these data sources to improve prediction accuracy. In addition, it is extremely challenging to predict interactions for new drugs or new microbes, which have no existing microbe-drug associations.

RESULTS : In this work, we leverage various sources of biomedical information and construct multiple networks (graphs) for microbes and drugs. Then, we develop a novel ensemble framework of graph attention networks with a hierarchical attention mechanism, denoted as EGATMDA, for microbe-drug association prediction from the multiple constructed microbe-drug graphs. In particular, for each input graph, we design a graph convolutional network with node-level attention to learn embeddings for nodes (i.e. microbes and drugs). To effectively aggregate node embeddings from multiple input graphs, we implement graph-level attention to learn the importance of different input graphs. Experimental results under different cross-validation settings (e.g. the setting for predicting associations for new drugs) showed that our proposed method outperformed seven state-of-the-art methods. Case studies on predicted microbe-drug associations further demonstrated the effectiveness of our proposed EGATMDA method.
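The two attention levels described above can be illustrated with a minimal NumPy sketch. It is not the EGATMDA implementation: the node-level scoring follows the generic graph attention form, the graph-level scoring is one common choice (a shared query vector applied to a tanh projection), and every name, dimension and random input below is an assumption made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def node_level_attention(H, A, W, a, slope=0.2):
    """Generic GAT-style layer (assumed form, not the paper's exact layer).

    H: (N, f) node features, A: (N, N) adjacency, W: (f, d), a: (2d,).
    """
    A = np.maximum(A, np.eye(A.shape[0]))   # add self-loops so no row is empty
    Z = H @ W                               # projected features, (N, d)
    d = Z.shape[1]
    # e[i, j] = LeakyReLU(a[:d] . Z[i] + a[d:] . Z[j])
    e = (Z @ a[:d])[:, None] + (Z @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)
    e = np.where(A > 0, e, -np.inf)         # mask non-neighbours
    alpha = softmax(e, axis=1)              # attention over each node's neighbours
    return alpha @ Z, alpha

def graph_level_attention(Zs, Wg, b, q):
    """Weight each graph's embeddings by a learned importance (assumed form)."""
    scores = np.array([np.mean(np.tanh(Z @ Wg + b) @ q) for Z in Zs])
    beta = softmax(scores)                  # one weight per input graph
    fused = sum(w * Z for w, Z in zip(beta, Zs))
    return fused, beta

# Hypothetical toy setup: N nodes (microbes and drugs together), two input graphs.
rng = np.random.default_rng(0)
N, f, d, h = 5, 4, 3, 6
H = rng.normal(size=(N, f))
graphs = []
for _ in range(2):
    A = (rng.random((N, N)) < 0.4).astype(float)
    graphs.append(np.maximum(A, A.T))       # symmetric adjacency

# Randomly initialised parameters stand in for trained weights.
embeddings, alphas = [], []
for A in graphs:
    W, a = rng.normal(size=(f, d)), rng.normal(size=2 * d)
    Z, alpha = node_level_attention(H, A, W, a)
    embeddings.append(Z)
    alphas.append(alpha)

Wg, b, q = rng.normal(size=(d, h)), np.zeros(h), rng.normal(size=h)
fused, beta = graph_level_attention(embeddings, Wg, b, q)
print("graph weights:", beta)
print("fused embedding shape:", fused.shape)
```

In this sketch each graph gets its own node-level parameters while the graph-level parameters are shared, so the graph weights in `beta` are directly comparable across input graphs. A trained model would learn all of these parameters from known associations rather than sampling them.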

AVAILABILITY : Source code and supplementary materials are available at:

SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics online.

Long Yahui, Wu Min, Liu Yong, Kwoh Chee Keong, Luo Jiawei, Li Xiaoli