In Statistics in Medicine
Spectral graph convolutional neural networks (GCNs) have been proposed to incorporate important information contained in graphs such as gene networks. A standard spectral GCN uses only one gene network to describe the relationships among genes. In genomic applications, however, gene function and regulation are condition- or tissue-specific, so multiple gene networks may be available, and it is unclear how to apply GCNs to disease classification with multiple networks. Moreover, which gene networks provide more effective prior information for a given learning task is unknown a priori and, in many cases, not straightforward to discover. A deep multiple graph convolutional neural network is therefore developed here to meet this challenge. The new approach not only computes the feature of a gene as the weighted average of the features of the gene itself and its neighbors through spectral GCNs, but also extracts features from gene-specific expression (or other feature) profiles via a feed-forward neural network (FNN). We also provide two measures, the importance of a given gene and the relative importance score of each gene network, to quantify the contributions of genes and gene networks, respectively, to the learning task. To evaluate the new method, we conduct real data analyses using several breast cancer and diffuse large B-cell lymphoma datasets, incorporating multiple gene networks obtained from "GIANT 2.0". Compared with the standard FNN, GCN, and random forest, the new method not only yields high classification accuracy but also prioritizes the most important genes confirmed to be highly associated with cancer, strongly suggesting the usefulness of the new method in incorporating multiple gene networks.
Yang Hu, Zhuang Zhong, Pan Wei
Laplacian, deep learning, feed-forward neural network, gene expression data, spectral graph theory
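The core idea in the abstract, averaging a gene's features with those of its neighbors in each of several networks while also passing the gene's own features through an FNN-style branch, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the row-normalized adjacency with self-loops, the softmax weighting over networks, the function names, and the toy adjacency matrices are all assumptions made for illustration, and the abstract does not specify how the relative importance score of a network is actually computed.

```python
import numpy as np


def row_normalize_with_self_loops(adj):
    """Return D^{-1}(A + I); each row sums to 1, so multiplying by it
    averages a gene's features with those of its neighbors."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def multi_graph_layer(x, adjs, w_graph, w_self, net_logits):
    """One hypothetical layer combining several gene networks.

    x          : (genes, features) input features
    adjs       : list of (genes, genes) adjacency matrices, one per network
    w_graph    : list of (features, hidden) weights, one per network
    w_self     : (features, hidden) weights of the network-free branch
    net_logits : (networks,) scores turned into network weights by softmax
    """
    alpha = softmax(net_logits)  # assumed stand-in for per-network weights
    graph_part = sum(
        a_k * (row_normalize_with_self_loops(adj) @ x @ w)
        for a_k, adj, w in zip(alpha, adjs, w_graph)
    )
    self_part = x @ w_self  # uses only each gene's own features
    return np.maximum(graph_part + self_part, 0.0), alpha  # ReLU


# Toy example: 4 genes, 3 features, 2 hidden units, 2 made-up networks.
rng = np.random.default_rng(0)
adjs = [
    np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float),
    np.array([[0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 0], [1, 1, 0, 0]], dtype=float),
]
x = rng.normal(size=(4, 3))
w_graph = [rng.normal(size=(3, 2)) for _ in adjs]
w_self = rng.normal(size=(3, 2))
h, alpha = multi_graph_layer(x, adjs, w_graph, w_self, np.zeros(len(adjs)))
```

With `net_logits` set to zeros, both toy networks receive equal weight; in a trained model these scores would be learned, and comparing them is one plausible way to rank networks, though the paper's own measure may differ.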