In BMC Bioinformatics
BACKGROUND : Predicting disease-related genes helps in understanding disease pathology and the molecular mechanisms underlying disease progression. However, traditional methods are poorly suited to screening genes related to disease development, because disease datasets contain samples with only weak label information and only a small number of genes are known to be disease-related.
RESULTS : We designed a disease-related gene mining method based on a weakly supervised learning model. The method has two steps. First, differentially expressed genes are screened using the weakly supervised learning model, which makes full use of both the strong and the weak label information at different stages of disease progression. Once the algorithm converges, the resulting set of differentially expressed genes is stable and complete. Second, we screen disease-related genes within this set using a transductive support vector machine based on the difference kernel function. The difference kernel function maps the input space of the original Huntington's disease gene expression dataset to a difference space. In the difference space, the relation between two genes can be evaluated more accurately and the information about known disease-related genes can be used effectively.
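The abstract does not define the difference kernel function, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that the "difference space" of a gene is the vector of pairwise expression differences between samples from two disease stages, and that similarity in that space is measured with an RBF kernel. The names `difference_features`, `difference_rbf_kernel`, `group_a`, `group_b`, and `gamma` are hypothetical and introduced for illustration.

```python
import math
from itertools import product


def difference_features(expr, group_a, group_b):
    """Map one gene's expression profile to an assumed 'difference space'.

    expr    : list of expression values, one per sample
    group_a : sample indices for one stage (e.g. an earlier stage)
    group_b : sample indices for another stage (e.g. a later stage)

    Returns every pairwise difference expr[j] - expr[i] with i in group_a
    and j in group_b. This mapping is an assumption, not the paper's.
    """
    return [expr[j] - expr[i] for i, j in product(group_a, group_b)]


def difference_rbf_kernel(x, y, group_a, group_b, gamma=0.1):
    """RBF kernel evaluated on the difference features of two genes."""
    dx = difference_features(x, group_a, group_b)
    dy = difference_features(y, group_a, group_b)
    squared_distance = sum((a - b) ** 2 for a, b in zip(dx, dy))
    return math.exp(-gamma * squared_distance)


# Toy profiles over four samples: samples 0-1 form one stage, 2-3 another.
group_a, group_b = [0, 1], [2, 3]
gene_1 = [1.0, 1.0, 3.0, 3.0]  # rises by 2 between stages
gene_2 = [5.0, 5.0, 7.0, 7.0]  # same rise, different baseline
gene_3 = [1.0, 1.0, 1.0, 1.0]  # no change between stages

k_same_change = difference_rbf_kernel(gene_1, gene_2, group_a, group_b)
k_diff_change = difference_rbf_kernel(gene_1, gene_3, group_a, group_b)
```

Under this assumed mapping, two genes with the same stage-to-stage change are treated as maximally similar even when their baseline expression differs, which is one way a difference space could make gene-to-gene relations easier to evaluate. The transductive support vector machine itself is not shown here; a kernel like this would typically be supplied to such a classifier as a precomputed Gram matrix.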
CONCLUSIONS : The experimental results show that the disease-related gene mining method based on the weakly supervised learning model improves the precision of disease-related gene prediction compared with other well-performing methods.
Zhang Han, Huo Xueting, Guo Xia, Su Xin, Quan Xiongwen, Jin Chen
Differentially expressed genes, Disease-related genes, The difference kernel function, Transductive support vector machine, Weakly supervised learning model