In Methods (San Diego, Calif.)
In recent years, accumulating studies have shown that long non-coding RNAs (lncRNAs) not only play an important role in regulating various biological processes but also provide a foundation for understanding the mechanisms of human diseases. Because traditional biological experiments are costly, the number of experimentally verified lncRNA-disease associations is very limited. Many computational approaches have therefore been proposed to discover the underlying associations between lncRNAs and diseases. However, these associations are too complicated to model with traditional matrix factorization-based methods alone. In this study, we propose a hybrid computational framework (SDLDA) for lncRNA-disease association prediction. The framework uses singular value decomposition and deep learning to extract linear and non-linear features of lncRNAs and diseases, respectively. We then train SDLDA by combining the linear and non-linear features. The two kinds of features reinforce each other, and their combination performs better than either matrix factorization or deep learning alone. The computational results show that SDLDA outperforms existing methods in leave-one-out cross-validation. Furthermore, the case studies show that 28 out of 30 cancer-related lncRNAs (10 for gastric cancer, 10 for colon cancer and 8 for renal cancer) are verified by mining recent biomedical literature. Code and data can be accessed at https://github.com/CSUBioGroup/SDLDA.
Zeng Min, Lu Chengqian, Zhang Fuhao, Li Yiming, Wu Fang-Xiang, Li Yaohang, Li Min
deep learning, linear feature, lncRNA-disease association prediction, non-linear feature, singular value decomposition
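
The feature combination described in the abstract can be illustrated with a minimal NumPy sketch. It is not the SDLDA implementation (see the repository linked above for that). It assumes a small, made-up binary lncRNA-disease association matrix, takes linear features from a truncated singular value decomposition, and uses a single untrained dense layer with tanh as a stand-in for the deep learning encoder. The function names, the toy matrix, the rank, and the hidden size are all hypothetical choices made for illustration.

```python
import numpy as np

def linear_features(assoc, rank):
    """Truncated SVD of the association matrix (rows: lncRNAs, columns: diseases)."""
    u, s, vt = np.linalg.svd(assoc, full_matrices=False)
    root = np.sqrt(s[:rank])
    lnc = u[:, :rank] * root        # one linear feature row per lncRNA
    dis = vt[:rank, :].T * root     # one linear feature row per disease
    return lnc, dis

def nonlinear_features(profile, weights, bias):
    """Stand-in for a trained deep encoder: one dense layer followed by tanh."""
    return np.tanh(profile @ weights + bias)

def pair_features(assoc, i, j, rank, hidden, seed=0):
    """Concatenate linear and non-linear features for lncRNA i and disease j."""
    rng = np.random.default_rng(seed)
    n_lnc, n_dis = assoc.shape
    lnc_lin, dis_lin = linear_features(assoc, rank)
    # Random, untrained weights (assumption): a real model would learn these.
    w_lnc = rng.normal(scale=0.1, size=(n_dis, hidden))
    w_dis = rng.normal(scale=0.1, size=(n_lnc, hidden))
    bias = np.zeros(hidden)
    lnc_non = nonlinear_features(assoc, w_lnc, bias)
    dis_non = nonlinear_features(assoc.T, w_dis, bias)
    return np.concatenate([lnc_lin[i], dis_lin[j], lnc_non[i], dis_non[j]])

# Toy association matrix: 4 lncRNAs x 3 diseases (made-up values).
assoc = np.array([[1, 0, 1],
                  [0, 1, 0],
                  [1, 1, 0],
                  [0, 0, 1]], dtype=float)

vec = pair_features(assoc, i=0, j=2, rank=2, hidden=3)
print(vec.shape)
```

In a full pipeline, a vector like `vec` would be passed to a classifier trained on known associations to score the lncRNA-disease pair. That training step is omitted here.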