In IEEE Journal of Biomedical and Health Informatics

Protein is an essential macro-nutrient for performing a wide range of biochemical activities in living cells. A deeper understanding of proteins and their respective functions is key to understanding the biological regulation of cells. In this work, we present a novel multi-modal approach, named MultiPredGO, for predicting protein functions from two different kinds of information, namely the protein sequence and the protein secondary structure. Our contributions are threefold. Firstly, along with the protein sequence, we learn a feature representation from the protein structure. Secondly, we develop two different deep learning models that account for the characteristics of the underlying data patterns of the protein sequence and protein 3D structures. Thirdly, along with these two modalities, we also use protein interaction information to improve the performance of the proposed model in predicting protein functions. To extract features from the underlying modalities, we use several variants of the convolutional neural network. As the protein function classes depend on each other, we use a neuro-symbolic hierarchical classification model, which resembles the structure of the Gene Ontology (GO), to predict the dependent protein functions effectively. Finally, to validate the proposed method (MultiPredGO), we compare our results with several uni-modal approaches and with two well-known multi-modal protein function prediction approaches, namely INGA and DeepGO. Results show that the overall performance of the proposed approach in terms of accuracy, F-measure, precision and recall is better than that of the state-of-the-art methods. MultiPredGO attains average improvements of 13.05% and 30.87% over the best existing competing approach (DeepGO) for cellular component and molecular function, respectively.
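To illustrate why dependent function classes call for a hierarchy-aware classifier, the following is a minimal sketch, not the MultiPredGO model itself. It assumes a toy GO-like hierarchy with hypothetical term names and made-up scores, and shows one common way to make per-term scores consistent with the hierarchy: each ancestor term receives a score at least as high as any of its descendants, so a protein predicted to have a specific function is also predicted to have the more general ones.

```python
# Toy GO-like hierarchy: term -> list of parent terms.
# Term names and scores are hypothetical, chosen only for illustration.
PARENTS = {
    "GO:root": [],
    "GO:binding": ["GO:root"],
    "GO:catalytic": ["GO:root"],
    "GO:dna_binding": ["GO:binding"],
}


def propagate_scores(scores, parents):
    """Return scores where every ancestor is >= each of its descendants."""
    consistent = dict(scores)

    def lift(term, value):
        # Walk upward and raise each ancestor to at least `value`.
        for parent in parents.get(term, []):
            if consistent.get(parent, 0.0) < value:
                consistent[parent] = value
            lift(parent, value)

    for term, value in scores.items():
        lift(term, value)
    return consistent


def predict_terms(scores, threshold=0.5):
    """Select the terms whose score reaches the threshold."""
    return sorted(t for t, s in scores.items() if s >= threshold)


# Raw per-term scores, as an independent classifier might produce them.
raw_scores = {
    "GO:root": 0.9,
    "GO:binding": 0.4,      # inconsistent: lower than its child below
    "GO:catalytic": 0.2,
    "GO:dna_binding": 0.7,
}

consistent = propagate_scores(raw_scores, PARENTS)
predicted = predict_terms(consistent)
print(predicted)
```

In this toy case the raw scores would predict the specific term without its parent; after propagation the parent is lifted to the child's score, so the predicted set respects the hierarchy.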

Giri Swagarika Jaharlal, Dutta Pratik, Halani Parth, Saha Sriparna