General General

DeepA-RBPBS: A hybrid convolution and recurrent neural network combined with attention mechanism for predicting RBP binding site.

In Journal of biomolecular structure & dynamics

Inferring the binding sites of RNA-binding proteins (RBPs) is important for understanding the interactions between RBPs and their RNA targets and for deciphering the mechanisms of transcriptional regulation. However, experimental detection of RBP binding sites is still time-intensive and expensive. Algorithms based on machine learning can speed up detection of RBP binding sites. In this article, we propose a new deep learning method, DeepA-RBPBS, which uses RNA sequences and structural features to predict RBP binding sites. DeepA-RBPBS uses a convolutional neural network (CNN) and a bidirectional gated recurrent unit (BiGRU) to extract sequence and structural features without long-term dependence issues. It also uses an attention mechanism to enhance the contribution of key features. The comparison shows that DeepA-RBPBS performs better than state-of-the-art predictors. In tests on 31 CLIP-seq datasets covering 19 proteins, its MCC and AUC are 8% and 5% higher, respectively, than those of iDeepS, the latest deep learning-based method. We also apply DeepA-RBPBS to the target RNA data of RBPs related to diabetes (LIN28, RBFOX2, FTO, IGF2BP2, CELF1 and HuR). The results show that DeepA-RBPBS correctly predicted 41,693 samples, whereas iDeepS predicted 31,381 samples. Communicated by Ramaswamy H. Sarma.

Du Zhihua, Xiao Xiangdong, Uversky Vladimir N

2020-Dec-04

CLIP-seq, RNA-binding proteins, attention mechanism, deep learning
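
The abstract above reports the Matthews correlation coefficient (MCC) alongside AUC. As a reminder of what that metric measures, here is a minimal sketch that computes MCC from binary labels. The function name and the toy labels are illustrative assumptions; nothing here is taken from the DeepA-RBPBS code or data.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels coded as 0 (non-binding) and 1 (binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
mcc = matthews_corrcoef(y_true, y_pred)
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when binding and non-binding sites are imbalanced, which is one plausible reason to report it next to AUC.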

General General

Techniques assisting peptide vaccine and peptidomimetic design. Sidechain exposure in the SARS-CoV-2 spike glycoprotein.

In Computers in biology and medicine

The aim of the present study is to discuss the design of peptide vaccines and peptidomimetics against SARS-CoV-2, to develop and apply a method of protein structure analysis that is particularly appropriate to applying and discussing such design, and to use that method to summarize some important features of the SARS-CoV-2 spike protein sequence. A tool for assessing sidechain exposure in the SARS-CoV-2 spike glycoprotein is described. It extends to assessing accessibility of sidechains by considering several different three-dimensional structure determinations of the SARS-CoV-2 and SARS-CoV-1 spike proteins. The method is designed to be insensitive to the distance limit used for counting neighboring atoms, and the results are in good agreement with the physicochemical properties and exposure trends of the 20 naturally occurring sidechains. The spike protein sequence is analyzed with comments on its exposable character. The analysis includes studies of complexes with antibody elements and ACE2. These indicate changes in exposure at sites remote from those at which the antibody binds. They are of interest for the design of synthetic peptide vaccines and for peptidomimetics as a basis of drug discovery. The method was also developed to provide linear (one-dimensional) information that can be used along with other bioinformatics data of this kind in data mining and machine learning, potentially as genomic data regarding protein polymorphisms to be combined with more traditional clinical data.

Robson B

2020-Nov-21

Accessibility, COVID-19, Conformation, Coronavirus, Disorder, Exposure, Glycosylation, SARS-CoV-2, Spike glycoprotein
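
The abstract above mentions counting neighboring atoms within a distance limit. The sketch below shows only that naive baseline: a neighbor count within a fixed cutoff, where fewer neighbors suggests a more exposed atom. It is not the paper's method, which is described as insensitive to the cutoff; the function name, coordinates, and cutoff value are all hypothetical.

```python
import math


def neighbor_counts(coords, cutoff):
    """Count, for each atom, how many other atoms lie within `cutoff`.

    `coords` is a list of (x, y, z) tuples; units are whatever the
    structure file uses (typically angstroms).
    """
    counts = [0] * len(coords)
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


# Hypothetical coordinates: three atoms in a row plus one far away.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
counts = neighbor_counts(atoms, cutoff=1.5)
```

A count like this changes with the chosen cutoff, which illustrates why a method designed to be insensitive to that limit is attractive.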

General General

Recent advances in network-based methods for disease gene prediction.

In Briefings in bioinformatics

Disease-gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases requires statistical analysis of associations. Given the huge number of possible mutations, GWAS analysis is not only costly but also produces a large number of false positives. Thus, researchers search for more evidence to cross-check their results through different sources. Computational approaches come into play to provide researchers with alternative and complementary low-cost evidence of disease-gene association. Since molecular networks are able to capture the complex interplay among molecules in diseases, they have become one of the most extensively used data sources for disease-gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis of 14 state-of-the-art methods. To summarize, we first elucidate the task definition for disease gene prediction. Secondly, we categorize existing network-based efforts into network diffusion methods, traditional machine learning methods with handcrafted graph features, and graph representation learning methods. Thirdly, an empirical analysis is conducted to evaluate the performance of the selected methods across seven diseases. We also provide distinguishing findings about the discussed methods based on our empirical analysis. Finally, we highlight potential research directions for future studies on disease gene prediction.

Ata Sezin Kircali, Wu Min, Fang Yuan, Ou-Yang Le, Kwoh Chee Keong, Li Xiao-Li

2020-Dec-05

disease gene prediction, graph representation learning, network-based methods
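
To make the first category in the survey above concrete, here is a minimal sketch of random walk with restart, a commonly used network diffusion scheme for ranking candidate genes by proximity to known disease genes. The abstract does not say which diffusion methods were evaluated, so treat this as a generic illustration; the graph, node names, and restart probability are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Diffuse probability from seed nodes over an undirected graph.

    `adjacency` maps each node to a list of its neighbors (every neighbor
    must also be a key). `seeds` are known disease genes. Returns a dict of
    steady-state visiting probabilities; higher means closer to the seeds.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            nbrs = adjacency[u]
            if not nbrs:
                # Isolated node: send its walking mass back to the seeds
                # so the total probability stays at 1.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            share = (1 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path graph A - B - C - D with A as the only known disease gene.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(graph, seeds={"A"})
```

Candidates are then ranked by score, so in this toy graph B would be prioritized over C, and C over D.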

Surgery Surgery

Deep learning enabled prediction of 5-year survival in pediatric genitourinary rhabdomyosarcoma.

In Surgical oncology

BACKGROUND: Genitourinary rhabdomyosarcoma (GU-RMS) is a rare, pediatric malignancy originating from embryonic mesenchyme. Current approaches to prognostication rely upon conventional statistical methods such as Cox proportional hazards (CPH) models and have suboptimal predictive ability. Given the success of deep learning approaches in other specialties, we sought to develop and compare deep learning models with CPH models for the prediction of 5-year survival in pediatric GU-RMS patients.

METHODS: Patients less than 20 years of age with GU-RMS were identified within the Surveillance, Epidemiology, and End Results (SEER) database (1998-2011). Deep neural networks (DNN) were trained and tested on an 80/20 split of the dataset in a 5-fold cross-validated fashion. Multivariable CPH models were developed in parallel. The primary outcomes were 5-year overall survival (OS) and disease-specific survival (DSS). Variables used for prediction were age, sex, race, primary site, histology, degree of tumor extension, tumor size, receipt of surgery, and receipt of radiation. Receiver operating characteristic curve analysis was conducted, and DNN models were tested for calibration.

RESULTS: 277 patients were included. The area under the curve (AUC) for the DNN models was 0.93 for OS and 0.91 for DSS. AUC for the CPH models was 0.82 for OS and 0.84 for DSS. The DNN models were well-calibrated: OS model (slope = 1.02, intercept = -0.06) and DSS model (slope = 0.79, intercept = 0.21).

CONCLUSIONS: A deep learning-based model demonstrated excellent performance, superior to that of CPH models, in the prediction of pediatric GU-RMS survival. Deep learning approaches may enable improved prognostication for patients with rare cancers.

Bhambhvani Hriday P, Zamora Alvaro, Velaer Kyla, Greenberg Daniel R, Sheth Kunj R

2020-Nov-20

Artificial intelligence, Machine learning, Prognostication, Sarcoma, Urogenital
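
The study above compares models by the area under the ROC curve. As a minimal sketch of that quantity, the function below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, counting ties as one half. The function name, labels, and scores are hypothetical and are not SEER data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    `labels` are 0/1 outcomes (for example, 1 = event within 5 years);
    `scores` are the model's predicted risks.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and predicted risks, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; a rank-based formula gives the same value more efficiently on larger datasets.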

Radiology Radiology

Compressed sensing and deep learning reconstruction for women's pelvic MRI denoising: Utility for improving image quality and examination time in routine clinical practice.

In European journal of radiology ; h5-index 47.0

PURPOSE: To demonstrate the utility of compressed sensing with parallel imaging (Compressed SPEEDER) and AiCE compared with that of conventional parallel imaging (SPEEDER) for shortening examination time and improving image quality of women's pelvic MRI.

METHOD: Thirty consecutive patients with women's pelvic diseases (mean age 50 years) underwent T2-weighted imaging using Compressed SPEEDER as well as conventional SPEEDER reconstructed with and without AiCE. The examination times were recorded, and signal-to-noise ratio (SNR) was calculated for every patient. Moreover, overall image quality was assessed using a 5-point scoring system, and final scores for all patients were determined by consensus of two readers. Mean examination time, SNR and overall image quality were compared among the four data sets using the Wilcoxon signed-rank test.

RESULTS: Examination times for Compressed SPEEDER with and without AiCE were significantly shorter than those for conventional SPEEDER with and without AiCE (with AiCE: p < 0.0001, without AiCE: p < 0.0001). SNR of Compressed SPEEDER and of SPEEDER with AiCE was significantly superior to that of Compressed SPEEDER without AiCE (vs. Compressed SPEEDER, p = 0.01; vs. SPEEDER, p = 0.009). Overall image quality of Compressed SPEEDER with AiCE and of SPEEDER with and without AiCE was significantly higher than that of Compressed SPEEDER without AiCE (vs. Compressed SPEEDER with AiCE, p < 0.0001; vs. SPEEDER with AiCE, p < 0.0001; vs. SPEEDER without AiCE, p = 0.0003).

CONCLUSION: Compared with conventional SPEEDER, Compressed SPEEDER with AiCE can significantly improve image quality and shorten examination time for T2-weighted imaging in women's pelvic MRI, although other sequences were not tested.

Ueda Takahiro, Ohno Yoshiharu, Yamamoto Kaori, Iwase Akiyoshi, Fukuba Takashi, Hanamatsu Satomu, Obama Yuki, Ikeda Hirotaka, Ikedo Masato, Yui Masao, Murayama Kazuhiro, Toyama Hiroshi

2020-Nov-21

Compressed sensing, Deep learning, MRI, Parallel imaging, Women’s imaging
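
The comparisons above rely on the Wilcoxon signed-rank test for paired measurements. The sketch below computes only the test statistic (the sums of positive and negative ranks), assuming zero differences are dropped and tied absolute differences share their average rank. It does not compute a p-value, and the paired values are hypothetical rather than taken from the study.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical paired scores for two reconstructions of the same patients.
method_a = [10, 12, 15, 20, 8]
method_b = [9, 14, 12, 16, 8]
w_plus, w_minus = wilcoxon_signed_rank(method_a, method_b)
```

A small value of the lesser of the two sums indicates that differences lean consistently in one direction; converting it to a p-value requires the exact distribution or a normal approximation, which a statistics library would normally provide.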