Doctor Penguin

General

TIS Transformer: remapping the human proteome using deep learning.

In NAR genomics and bioinformatics
The correct mapping of the proteome is an important step towards advancing our understanding of biological systems and cellular mechanisms. Methods that provide better mappings can fuel important processes such as drug discovery and disease understanding. Currently, true determination of translation initiation sites is primarily achieved by in vivo experiments. Here, we propose TIS Transformer, a deep learning model for the determination of translation start sites solely utilizing the information embedded in the transcript nucleotide sequence. The method is built upon deep learning techniques first designed for natural language processing. We prove this approach to be best suited for learning the semantics of translation, outperforming previous approaches by a large margin. We demonstrate that limitations in the model performance are primarily due to the presence of low-quality annotations against which the model is evaluated. Advantages of the method are its ability to detect key features of the translation process and multiple coding sequences on a transcript. These include micropeptides encoded by short Open Reading Frames, either alongside a canonical coding sequence or within long non-coding RNAs. To demonstrate the use of our methods, we applied TIS Transformer to remap the full human proteome.
Clauwaert Jim, McVey Zahra, Gupta Ramneek, Menschaert Gerben

2023-Mar
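
For orientation, the sketch below lists the candidate start sites that any sequence-only predictor has to choose among. It is a naive baseline and not the TIS Transformer model: it reports every ATG in a transcript that is followed by an in-frame stop codon. The function name `candidate_orfs`, the `min_codons` parameter, and the toy sequence are all illustrative assumptions, not taken from the paper.

```python
# Illustrative baseline only: this is NOT the TIS Transformer model.
# It enumerates every ATG that is followed by an in-frame stop codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_orfs(transcript, min_codons=1):
    """Return (start, end, n_codons) for each ATG with an in-frame stop.

    start    : 0-based index of the A in ATG
    end      : index just past the stop codon
    n_codons : coding codons including ATG, excluding the stop
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, pos + 3, n_codons))
                break
    return orfs


# Hypothetical toy transcript with three ATGs; the first has no in-frame stop.
toy = "CCATGAAATGGCCTAAGGATGTGA"
for start, end, n_codons in candidate_orfs(toy):
    print(start, end, n_codons, toy[start:end])
```

Even this toy sequence yields more than one candidate on a single transcript, which is the situation the abstract describes for short open reading frames next to a canonical coding sequence. A learned model is needed to decide which candidates are actually translated.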

General

Machine learning reveals STAT motifs as predictors for GR-mediated gene repression.

In Computational and structural biotechnology journal
Glucocorticoids are potent immunosuppressive drugs, but long-term treatment leads to severe side-effects. While there is a commonly accepted model for glucocorticoid receptor (GR)-mediated gene activation, the mechanism behind repression remains elusive. Understanding the molecular mechanism of GR-mediated gene repression is the first step towards developing novel therapies. We devised an approach that combines multiple epigenetic assays with 3D chromatin data to find sequence patterns predicting gene expression change. We systematically tested >100 models to evaluate the best way to integrate the data types and found that GR-bound regions hold most of the information needed to predict the polarity of Dex-induced transcriptional changes. We confirmed NF-κB motif family members as predictors for gene repression and identified STAT motifs as additional negative predictors.
Höllbacher Barbara, Strickland Benjamin, Greulich Franziska, Uhlenhaut N Henriette, Heinig Matthias

2023

ChIPseq (chromatin immunoprecipitation sequencing), Epigenomics, Glucocorticoid receptor, Machine-learning, RNAseq, Repression, STAT (signal transducer and activator of transcription)

Oncology

AI-DrugNet: A network-based deep learning model for drug repurposing and combination therapy in neurological disorders.

In Computational and structural biotechnology journal
Discovering effective therapies is difficult for neurological and developmental disorders in that disease progression is often associated with a complex and interactive mechanism. Over the past few decades, few drugs have been identified for treating Alzheimer's disease (AD), especially for impacting the causes of cell death in AD. Although drug repurposing is gaining more success in developing therapeutic efficacy for complex diseases such as common cancer, the complications behind AD require further study. Here, we developed a novel prediction framework based on deep learning to identify potential repurposed drug therapies for AD, and more importantly, our framework is broadly applicable and may generalize to identifying potential drug combinations in other diseases. Our prediction framework is as follows: we first built a drug-target pair (DTP) network based on multiple drug features and target features, as well as the associations between DTP nodes where drug-target pairs are the DTP nodes and the associations between DTP nodes are represented as the edges in the AD disease network; furthermore, we incorporated the drug-target feature from the DTP network and the relationship information between drug-drug, target-target, drug-target within and outside of drug-target pairs, representing each drug-combination as a quartet to generate corresponding integrated features; finally, we developed an AI-based Drug discovery Network (AI-DrugNet), which exhibits robust predictive performance. The implementation of our network model helps identify potential repurposed and combination drug options that may serve to treat AD and other diseases.
Pan Xingxin, Yun Jun, Coban Akdemir Zeynep H, Jiang Xiaoqian, Wu Erxi, Huang Jason H, Sahni Nidhi, Yi S Stephen

2023

Deep learning, Drug combination therapy, Drug repurposing, Network model, Neurological and developmental disorders

General

Interpretation of lung disease classification with light attention connected module.

In Biomedical signal processing and control
Lung diseases lead to complications from obstructive diseases, and the COVID-19 pandemic has increased lung disease-related deaths. Medical practitioners use stethoscopes to diagnose lung disease. However, an artificial intelligence model capable of objective judgment is required since the experience and diagnosis of respiratory sounds differ. Therefore, in this study, we propose a lung disease classification model that uses an attention module and deep learning. Respiratory sounds were extracted using log-Mel spectrogram MFCC. Normal and five types of adventitious sounds were effectively classified by improving VGGish and adding a light attention connected module to which the efficient channel attention module (ECA-Net) was applied. The performance of the model was evaluated for accuracy, precision, sensitivity, specificity, f1-score, and balanced accuracy, which were 92.56%, 92.81%, 92.22%, 98.50%, 92.29%, and 95.4%, respectively. We confirmed high performance according to the attention effect. The classification causes of lung diseases were analyzed using gradient-weighted class activation mapping (Grad-CAM), and the performances of their models were compared using open lung sounds measured using a Littmann 3200 stethoscope. The experts' opinions were also included. Our results will contribute to the early diagnosis and interpretation of diseases in patients with lung disease by utilizing algorithms in smart medical stethoscopes.
Choi Youngjin, Lee Hongchul

2023-Jul

Attention, ECA-Net, Grad-CAM, Lung disease, Respiratory sound, eXplainable AI
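
As a quick arithmetic check on the figures quoted in the abstract above, the sketch below recomputes two derived metrics from the reported percentages. The definitions used (balanced accuracy as the mean of sensitivity and specificity, F1 as the harmonic mean of precision and sensitivity) are the standard binary-case formulas and are an assumption here; the abstract does not state how the authors averaged across the six sound classes.

```python
# Reported values from the abstract, in percent.
precision = 92.81
sensitivity = 92.22
specificity = 98.50


def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity (binary-case definition)."""
    return (sens + spec) / 2


def f1_from(prec, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * prec * recall / (prec + recall)


print(round(balanced_accuracy(sensitivity, specificity), 2))  # 95.36
print(round(f1_from(precision, sensitivity), 2))              # 92.51
```

The first value rounds to the reported 95.4%. The second is slightly above the reported F1 of 92.29%, which would be expected if F1 was computed per class and then averaged, although that explanation is a guess rather than something the abstract confirms.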

Radiology

Machine learning prediction for COVID-19 disease severity at hospital admission.

In BMC medical informatics and decision making ; h5-index 38.0

IMPORTANCE : Early prognostication of patients hospitalized with COVID-19 who may require mechanical ventilation and have worse outcomes within 30 days of admission is useful for delivering appropriate clinical care and optimizing resource allocation.

OBJECTIVE : To develop machine learning models to predict COVID-19 severity at the time of hospital admission based on single-institution data.

DESIGN, SETTING, AND PARTICIPANTS : We established a retrospective cohort of patients with COVID-19 from University of Texas Southwestern Medical Center from May 2020 to March 2022. Easily accessible objective markers including basic laboratory variables and initial respiratory status were assessed using Random Forest's feature importance score to create a predictive risk score. Twenty-five significant variables were identified to be used in classification models. The best predictive models were selected with repeated tenfold cross-validation methods.

MAIN OUTCOMES AND MEASURES : Among patients with COVID-19 admitted to the hospital, severity was defined by 30-day mortality (30DM) rates and need for mechanical ventilation.

RESULTS : This was a large, single-institution COVID-19 cohort including a total of 1795 patients. The average age was 59.7 years, and the cohort was heterogeneous. A total of 236 patients (13%) required mechanical ventilation (MV) and 156 patients (8.6%) died within 30 days of hospitalization. Predictive accuracy of each predictive model was validated with tenfold cross-validation (10-CV). The Random Forest classifier for the 30DM model had 192 sub-trees and obtained 0.72 sensitivity, 0.78 specificity, and 0.82 AUC. The model used to predict MV had 64 sub-trees and obtained 0.75 sensitivity, 0.75 specificity, and 0.81 AUC. Our scoring tool can be accessed at https://faculty.tamuc.edu/mmete/covid-risk.html .

CONCLUSIONS AND RELEVANCE : In this study, we developed a risk score based on objective variables of COVID-19 patients within six hours of admission to the hospital, therefore helping predict a patient's risk of developing critical illness secondary to COVID-19.

Raman Ganesh, Ashraf Bilal, Demir Yusuf Kemal, Kershaw Corey D, Cheruku Sreekanth, Atis Murat, Atis Ahsen, Atar Mustafa, Chen Weina, Ibrahim Ibrahim, Bat Taha, Mete Mutlu

2023-Mar-07

COVID-19, Classification, Laboratory markers, Machine learning, Prediction, SARS-CoV-2, Scoring
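
The repeated tenfold cross-validation mentioned in the abstract above can be sketched with the standard library alone. The function below is a minimal illustration, not the authors' pipeline: the name `repeated_stratified_kfold`, the choice of three repeats, the seed, and the use of stratification by outcome are all assumptions. Only the cohort size (1795 patients) and the number of 30-day deaths (156) come from the abstract.

```python
import random


def repeated_stratified_kfold(labels, k=10, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Stratified: each class is shuffled and dealt round-robin into the
    k folds, so every fold keeps roughly the overall class balance.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for r in range(repeats):
        folds = [[] for _ in range(k)]
        for idxs in by_class.values():
            shuffled = idxs[:]
            rng.shuffle(shuffled)  # fresh shuffle for every repeat
            for j, i in enumerate(shuffled):
                folds[j % k].append(i)
        for f in range(k):
            test = sorted(folds[f])
            train = sorted(i for g in range(k) if g != f for i in folds[g])
            yield r, f, train, test


# Toy labels sized like the cohort: 156 deaths (1) among 1795 patients.
labels = [1] * 156 + [0] * (1795 - 156)
splits = list(repeated_stratified_kfold(labels, k=10, repeats=3, seed=0))

print(len(splits))  # 3 repeats x 10 folds
r, f, train, test = splits[0]
print(len(train), len(test), sum(labels[i] for i in test))
```

With an outcome as rare as 30-day mortality in this cohort, stratifying keeps 15 or 16 deaths in every test fold, so per-fold sensitivity is never estimated from only a handful of positive cases.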
