
General

DeepAdd: Protein function prediction from k-mer embedding and additional features.

In Computational biology and chemistry

With the application of new high-throughput sequencing technology, a large number of protein sequences are becoming available. Determining the functional characteristics of these proteins experimentally is expensive and time-consuming. Furthermore, at the organismal level, such experimental functional analyses can be conducted only for a few selected model organisms. Computational function prediction methods can be used to fill this gap. The functions of proteins are classified by the Gene Ontology (GO), which contains more than 40,000 classes in three domains: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Additionally, since proteins have many functions, function prediction is a multi-label, multi-class problem. We developed a new method to predict protein function from sequence. To this end, a natural language model was used to generate word embeddings of the sequence, from which features were learned by deep learning, together with additional features to locate every protein. Our method uses the dependencies between GO classes as background information to construct a deep learning model. We evaluated our method using the standards established by the Critical Assessment of Functional Annotation (CAFA) and observed a noticeable improvement over several algorithms, such as FFPred, DeepGO, GoFDR and other methods compared on the CAFA3 datasets.

Du Zhihua, He Yufeng, Li Jianqiang, Uversky Vladimir N


Convolution neural network, Natural language process, Protein function prediction, Protein-protein interaction network, Sequence similarity profile
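
As a rough illustration of the "k-mer embedding" idea in the DeepAdd title, the sketch below splits a protein sequence into overlapping k-mers and maps each one to an integer id, the kind of input an embedding layer typically consumes. The abstract does not give the value of k, the tokenization scheme, or any function names, so `k=3`, the overlapping window, and the helpers `kmers`, `build_vocab` and `encode` are all assumptions made for illustration, not the authors' implementation.

```python
def kmers(sequence: str, k: int = 3) -> list[str]:
    """Split a sequence into overlapping k-mers ("words"). k=3 is an assumption."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_vocab(sequences: list[str], k: int = 3) -> dict[str, int]:
    """Assign each distinct k-mer an integer id in order of first appearance."""
    vocab: dict[str, int] = {}
    for seq in sequences:
        for word in kmers(seq, k):
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sequence: str, vocab: dict[str, int], k: int = 3) -> list[int]:
    """Turn a sequence into k-mer ids, skipping k-mers missing from the vocabulary."""
    return [vocab[w] for w in kmers(sequence, k) if w in vocab]


# Hypothetical toy sequence, not taken from the paper.
toy_sequence = "MKTAYIAK"
toy_vocab = build_vocab([toy_sequence])
toy_ids = encode(toy_sequence, toy_vocab)
print(kmers(toy_sequence))
print(toy_ids)
```

In a full pipeline the integer ids would index into a learned embedding matrix; that step is omitted here because the abstract does not describe it.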

General

Multifaceted impulsivity as a moderator of social anxiety and cannabis use during pregaming.

In Journal of anxiety disorders

Individuals may drink or use cannabis to cope with social anxiety, and drinking or using cannabis prior to social situations (e.g., pregaming) may be a way to limit the experience of anxiety when entering social settings. However, theoretical and empirical work has reported mixed associations between social anxiety and substance use, specifically alcohol and cannabis. Little work has looked at how other variables, such as impulsivity (a central component of high-risk drinking such as pregaming), may shed light on these mixed findings. College students who reported past-year pregaming (n = 363) completed self-report surveys. Supporting prior work, we found that social anxiety was associated with fewer pregaming days, even among those high in sensation seeking. However, those reporting higher social anxiety also reported higher cannabis use during pregaming, specifically among those who reported high sensation seeking and high positive urgency. Results suggest specific facets of impulsivity may affect the association between social anxiety and cannabis use during high-risk drinking events.

Davis Jordan P, Christie Nina C, Pakdaman Sheila, Hummer Justin F, DeLeon Jessenia, Clapp John D, Pedersen Eric R


Anxiety disorders, College students, Heavy drinking, Impulsivity, Substance use disorder, Young adults

General

Methodology Minute: A Machine Learning Primer for Infection Prevention and Control.

In American journal of infection control ; h5-index 43.0

The use of machine learning and predictive modeling in infection prevention and control activities is increasing dramatically. In order for infection preventionists to make informed decisions on the performance of any particular model as well as to determine if the output of the model will be useful for their program needs, a suitable understanding of the creation and evaluation of these models is necessary. The purpose of this primer is to introduce the infection preventionist to the most commonly used machine learning method in infection prevention: supervised learning.

Wiemken Timothy L, Rutschman Ana Santos


General

Automatic Full Conversion of Clinical Terms into SNOMED CT Concepts.

In Journal of biomedical informatics ; h5-index 55.0

SNOMED CT is the most comprehensive clinical ontology and is also amenable to automated reasoning. However, in order to unleash its full potential for automated reasoning over clinical text, a mechanism to convert clinical terms into SNOMED CT concepts is necessary. In this paper we present, to the best of our knowledge, the first such complete conversion method that is also capable of converting clinical terms into post-coordinated concepts which are not already listed in SNOMED CT. The method does not require any additional manual annotations and learns only from existing SNOMED CT terms paired with their concepts. The method is based on identifying the defining relations of the clinical concept expressed by a clinical term. We evaluate our method on a large scale using existing data from SNOMED CT as well as on a small scale using a manually annotated dataset of clinical terms found in clinical text.

Kate Rohit J


SNOMED CT, clinical terms, machine learning, ontology

Ophthalmology

Metabolomic profiling of aqueous humor from glaucoma patients - The metabolomics in surgical ophthalmological patients (MISO) study.

In Experimental eye research ; h5-index 43.0

Glaucoma is still a poorly understood disease with a clear need for new biomarkers to help in diagnosis and potentially offer new therapeutic targets. We aimed to determine whether the metabolic profile of aqueous humor (AH), as determined by nuclear magnetic resonance (NMR) spectroscopy, allows the distinction between primary open-angle glaucoma patients and control subjects, and between high-tension (POAG) and normal-tension glaucoma (NTG). We analysed the AH of patients with POAG, NTG and control subjects (n = 30/group). 1H NMR spectra were acquired using a 400 MHz spectrometer. Principal component analysis (PCA), machine learning algorithms and descriptive statistics were applied to analyse the metabolic variance between groups and to identify the spectral regions, and thereby potential metabolites, that can act as biomarkers for glaucoma. According to PCA, fourteen regions of the NMR spectra were significant in explaining the metabolic variance between the glaucoma and control groups, with no differences found between POAG and NTG groups. These regions were further used in building a classifier for separating glaucoma from control patients, which achieved an AUC of 0.93. Peak integration was performed on these regions and a statistical analysis, after false discovery rate correction and adjustment for the different perioperative topical drug regimens, revealed that five of them were significantly different between groups. The glaucoma group showed a higher content in regions typical for betaine and taurine, possibly linked to neuroprotective mechanisms, and also a higher content in regions that are typical for glutamate, which can indicate damaged neurons and oxidative stress. These results show how aqueous humor metabolomics based on NMR spectroscopy can distinguish glaucoma patients from controls with high accuracy. Further studies are needed to validate these results in order to incorporate them in clinical practice.

Breda João Barbosa, Sava Anca Croitor, Himmelreich Uwe, Somers Alix, Matthys Christophe, Sousa Amândio Rocha, Vandewalle Evelien, Stalmans Ingeborg


Aqueous humor, Glaucoma, Metabolomics, Nuclear magnetic resonance spectroscopy, Open-angle glaucoma
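
The abstract above reports classifier performance as an AUC of 0.93. As a reminder of what that number measures, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are made-up toy values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): 1 = glaucoma, 0 = control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 1.0 means every positive case outranks every negative case, and 0.5 corresponds to chance-level ranking.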

Oncology

Identification of treatment error types for lung cancer patients using convolutional neural networks and EPID dosimetry.

In Radiotherapy and oncology : journal of the European Society for Therapeutic Radiology and Oncology

BACKGROUND/PURPOSE : Electronic portal imaging device (EPID) dosimetry aims to detect treatment errors, potentially leading to treatment adaptation. Clinically used threshold classification methods for detecting errors lead to loss of information (from multi-dimensional EPID data to a few numbers) and cannot be used for identifying causes of errors. Advanced classification methods, such as deep learning, can use all available information. In this study, convolutional neural networks (CNNs) were trained to detect and identify error type and magnitude of simulated treatment errors in lung cancer patients. The purpose of this simulation study is to provide a proof-of-concept of CNNs for error identification using EPID dosimetry in an in vivo scenario.

MATERIALS AND METHODS : Clinically realistic ranges of anatomical changes, positioning errors and mechanical errors were simulated for lung cancer patients. Predicted portal dose images (PDIs) containing errors were compared to error-free PDIs using the widely used gamma analysis. CNNs were trained to classify errors using 2D gamma maps. Three classification levels were assessed: Level 1 (main error type, e.g., anatomical change), Level 2 (error subtype, e.g., tumor regression) and Level 3 (error magnitude, e.g., >50% tumor regression).
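
For readers unfamiliar with the gamma analysis mentioned above, the following is a minimal one-dimensional sketch of the gamma index: for each reference point it searches the evaluated profile for the smallest combined distance-to-agreement and dose-difference value, and a point is usually said to pass when gamma is at most 1. The study works with 2D gamma maps and the abstract does not state its criteria, so the 3%/3 mm global criterion, the 1 mm spacing, the function name `gamma_1d` and the toy profiles are all assumptions.

```python
import math


def gamma_1d(reference, evaluated, spacing_mm=1.0, dta_mm=3.0, dose_frac=0.03):
    """Gamma index per reference point for two equally sampled 1D dose profiles.

    Uses a global dose criterion: dose_frac times the maximum reference dose.
    The 3%/3 mm defaults are illustrative assumptions, not the study's settings.
    """
    if not reference or len(reference) != len(evaluated):
        raise ValueError("profiles must be non-empty and of equal length")
    dose_tol = dose_frac * max(reference)
    if dose_tol <= 0:
        raise ValueError("reference profile must contain a positive dose")
    gammas = []
    for i, d_ref in enumerate(reference):
        best = math.inf
        for j, d_eval in enumerate(evaluated):
            dist = (j - i) * spacing_mm / dta_mm   # normalised spatial distance
            diff = (d_eval - d_ref) / dose_tol     # normalised dose difference
            best = min(best, math.hypot(dist, diff))
        gammas.append(best)
    return gammas


# Hypothetical profiles: the evaluated one is the reference shifted by one sample.
ref_profile = [0.0, 10.0, 50.0, 100.0, 50.0, 10.0, 0.0]
shifted = [0.0, 0.0, 10.0, 50.0, 100.0, 50.0, 10.0]
gamma_values = gamma_1d(ref_profile, shifted)
pass_rate = sum(g <= 1.0 for g in gamma_values) / len(gamma_values)
print(gamma_values, pass_rate)
```

A 2D gamma map applies the same minimisation over a neighbourhood in both image axes; the resulting per-pixel values are what a classifier such as a CNN would take as input.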

RESULTS : CNNs showed good performance for all classification levels (training/test accuracy 99.5%/96.1%, 92.5%/86.8%, 82.0%/72.9%). For Level 3, overfitting became more apparent.
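
The remark about overfitting can be read directly off the reported numbers: the gap between training and test accuracy widens from Level 1 to Level 3. The short calculation below makes that explicit, assuming the three accuracy pairs are listed in level order (the abstract does not say so explicitly).

```python
# Reported training/test accuracies in percent, assumed to be in level order.
accuracies = {
    "Level 1": (99.5, 96.1),
    "Level 2": (92.5, 86.8),
    "Level 3": (82.0, 72.9),
}

# Train-minus-test gap in percentage points; a larger gap suggests more overfitting.
gaps = {level: round(train - test, 1) for level, (train, test) in accuracies.items()}

for level, gap in gaps.items():
    print(f"{level}: gap = {gap} percentage points")
```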

CONCLUSION : This simulation study indicates that deep learning is a promising and powerful tool for identifying the types and magnitudes of treatment errors with EPID dosimetry, providing additional information not currently available from EPID dosimetry. This is a first step towards rapid, automated models for identification of treatment errors using EPID dosimetry.

Wolfs Cecile J A, Canters Richard A M, Verhaegen Frank


Artificial intelligence, Deep learning, EPID dosimetry, Error detection, Error identification, In vivo dosimetry, Treatment verification