
General

MFR-DTA: A Multi-Functional and Robust Model for Predicting Drug-Target Binding Affinity and Region.

In Bioinformatics (Oxford, England)

MOTIVATION : Recently, deep learning has become the mainstream methodology for Drug-Target binding Affinity (DTA) prediction. However, two deficiencies of the existing methods restrict their practical applications. On the one hand, most existing methods ignore the individual information of sequence elements, resulting in poor sequence feature representations. On the other hand, without prior biological knowledge, the prediction of drug-target binding regions based on attention weights of a deep neural network could be difficult to verify, which may bring adverse interference to biological researchers.

RESULTS : We propose a novel Multi-Functional and Robust Drug-Target binding Affinity prediction (MFR-DTA) method to address the above issues. Specifically, we design a new biological sequence feature extraction block, namely BioMLP, that assists the model in extracting individual features of sequence elements. Then, we propose a new Elem-feature fusion block to refine the extracted features. After that, we construct a Mix-Decoder block that extracts drug-target interaction information and predicts their binding regions simultaneously. Last, we evaluate MFR-DTA on two benchmarks, in line with the existing methods, and propose a new dataset, sc-PDB, to better measure the accuracy of binding region prediction. We also visualise some samples to demonstrate the locations of their binding sites and the predicted multi-scale interaction regions. The proposed method achieves excellent performance on these datasets, demonstrating its merits and superiority over the state-of-the-art methods.


SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics online.

Hua Yang, Song Xiaoning, Feng Zhenhua, Wu Xiaojun


General

TIVAN-indel: A computational framework for annotating and predicting noncoding regulatory small insertions and deletions.

In Bioinformatics (Oxford, England)

MOTIVATION : Small insertions and deletions (sindels) in the human genome have important implications for human disease. One important mechanism by which noncoding sindels affect human diseases and phenotypes is the regulation of gene expression. Nevertheless, current sequencing experiments may lack the statistical power and resolution to pinpoint functional sindels because of low minor allele frequency or small effect size. As an alternative strategy, a supervised machine learning method can identify the otherwise masked functional sindels by predicting their regulatory potential directly. However, computational methods for annotating and predicting regulatory sindels, especially in noncoding regions, are underdeveloped.

RESULTS : By leveraging labelled noncoding sindels identified by cis-expression quantitative trait loci (cis-eQTLs) analyses across 44 tissues in GTEx, and a compilation of both generic functional annotations and large-scale epigenomic profiles, we develop TIVAN-indel, a supervised computational framework for predicting noncoding regulatory sindels. We demonstrate that TIVAN-indel achieves the best prediction performance in both within-tissue prediction and cross-tissue prediction. As an independent evaluation, we train TIVAN-indel on the "Whole Blood" tissue in GTEx and test the model using 15 immune cell types from an independent study named DICE. Lastly, we perform an enrichment analysis for both true and predicted sindels in key regulatory regions such as chromatin interactions, open chromatin regions and histone modification sites, and find biologically meaningful enrichment patterns.


SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics online.

Agarwal Aman, Zhao Fengdi, Jiang Yuchao, Chen Li


Radiology

Short-term and Long-term Outcomes of a Disruption and Disconnection of the Pancreatic Duct in Necrotizing Pancreatitis: A Multicenter Cohort Study in 896 Patients.

In The American journal of gastroenterology

INTRODUCTION : Necrotizing pancreatitis may result in a disrupted or disconnected pancreatic duct (DPD) with the potential for long-lasting negative impact on a patient's clinical outcome. There is a lack of detailed data on the full clinical spectrum of DPD, which is critical for the development of better diagnostic and treatment strategies.

METHODS : We performed a long-term post hoc analysis of a prospectively collected nationwide cohort of 896 patients with necrotizing pancreatitis (2005-2015). The median follow-up after hospital admission was 75 months (P25-P75: 41-151). Clinical outcomes of patients with and without DPD were compared using regression analyses, adjusted for potential confounders. Predictive features for DPD were explored.

RESULTS : DPD was confirmed in 243 (27%) of the 896 patients and resulted in worse clinical outcomes during both the patient's initial admission and follow-up. During hospital admission, DPD was associated with an increased rate of new-onset intensive care unit admission (adjusted odds ratio [aOR] 2.52; 95% confidence interval [CI] 1.62-3.93), new-onset organ failure (aOR 2.26; 95% CI 1.45-3.55), infected necrosis (aOR 4.63; 95% CI 2.87-7.64), and pancreatic interventions (aOR 7.55; 95% CI 4.23-13.96). During long-term follow-up, DPD increased the risk of pancreatic intervention (aOR 9.71; 95% CI 5.37-18.30), recurrent pancreatitis (aOR 2.08; 95% CI 1.32-3.29), chronic pancreatitis (aOR 2.73; 95% CI 1.47-5.15), and endocrine pancreatic insufficiency (aOR 1.63; 95% CI 1.05-2.53). Central or subtotal pancreatic necrosis on computed tomography (OR 9.49; 95% CI 6.31-14.29) and a high level of serum C-reactive protein in the first 48 hours after admission (per 10-point increase, OR 1.02; 95% CI 1.00-1.03) were identified as independent predictors for developing DPD.
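The reported DPD share and the per-10-point odds ratio for C-reactive protein can be checked with a little arithmetic. The sketch below assumes the usual logistic-regression reading, in which odds ratios multiply across increments (an assumption; the abstract does not state the model form). The 100-point increase and the helper name `scaled_odds_ratio` are hypothetical illustrations, not figures or methods from the study.

```python
# Share of patients with confirmed DPD, from the counts in the abstract.
dpd_patients = 243
cohort_size = 896
dpd_share = dpd_patients / cohort_size

# Reported odds ratio per 10-point increase in serum C-reactive protein.
or_per_10_points = 1.02


def scaled_odds_ratio(or_per_step: float, steps: float) -> float:
    """Odds ratio for `steps` increments, assuming a log-linear (logistic) model."""
    return or_per_step ** steps


# Hypothetical example: a 100-point higher CRP corresponds to ten 10-point steps.
or_per_100_points = scaled_odds_ratio(or_per_10_points, 10)

print(f"DPD share: {dpd_share:.1%}")
print(f"OR per 100 points: {or_per_100_points:.2f}")
```

Under that assumption, an odds ratio that looks negligible per 10 points compounds to roughly a 22% increase in odds across 100 points, which is why the unit of the increment matters when reading such estimates.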

DISCUSSION : At least 1 in every 4 patients with necrotizing pancreatitis experiences DPD, which is associated with detrimental short-term and long-term interventions and complications. Central and subtotal pancreatic necrosis and high levels of serum C-reactive protein in the first 48 hours are independent predictors for DPD.

Timmerhuis Hester C, van Dijk Sven M, Hollemans Robbert A, Sperna Weiland Christina J, Umans Devica S, Boxhoorn Lotte, Hallensleben Nora H, van der Sluijs Rogier, Brouwer Lieke, van Duijvendijk Peter, Kager Liesbeth, Kuiken Sjoerd, Poley Jan-Werner, de Ridder Rogier, Römkens Tessa E H, Quispel Rutger, Schwartz Matthijs P, Tan Adriaan C I T L, Venneman Niels G, Vleggaar Frank P, van Wanrooij Roy L J, Witteman Ben J, van Geenen Erwin J, Molenaar I Quintus, Bruno Marco J, van Hooft Jeanin E, Besselink Marc G, Voermans Rogier P, Bollen Thomas L, Verdonk Robert C, van Santvoort Hjalmar C


Public Health

Rapid and accurate screening of cystic echinococcosis in sheep based on serum Fourier-transform infrared spectroscopy combined with machine learning algorithms.

In Journal of biophotonics

Cystic echinococcosis in sheep is a serious zoonotic parasitic disease caused by Echinococcus granulosus sensu stricto (s.s.). Current screening technology for cystic echinococcosis in sheep is time-consuming and inaccurate, and novel screening technology is urgently needed. In this work, we combined machine learning algorithms with Fourier transform infrared (FT-IR) spectroscopy of serum to establish a quick and accurate screening approach for cystic echinococcosis in sheep. Serum samples from 77 E. granulosus s.s.-infected sheep and 121 healthy control sheep were measured with an FT-IR spectrometer. To optimize the classification accuracy of the serum FT-IR method for the E. granulosus s.s.-infected sheep and healthy control sheep, principal component analysis (PCA), linear discriminant analysis (LDA), and support vector machine (SVM) algorithms were used to analyze the data. Among all the bands, the 1500-1700 cm⁻¹ band gave the best classification: the diagnostic sensitivity, specificity, and accuracy of PCA-SVM were 100%, 95.74%, and 96.66%, respectively. The study showed that serum FT-IR spectroscopy combined with machine learning algorithms has great potential as a rapid and accurate screening method for cystic echinococcosis in sheep.
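For readers less familiar with the three reported screening metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts. The counts used here are hypothetical (the abstract gives no test-set breakdown); they were chosen only because they yield values close to the reported percentages, and the function name `screening_metrics` is likewise an illustration.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # infected sheep correctly flagged
        "specificity": tn / (tn + fp),  # healthy sheep correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),  # all correct calls
    }


# Hypothetical test-set counts, NOT taken from the abstract.
metrics = screening_metrics(tp=13, fn=0, tn=45, fp=2)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Because accuracy pools both classes, it sits between sensitivity and specificity and is pulled toward whichever class has more samples.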

Dawuti Wubulitalifu, Dou Jingrui, Zheng Xiangxiang, Lv Xiaoyi, Zhao Hui, Yang Lingfei, Lin Renyong, Lü Guodong


Cystic echinococcosis in sheep, Fourier transform infrared spectra, Machine learning algorithms, Screening, Serum

General

Identification and validation of pyroptosis-related gene landscape in prognosis and immunotherapy of ovarian cancer.

In Journal of ovarian research

BACKGROUND : Emerging evidence has highlighted the biological significance of pyroptosis in tumorigenesis and tumor progression. Nonetheless, the potential roles of pyroptosis in the tumor immune microenvironment and targeted therapy of ovarian cancer (OC) remain unknown.

METHODS : In this study, using a series of bioinformatic and machine learning approaches, we comprehensively evaluated genetic alterations and transcriptome profiles of pyroptosis-associated genes (PYAGs) with TCGA-OV datasets. Consensus molecular clustering was performed to determine pyroptosis-associated clusters (PACs) and gene clusters in OC. Subsequently, principal component analysis (PCA) was employed to construct a Pyrsig score, and a highly accurate nomogram was established to evaluate its efficacy. Meanwhile, we systematically performed association analysis for these groups with prognosis, clinical features, TME cell-infiltrating characteristics, drug response and immunotherapeutic efficacy. Immunohistochemistry was conducted to verify molecular expression with clinical samples.

RESULTS : The somatic mutations and copy number variation (CNV) of 51 PYAGs in OC samples were clarified. Two distinct PACs (PAC1/2) and three gene clusters (A/B/C) were identified based on 1332 OC samples. PAC1 and gene cluster A were significantly associated with favorable overall survival (OS), clinicopathological features and TME cell-infiltrating characteristics. Subsequently, the Pyrsig score was successfully established to demonstrate the prognostic value and immune characteristics of pyroptosis in OC. A low Pyrsig score, characterized by activated immune cell infiltration, indicated prolonged OS, increased sensitivity to some chemotherapeutic drugs and enhanced efficacy of anti-PD-L1 immunotherapy. Consequently, a nomogram was successfully established to improve the clinical applicability and stability of the Pyrsig score. With clinical OC samples, GSDMD and GZMB proteins were validated as highly expressed in OC and associated with immune infiltration and Pyrsig score. GZMB and CD8 proteins were regarded as independent prognostic factors of OC.

CONCLUSION : This work revealed that pyroptosis plays a non-negligible role in the prognosis, clinicopathological characteristics and tumor immune infiltration microenvironment of OC. It provides novel insights into identifying and characterizing the landscape of the tumor immune microenvironment, thereby guiding more effective prognostic evaluation and tailored immunotherapy strategies for OC.

Gao Lingling, Ying Feiquan, Cai Jing, Peng Minggang, Xiao Man, Sun Si, Zeng Ya, Xiong Zhoufang, Cai Liqiong, Gao Rui, Wang Zehua


Immune and immunotherapy, Ovarian cancer, Pyroptosis, Tumor microenvironment

General

The use of machine learning and deep learning techniques to assess proprioceptive impairments of the upper limb after stroke.

In Journal of neuroengineering and rehabilitation ; h5-index 53.0

BACKGROUND : Robots can generate rich kinematic datasets that have the potential to provide far more insight into impairments than standard clinical ordinal scales. Determining how to define the presence or absence of impairment in individuals using kinematic data, however, can be challenging. Machine learning techniques offer a potential solution to this problem. In the present manuscript we examine proprioception in stroke survivors using a robotic arm position matching task. Proprioception is impaired in 50-60% of stroke survivors and has been associated with poorer motor recovery and longer lengths of hospital stay. We present a simple cut-off score technique for individual kinematic parameters and an overall task score to determine impairment. We then compare the ability of different machine learning (ML) techniques and the above-mentioned task score to correctly classify individuals with or without stroke based on kinematic data.

METHODS : Participants performed an Arm Position Matching (APM) task in an exoskeleton robot. The task produced 12 kinematic parameters that quantify multiple attributes of position sense. We first quantified impairment in individual parameters and an overall task score by determining if participants with stroke fell outside of the 95% cut-off score of control (normative) values. Then, we applied five machine learning algorithms (i.e., Logistic Regression, Decision Tree, Random Forest, Random Forest with Hyperparameter Tuning, and Support Vector Machine), and a deep learning algorithm (i.e., Deep Neural Network) to classify individual participants as to whether or not they had a stroke based only on kinematic parameters using a tenfold cross-validation approach.
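The cut-off score technique described above can be sketched as follows. This is a minimal illustration, assuming a one-sided cut-off at the 95th percentile of control values and that larger values mean worse performance; the abstract does not specify either detail, and the function names and sample data are hypothetical.

```python
import statistics


def upper_cutoff(control_values: list[float]) -> float:
    """95th percentile of the control (normative) values.

    statistics.quantiles with n=20 returns 19 cut points; the last one
    is the 95th percentile.
    """
    return statistics.quantiles(control_values, n=20)[-1]


def is_impaired(value: float, cutoff: float) -> bool:
    """Flag a participant whose parameter value exceeds the normative cut-off."""
    return value > cutoff


# Hypothetical control data: one kinematic parameter for 100 controls.
controls = [float(v) for v in range(1, 101)]
cutoff = upper_cutoff(controls)

# Two hypothetical stroke participants, one inside and one outside the range.
flags = [is_impaired(v, cutoff) for v in (50.0, 97.0)]
print(cutoff, flags)
```

A per-parameter rule like this treats each kinematic measure in isolation, which is the limitation the machine learning classifiers in the study are meant to address.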

RESULTS : We recruited 429 participants with neuroimaging-confirmed stroke (< 35 days post-stroke) and 465 healthy controls. Depending on the APM parameter, we observed that 10.9-48.4% of stroke participants were impaired, while 44% were impaired based on their overall task score. The mean performance metrics of machine learning and deep learning models were: accuracy 82.4%, precision 85.6%, recall 76.5%, and F1 score 80.6%. All machine learning and deep learning models displayed similar classification accuracy; however, the Random Forest model had the highest numerical accuracy (83%). Our models showed higher sensitivity and specificity (AUC = 0.89) in classifying individual participants than the overall task score (AUC = 0.85) based on their performance in the APM task. We also found that variability was the most important feature in classifying performance in the APM task.
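The F1 score is the harmonic mean of precision and recall. The sketch below applies that formula to the mean precision and recall reported above. It gives about 0.808, slightly different from the reported mean F1 of 80.6%; that is expected if the reported figure averages each model's own F1 score (an assumption about how the means were computed, not something the abstract states).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Mean precision and recall across models, as reported in the abstract.
mean_precision = 0.856
mean_recall = 0.765

# F1 computed from the mean precision and recall. The mean of per-model
# F1 scores need not equal this value.
f1_of_means = f1_score(mean_precision, mean_recall)
print(f"F1 from mean precision and recall: {f1_of_means:.3f}")
```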

CONCLUSION : Our ML models displayed similar classification performance. ML models were able to integrate more kinematic information and relationships between variables into decision making and displayed better classification performance than the overall task score. ML may help to provide insight into individual kinematic features that have previously been overlooked with respect to clinical importance.

Hossain Delowar, Scott Stephen H, Cluff Tyler, Dukelow Sean P


Deep learning, Machine learning, Position sense, Proprioception, Robotics, Stroke