
General

Establishing the accuracy of density functional approaches for the description of noncovalent interactions in biomolecules.

In Physical chemistry chemical physics : PCCP

Biomolecules have complex structures, and noncovalent interactions are crucial to determine their conformations and functionalities. It is therefore critical to be able to describe them in an accurate but efficient manner in these systems. In this context density functional theory (DFT) could provide a powerful tool to simulate biological matter either directly for relatively simple systems or coupled with classical simulations like the QM/MM (quantum mechanics/molecular mechanics) approach. Additionally, DFT could play a fundamental role to fit the parameters of classical force fields or to train machine learning potentials to perform large scale molecular dynamics simulations of biological systems. Yet, local or semi-local approximations used in DFT cannot describe van der Waals (vdW) interactions, one of the essential noncovalent interactions in biomolecules, since they lack a proper description of long range correlation effects. However, many efficient and reasonably accurate methods are now available for the description of van der Waals interactions within DFT. In this work, we establish the accuracy of several state-of-the-art vdW-aware functionals by considering 275 biomolecules including interacting DNA and RNA bases, peptides and biological inhibitors and compare our results for the energy with highly accurate wavefunction based calculations. Most methods considered here can achieve close to predictive accuracy. In particular, the non-local vdW-DF2 functional is revealed to be the best performer for biomolecules, while among the vdW-corrected DFT methods, uMBD is also recommended as a less accurate but faster alternative.

Kim Minho, Gould Tim, Rocca Dario, Lebègue Sébastien


Radiology

A Quality Control System for Automated Prostate Segmentation on T2-Weighted MRI.

In Diagnostics (Basel, Switzerland)

Computer-aided detection and diagnosis (CAD) systems have the potential to improve robustness and efficiency compared to traditional radiological reading of magnetic resonance imaging (MRI). Fully automated segmentation of the prostate is a crucial step of CAD for prostate cancer, but visual inspection is still required to detect poorly segmented cases. The aim of this work was therefore to establish a fully automated quality control (QC) system for prostate segmentation based on T2-weighted MRI. Four different deep learning-based segmentation methods were used to segment the prostate for 585 patients. First order, shape and textural radiomics features were extracted from the segmented prostate masks. A reference quality score (QS) was calculated for each automated segmentation in comparison to a manual segmentation. A least absolute shrinkage and selection operator (LASSO) was trained and optimized on a randomly assigned training dataset (N = 1756, 439 cases from each segmentation method) to build a generalizable linear regression model based on the radiomics features that best estimated the reference QS. Subsequently, the model was used to estimate the QSs for an independent testing dataset (N = 584, 146 cases from each segmentation method). The mean ± standard deviation absolute error between the estimated and reference QSs was 5.47 ± 6.33 on a scale from 0 to 100. In addition, we found a strong correlation between the estimated and reference QSs (rho = 0.70). In conclusion, we developed an automated QC system that may be helpful for evaluating the quality of automated prostate segmentations.

Sunoqrot Mohammed R S, Selnæs Kirsten M, Sandsmark Elise, Nketiah Gabriel A, Zavala-Romero Olmo, Stoyanova Radka, Bathen Tone F, Elschot Mattijs


MRI, computer-aided detection and diagnosis, deep learning, machine learning, prostate, quality control, radiomics, segmentation
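
The regression step described in the abstract above (a LASSO model mapping features to a quality score, judged by mean absolute error) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the two-feature toy data is made up and is not radiomics data, no intercept is fitted, and the solver is a plain coordinate-descent loop rather than whatever library the authors used.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=100):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    Hypothetical helper: no intercept, fixed number of sweeps.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            corr_j = 0.0  # feature j against the residual without feature j
            norm_j = 0.0  # squared norm of feature j
            for i in range(n):
                pred_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                corr_j += X[i][j] * (y[i] - pred_others)
                norm_j += X[i][j] ** 2
            beta[j] = soft_threshold(corr_j / n, lam) / (norm_j / n)
    return beta


def predict(X, beta):
    """Linear prediction for each row of X."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def mean_abs_error(estimated, reference):
    """Mean absolute difference between estimated and reference scores."""
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(reference)


# Made-up toy data: feature 0 drives the score, feature 1 is irrelevant.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [3.0, -3.0, 3.0, -3.0]

beta = lasso_fit(X, y, lam=0.5)
mae = mean_abs_error(predict(X, beta), y)
print(beta, mae)
```

In this toy case the L1 penalty sets the coefficient of the irrelevant feature to exactly zero and shrinks the other one, which is the feature-selection behaviour that makes LASSO attractive when many candidate features are available.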

General

Two-Level LSTM for Sentiment Analysis With Lexicon Embedding and Polar Flipping.

In IEEE transactions on cybernetics

Sentiment analysis is a key component in various text mining applications. Numerous sentiment classification techniques, including conventional and deep-learning-based methods, have been proposed in the literature. In most existing methods, a high-quality training set is assumed to be given. Nevertheless, constructing a high-quality training set that consists of highly accurate labels is challenging in real applications. This difficulty stems from the fact that text samples usually contain complex sentiment representations, and their annotation is subjective. We address this challenge in this study by leveraging a new labeling strategy and utilizing a two-level long short-term memory network to construct a sentiment classifier. Lexical cues are useful for sentiment analysis, and they have been utilized in conventional studies. For example, polar and negation words play important roles in sentiment analysis. A new encoding strategy, that is, ρ-hot encoding, is proposed to alleviate the drawbacks of one-hot encoding and, thus, effectively incorporate useful lexical cues. Moreover, the sentimental polarity of a word may change in different sentences due to label noise or context. A flipping model is proposed to model the polar flipping of words in a sentence. We compile three Chinese datasets on the basis of our label strategy and proposed methodology. Experiments demonstrate that the proposed method outperforms state-of-the-art algorithms on both benchmark English data and our compiled Chinese data.

Wu Ou, Yang Tao, Li Mengyang, Li Ming


Pathology

Multiplex Cellular Communities in Multi-Gigapixel Colorectal Cancer Histology Images for Tissue Phenotyping.

In IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

In computational pathology, automated tissue phenotyping in cancer histology images is a fundamental tool for profiling tumor microenvironments. Current tissue phenotyping methods use features derived from image patches which may not carry biological significance. In this work, we propose a novel multiplex cellular community-based algorithm for tissue phenotyping integrating cell-level features within a graph-based hierarchical framework. We demonstrate that such integration offers better performance compared to prior deep learning and texture-based methods as well as to cellular community-based methods using uniplex networks. To this end, we construct cell-level graphs using texture, alpha diversity and multi-resolution deep features. Using these graphs, we compute cellular connectivity features which are then employed for the construction of a patch-level multiplex network. Over this network, we compute multiplex cellular communities using a novel objective function. The proposed objective function computes a low-dimensional subspace from each cellular network and subsequently seeks a common low-dimensional subspace using the Grassmann manifold. We evaluate our proposed algorithm on three publicly available datasets for tissue phenotyping, demonstrating a significant improvement over existing state-of-the-art methods.

Javed Sajid, Mahmood Arif, Werghi Naoufel, Benes Ksenija, Rajpoot Nasir
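
The idea of computing one subspace per network layer and then seeking a common subspace on the Grassmann manifold can be illustrated with a generic subspace-merging formulation. This sketch is not the paper's novel objective function, which the abstract does not spell out. It assumes a well-known spectral approach instead: each layer's subspace is spanned by the k smallest eigenvectors of its graph Laplacian, and the common subspace is taken from a merged matrix that rewards closeness to every layer's subspace. The function names, the weight `alpha`, and the toy adjacency matrices are hypothetical.

```python
import numpy as np


def laplacian(adj):
    """Unnormalised graph Laplacian L = D - A for a symmetric adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj


def smallest_eigvecs(sym, k):
    """Orthonormal n x k basis of the k eigenvectors with smallest eigenvalues."""
    _, vecs = np.linalg.eigh(sym)  # eigh returns eigenvalues in ascending order
    return vecs[:, :k]


def common_subspace(adjs, k, alpha=0.5):
    """Per-layer subspaces plus one merged subspace (alpha is an assumed weight)."""
    laps = [laplacian(a) for a in adjs]
    bases = [smallest_eigvecs(lap, k) for lap in laps]
    # Subtracting the projectors u u^T pulls the solution toward each layer's subspace.
    merged = sum(laps) - alpha * sum(u @ u.T for u in bases)
    return smallest_eigvecs(merged, k), bases


def projection_distance_sq(u, v):
    """Squared projection distance between span(u) and span(v): k - ||u^T v||_F^2."""
    return u.shape[1] - np.linalg.norm(u.T @ v, "fro") ** 2


# Two made-up layers over four nodes with communities {0, 1} and {2, 3};
# the second layer adds a weak link between the communities.
layer_1 = np.array([[0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
layer_2 = layer_1.copy()
layer_2[1, 2] = layer_2[2, 1] = 0.1

U, bases = common_subspace([layer_1, layer_2], k=2)
print(projection_distance_sq(U, bases[0]), projection_distance_sq(U, bases[1]))
```

The rows of `U` could then be clustered (for example with k-means) to obtain communities shared across layers; that step is omitted here to keep the sketch short.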


General

Heterogeneous Graph Attention Network for Unsupervised Multiple-Target Domain Adaptation.

In IEEE transactions on pattern analysis and machine intelligence ; h5-index 127.0

Domain adaptation, which transfers knowledge from a label-rich source domain to unlabeled target domains, is a challenging task in machine learning. Prior domain adaptation methods focus on the pairwise adaptation assumption, with a single source and a single target domain, while little work concerns the scenario of one source domain and multiple target domains. Applying pairwise adaptation methods to this setting may be suboptimal, as they fail to consider the semantic association among multiple target domains. In this work we propose a deep semantic information propagation approach in the novel context of multiple unlabeled target domains and one labeled source domain. Our model aims to learn a unified subspace common to all domains with a heterogeneous graph attention network, where the transductive ability of the graph attention network can conduct semantic propagation of the related samples among multiple domains. In particular, the attention mechanism is applied to optimize the relationships of multiple domain samples for better semantic transfer. Then, the pseudo labels of the target domains predicted by the graph attention network are utilized to learn domain-invariant representations by aligning the labeled source centroids and the pseudo-labeled target centroids. We test our approach on four challenging public datasets, and it outperforms several popular domain adaptation methods.

Yang Xu, Deng Cheng, Liu Tongliang, Tao Dacheng
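
The centroid alignment step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the loss, so squared Euclidean distance between per-class centroids is assumed, the function names are hypothetical, and the feature vectors and pseudo labels are made-up toy values rather than outputs of a graph attention network.

```python
from collections import defaultdict


def class_centroids(features, labels):
    """Mean feature vector for each class label."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def centroid_alignment_loss(src_feats, src_labels, tgt_feats, tgt_pseudo_labels):
    """Sum of squared distances between source and target centroids per shared class.

    The distance choice is an assumption; classes missing from either domain are skipped.
    """
    src_c = class_centroids(src_feats, src_labels)
    tgt_c = class_centroids(tgt_feats, tgt_pseudo_labels)
    shared = sorted(set(src_c) & set(tgt_c))
    return sum(
        sum((a - b) ** 2 for a, b in zip(src_c[c], tgt_c[c])) for c in shared
    )


# Made-up toy features: labeled source samples and pseudo-labeled target samples.
source_features = [[0.0, 0.0], [2.0, 0.0], [4.0, 4.0]]
source_labels = [0, 0, 1]
target_features = [[1.0, 1.0], [5.0, 4.0]]
target_pseudo_labels = [0, 1]

loss = centroid_alignment_loss(
    source_features, source_labels, target_features, target_pseudo_labels
)
print(loss)
```

Minimising such a term with respect to the feature extractor would pull same-class centroids from different domains together, which is one common way to encourage domain-invariant representations.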


General

Predicting Hydrophobicity by Learning Spatiotemporal Features of Interfacial Water Structure: Combining Molecular Dynamics Simulations with Convolutional Neural Networks.

In The journal of physical chemistry. B

The hydrophobicity of functionalized interfaces can be quantified by the structure and dynamics of water molecules using molecular dynamics (MD) simulations, but existing methods to quantify interfacial hydrophobicity are computationally expensive. In this work, we develop a new machine learning approach that leverages convolutional neural networks (CNNs) to predict the hydration free energy (HFE) as a measure of interfacial hydrophobicity based on water positions sampled from MD simulations. We construct a set of idealized self-assembled monolayers (SAMs) with varying surface polarities and calculate their HFEs using indirect umbrella sampling calculations (INDUS). Using the INDUS-calculated HFEs as labels and physically informed representations of interfacial water density from MD simulations as input, we train and evaluate a series of neural networks to predict SAM HFEs. By systematically varying model hyperparameters, we demonstrate that a 3D CNN trained to analyze both spatial and temporal correlations between interfacial water molecule positions leads to HFE predictions that require an order of magnitude less MD simulation time than INDUS. We showcase the power of this model to explore a large design space by predicting HFEs for a set of 71 chemically heterogeneous SAMs with varying patterns and mole fractions.

Kelkar Atharva Shailendra, Dallin Bradley C, Van Lehn Reid C