Surgery Surgery

Evaluation of predictive role of carcinoembryonic antigen and salivary mRNA biomarkers in gastric cancer detection.

In Medicine

We explored the potential of combining carcinoembryonic antigen (CEA) and salivary mRNAs for gastric cancer (GC) detection. This study comprised two phases: a biomarker discovery phase and an independent validation phase. In the discovery phase, we measured CEA levels in blood samples and the expression levels of four messenger RNAs (SPINK7, PPL, SEMA4B, SMAD4) in saliva samples from 140 GC patients and 140 healthy controls. We evaluated the clinical performance of each biomarker and developed a predictive model using a machine-learning algorithm to differentiate GC patients from healthy controls. Our biomarker panel discriminated GC patients from healthy controls with both high sensitivity (0.94) and high specificity (0.91). We next applied the panel in the independent validation phase, for which we recruited a new cohort of 60 GC patients and 60 healthy controls. In this cohort, the panel discriminated GC patients from healthy controls with a sensitivity of 0.92 and a specificity of 0.87. A combination of blood CEA and salivary messenger RNA could be a promising approach to detect GC.

Xu Fei, Jiang Meiquan
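
As a reading aid for the sensitivity and specificity figures in the abstract above, the following minimal sketch shows how the two metrics are computed from a confusion matrix. The counts are hypothetical: the abstract does not report them, and they were chosen only to be consistent with a cohort of 60 patients and 60 controls and with the rounded validation values (0.92 and 0.87).

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true cases (GC patients) that the panel flags as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of healthy controls that the panel correctly leaves unflagged."""
    return tn / (tn + fp)


# Hypothetical validation-phase counts (NOT taken from the paper).
tp, fn = 55, 5   # 60 GC patients
tn, fp = 52, 8   # 60 healthy controls

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With cohorts of this size, each misclassified participant shifts either metric by roughly 0.017, which is worth keeping in mind when comparing the discovery and validation figures.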


Dermatology Dermatology

Serum markers improve current prediction of metastasis development in early-stage melanoma patients: a machine learning-based study.

In Molecular oncology

Metastasis development represents an important threat for melanoma patients, even when the disease is diagnosed at an early stage and the primary tumor has been removed. In this scenario, identifying prognostic biomarkers would be of great interest. Serum contains information about the general status of the organism and therefore represents a valuable source of biomarkers. Thus, we aimed to define serological biomarkers that could be used along with clinical and histopathological features of the disease to predict metastatic events in the early-stage patient population. We previously demonstrated that in stage II melanoma patients, serum levels of dermcidin (DCD) were associated with metastatic progression. Based on the relevance of the immune response to cancer progression and the recent association of DCD with local and systemic immune responses against cancer cells, serum DCD was analyzed in a new cohort of patients along with IL-4, IL-6, IL-10, IL-17A, IFNγ, TGFβ and GM-CSF. We initially recruited 448 melanoma patients, 323 of whom were diagnosed as stages I-II according to AJCC. Levels of the selected cytokines were determined by ELISA and Luminex, and the data obtained were analyzed using machine learning and Kaplan-Meier techniques to define an algorithm capable of accurately classifying early-stage melanoma patients as being at high or low risk of developing metastasis. The results show that in early-stage melanoma patients, serum levels of IL-4, GM-CSF and DCD, together with the Breslow thickness, are the variables that best predict melanoma metastasis. Moreover, the resulting algorithm represents a new tool to discriminate subjects with a good prognosis from those at high risk of future metastasis.

Mancuso Filippo, Lage Sergio, Rasero Javier, Díaz-Ramón José Luis, Apraiz Aintzane, Pérez-Yarza Gorka, Ezkurra Pilar Ariadna, Penas Cristina, Sánchez-Diez Ana, García-Vazquez María Dolores, Gardeazabal Jesús, Izu Rosa, Mujika Karmele, Cortés Jesús, Asumendi Aintzane, Boyano María Dolores


dermcidin, interleukins, melanoma, prognosis, serum biomarkers

General General

Noninvasive detection of focal seizures in ambulatory patients.

In Epilepsia

Reliably detecting focal seizures without secondary generalization during daily life activities, chronically, using convenient portable or wearable devices, would offer patients with active epilepsy a number of potential benefits, such as providing more reliable seizure count to optimize treatment and seizure forecasting, and triggering alarms to promote safeguarding interventions. However, no generic solution is currently available to reach these objectives. A number of biosignals are sensitive to specific forms of focal seizures, in particular heart rate and its variability for seizures affecting the neurovegetative system, and accelerometry for those responsible for prominent motor activity. However, most studies demonstrate high rates of false detection or poor sensitivity, with only a minority of patients benefiting from acceptable levels of accuracy. To tackle this challenging issue, several lines of technological progress are envisioned, including multimodal biosensing with cross-modal analytics, a combination of embedded and distributed self-aware machine learning, and ultra-low-power design to enable appropriate autonomy of such sophisticated portable solutions.

Ryvlin Philippe, Cammoun Leila, Hubbard Ilona, Ravey France, Beniczky Sandor, Atienza David


focal seizure, seizure detection, wearable devices

General General

Combining Cloud-Based Free Energy Calculations, Synthetically Aware Enumerations and Goal-Directed Generative Machine Learning for Rapid Large Scale Chemical Exploration and Optimization.

In Journal of chemical information and modeling

The hit identification process usually involves the profiling of millions, and more recently billions, of compounds via either traditional experimental high throughput screens (HTS) or computational virtual high throughput screens (vHTS). We have previously demonstrated that by coupling reaction-based enumeration, active learning and free energy calculations, a similarly large scale exploration of chemical space can be extended to the hit-to-lead process. In this work, we augment that approach by coupling large scale enumeration and cloud-based FEP profiling with goal-directed generative machine learning, which results in a higher enrichment of potent ideas compared to large scale enumeration alone, while simultaneously staying within the bounds of a predefined drug-like property space. We achieve this by building the molecular distribution for generative machine learning from the PathFinder rules-based enumeration and optimizing for a weighted-sum, QSAR-based multi-parameter optimization function. We examine the utility of this combined approach by designing potent inhibitors of cyclin-dependent kinase 2 (CDK2) and demonstrate a coupled workflow that can: (1) provide a 6.4-fold enrichment improvement in identifying < 10 nM compounds over random selection, and a 1.5-fold enrichment in identifying < 10 nM compounds over our previous method, (2) rapidly explore relevant chemical space outside the bounds of commercial reagents, (3) use generative ML approaches to "learn" the SAR from large scale in silico enumerations and generate novel idea molecules for a flexible receptor site that are both potent and within relevant physicochemical space, and (4) produce over 3,000,000 idea molecules and run 2153 FEP simulations, identifying 69 ideas with a predicted IC50 < 10 nM and 358 ideas with a predicted IC50 < 100 nM.
The reported data suggest combining both reaction-based and generative machine learning for ideation results in a higher enrichment of potent compounds over previously described approaches, and can rapidly accelerate the discovery of novel chemical matter within a predefined potency and property space.

Ghanakota Phani, Bos Pieter H, Konze Kyle, Staker Joshua, Marques Gabriel, Marshall Kyle, Leswing Karl, Abel Robert, Bhat Sathesh


General General

Accelerating ab Initio Simulation via Nested Monte Carlo and Machine Learned Reference Potentials.

In The journal of physical chemistry. B

As a corollary of the rapid advances in computing, ab initio simulation is playing an increasingly important role in modeling materials at the atomic scale. Two strategies are possible: ab initio Monte Carlo (AIMC) and ab initio molecular dynamics (AIMD) simulation. The former benefits from exact sampling from the correct thermodynamic distribution, while the latter is typically more efficient thanks to its collective all-atom coordinate updates. Here, using a relatively simple test model composed of helium and argon, we show that AIMC can be brought up to, and even above, the performance levels of AIMD via a hybrid nested sampling / machine learning (ML) strategy. ML provides an accurate classical reference potential (up to three-body explicit interactions) that can pilot long collective Monte Carlo moves that are accepted or rejected in toto, à la nested Monte Carlo (NMC); this is in contrast to the single-move nature of a naive implementation. Our proposed method requires only a small up-front expense from evaluating the ab initio energies and forces of O(100) random configurations for training. Importantly, our method does not rely totally on the trained, assuredly imperfect, interaction. We show that high performance and exact sampling at the desired level of theory can be realized even when the trained interaction has appreciable differences from the ab initio potential. Remarkably, at the highest levels of performance realized via our approach, a pair of statistically uncorrelated atomic configurations can be generated with O(1) ab initio calculations.

Jadrich Ryan B, Leiding Jeffery A
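
The nested Monte Carlo idea described in the abstract above can be illustrated on a toy problem. The sketch below is not the authors' implementation: it uses a single coordinate, a quartic function `e_full` as a stand-in for the expensive ab initio energy, and a harmonic function `e_ref` as a stand-in for the machine-learned reference potential. All names and parameters are assumptions made for illustration. A cheap inner Metropolis chain runs on the reference potential, and the whole sequence of moves is then accepted or rejected with a single evaluation of the expensive energy.

```python
import math
import random


def e_full(x: float) -> float:
    """Stand-in for the expensive (ab initio) energy. Hypothetical toy model."""
    return 0.5 * x * x + 0.1 * x ** 4


def e_ref(x: float) -> float:
    """Stand-in for the cheap, imperfect reference potential. Hypothetical."""
    return 0.5 * x * x


def inner_chain(x, beta, n_inner, step, rng):
    """Plain Metropolis walk that samples exp(-beta * e_ref)."""
    for _ in range(n_inner):
        trial = x + rng.uniform(-step, step)
        delta = e_ref(trial) - e_ref(x)
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            x = trial
    return x


def nested_mc(n_outer, n_inner=20, beta=1.0, step=1.0, seed=0):
    """Return samples of exp(-beta * e_full), acceptance rate, e_full calls."""
    rng = random.Random(seed)
    x = 0.0
    e_x = e_full(x)
    full_calls = 1
    accepted = 0
    samples = []
    for _ in range(n_outer):
        trial = inner_chain(x, beta, n_inner, step, rng)
        e_trial = e_full(trial)  # one expensive call per composite move
        full_calls += 1
        # Outer test uses the difference between full and reference energy
        # changes, which corrects exactly for the reference's imperfection.
        delta = (e_trial - e_x) - (e_ref(trial) - e_ref(x))
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            x, e_x = trial, e_trial
            accepted += 1
        samples.append(x)
    return samples, accepted / n_outer, full_calls


samples, acc_rate, full_calls = nested_mc(2000)
print(f"outer acceptance rate: {acc_rate:.2f}, expensive calls: {full_calls}")
```

Because the inner chain obeys detailed balance with respect to the reference distribution, the outer acceptance test depends only on how much the two potentials disagree about the energy change. The closer the reference is to the full potential, the longer the inner chain can be made before composite moves start to be rejected.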


General General

Identification of Potential PBT/POP-Like Chemicals by a Deep Learning Approach Based on 2D Structural Features.

In Environmental science & technology ; h5-index 132.0

Identifying potential persistent organic pollutants (POPs) and persistent, bioaccumulative, and toxic (PBT) substances from industrial chemical inventories is essential for chemical risk assessment, management, and pollution control. Inspired by the connections between chemical structures and their properties, a deep convolutional neural network (DCNN) model was developed to screen potential PBT/POP-like chemicals. For each chemical, a two-dimensional molecular descriptor representation matrix based on 2424 molecular descriptors was used as the model input. The DCNN model was trained via a supervised learning algorithm with 1306 PBT/POP-like chemicals and 9990 chemicals currently known as non-POPs/PBTs. The model achieved an average prediction accuracy of 95.3±0.6% and an F-measure of 79.3±2.5% for PBT/POP-like chemicals (positive samples only) on external datasets. The DCNN model was further evaluated with 52 experimentally determined PBT chemicals in the REACH PBT assessment list and correctly recognized 47 chemicals as PBT/non-PBT chemicals. The DCNN model yielded a total of 4011 suspected PBT/POP-like chemicals from 58,079 chemicals merged from five published industrial chemical lists. The proportions of PBT/POP-like substances in the chemical inventories were 6.9-7.8%, higher than a previous estimate of 3-5%. Although additional PBT/POP chemicals were identified, no new family of PBT/POP-like chemicals was observed.

Sun Xiangfei, Zhang Xianming, Muir Derek C G, Zeng Eddy Y
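
The gap between the accuracy and the F-measure reported in the abstract above is typical of imbalanced classification, where negatives far outnumber positives. The sketch below uses a hypothetical confusion matrix for 1000 chemicals. The counts are not from the paper and were picked only so that the resulting metrics match the reported averages. It also assumes that the F-measure is the balanced F1 score, which the abstract does not state.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical external-set counts (NOT taken from the paper).
tp, fn = 90, 25    # 115 PBT/POP-like chemicals
fp, tn = 22, 863   # 885 non-PBT/POP chemicals

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

The large pool of correctly rejected negatives lifts accuracy, while F1 ignores true negatives and so reflects only how well the minority class is recovered.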