Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

General

Machine-Learning Guided Quantum Chemical and Molecular Dynamics Calculations to Design Novel Hole Conducting Organic Materials.

In The journal of physical chemistry. A

Materials exhibiting higher mobilities than conventional organic semiconducting materials such as fullerenes and fused thiophenes are in high demand for applications such as printed electronics, organic solar cells, and image sensors. To discover new molecules that might show improved charge mobility, combined density functional theory (DFT) and molecular dynamics (MD) calculations were performed, guided by predictions from machine learning (ML). An ML model was constructed from 32 theoretically calculated hole mobilities for thiophene derivatives, benzodifuran derivatives, a carbazole derivative, and a perylene diimide derivative, with a maximum value of 10^-1.96 cm2/(Vs). Sequential learning, also known as active learning, was applied to select compounds on which to perform DFT/MD calculations of hole mobility, simultaneously improving the mobility surrogate model and identifying high-mobility compounds. By performing 60 cycles of sequential learning with 165 DFT/MD calculations, a molecule having a fused thioacene structure with a calculated hole mobility of 10^-1.86 cm2/(Vs) was identified. This value is higher than the maximum mobility in the initial training dataset, showing that an extrapolative discovery could be made with sequential learning.
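The sequential-learning workflow the abstract describes can be sketched as a loop: fit a surrogate model on the labeled compounds, pick the most promising unlabeled candidate, run the expensive calculation on it, and repeat. Below is a minimal sketch, not the authors' implementation: the `dft_md_oracle` function is a hypothetical stand-in for a DFT/MD mobility calculation, the 2-D feature vectors are synthetic, and a scikit-learn random forest plays the role of the surrogate.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical stand-in for a DFT/MD calculation: each "molecule" is a
# feature vector, and the oracle returns its (log10) hole mobility.
def dft_md_oracle(x):
    return -2.5 + 0.5 * np.sin(x[:, 0]) + 0.3 * x[:, 1]

candidates = rng.uniform(-1, 1, size=(500, 2))               # unlabeled pool
labeled_idx = list(rng.choice(500, size=32, replace=False))  # initial training set

for cycle in range(10):                      # the paper ran 60 such cycles
    X = candidates[labeled_idx]
    y = dft_md_oracle(X)                     # "run" DFT/MD on labeled compounds
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    # Acquisition step: select the candidate with the highest predicted mobility
    pred = model.predict(candidates)
    pred[labeled_idx] = -np.inf              # never re-select a labeled point
    labeled_idx.append(int(np.argmax(pred)))

best = dft_md_oracle(candidates[labeled_idx]).max()
```

A greedy acquisition is used here for brevity; uncertainty-aware criteria (e.g. upper confidence bounds from the forest's tree variance) are common alternatives for balancing exploration against exploitation.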

Antono Erin, Matsuzawa Nobuyuki N, Ling Julia, Saal James Edward, Arai Hideyuki, Sasago Masaru, Fujii Eiji


General

Machine Learning Guided 3D Printing of Tissue Engineering Scaffolds.

In Tissue engineering. Part A

Various material compositions have been successfully used in 3D printing with promising applications as scaffolds in tissue engineering. However, identifying suitable printing conditions for new materials requires extensive experimentation in a time- and resource-demanding process. This study investigates the use of machine learning (ML) for distinguishing between printing configurations that are likely to result in low-quality prints and printing configurations that are more promising, as a first step toward the development of a recommendation system for identifying suitable printing conditions. The ML-based framework takes as input the printing conditions, namely the material composition and the printing parameters, and predicts the quality of the resulting print as either "low" or "high". We investigate two ML-based approaches: a direct classification-based approach that trains a classifier to distinguish between "low" and "high" quality prints, and an indirect approach that uses a regression ML model to approximate the values of a printing quality metric. Both models are built upon Random Forests. We trained and evaluated the models on a dataset generated in a previous study that investigated fabrication of porous polymer scaffolds by means of extrusion-based 3D printing with a full-factorial design. Our results show that both models were able to correctly label the majority of the tested configurations, while a simpler linear ML model was not effective. Additionally, our analysis showed that a full-factorial design for data collection can lead to redundancies in the data in the context of ML, and we propose a more efficient data collection strategy.
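The direct vs. indirect distinction the abstract draws can be illustrated in a few lines: the direct model classifies low/high quality outright, while the indirect model regresses a continuous quality metric and thresholds it afterwards. This is a minimal sketch on synthetic data, assuming hypothetical printing-condition features and a made-up quality metric; it is not the authors' dataset or code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(1)

# Hypothetical printing conditions: [material fraction, nozzle temp, speed]
X = rng.uniform(size=(300, 3))
quality = X[:, 0] * 2 - X[:, 2] + rng.normal(0, 0.1, 300)  # synthetic quality metric
labels = (quality > quality.mean()).astype(int)            # 1 = "high", 0 = "low"

X_train, X_test = X[:200], X[200:]

# Direct approach: train a classifier on the binary quality labels
clf = RandomForestClassifier(random_state=0).fit(X_train, labels[:200])
direct_pred = clf.predict(X_test)

# Indirect approach: regress the quality metric, then threshold its predictions
reg = RandomForestRegressor(random_state=0).fit(X_train, quality[:200])
indirect_pred = (reg.predict(X_test) > quality.mean()).astype(int)

direct_acc = (direct_pred == labels[200:]).mean()
indirect_acc = (indirect_pred == labels[200:]).mean()
```

The indirect route keeps the continuous metric available for ranking configurations, at the cost of needing a threshold; the direct route optimizes the decision boundary itself.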

Conev Anja, Litsa Eleni, Perez Marissa, Diba Mani, Mikos Antonios G, Kavraki Lydia


General

Artificial Intelligence Effecting a Paradigm Shift in Drug Development.

In SLAS technology

The inverse relationship between the cost of drug development and the successful integration of drugs into the market has resulted in the need for innovative solutions to overcome this burgeoning problem. This problem could be attributed to several factors, including the premature termination of clinical trials, regulatory factors, or decisions made in the earlier drug development processes. The introduction of artificial intelligence (AI) to accelerate and assist drug development has resulted in cheaper and more efficient processes, ultimately improving the success rates of clinical trials. This review aims to showcase and compare the different applications of AI technology that aid automation and improve success in drug development, particularly in novel drug target identification and design, drug repositioning, biomarker identification, and effective patient stratification, through exploration of different disease landscapes. In addition, it highlights how these technologies are translated into the clinic. This paradigm shift will lead to even greater advancements in the integration of AI into automated drug development and discovery processes, bringing precision and personalized medicine closer to reality.

Rashid Masturah Bte Mohd Abdul


artificial intelligence, drug development, drug discovery, industry

General

Mathematical Models of Meal Amount and Timing Variability With Implementation in the Type-1 Diabetes Patient Decision Simulator.

In Journal of diabetes science and technology; h5-index 38.0

BACKGROUND : In type 1 diabetes (T1D) research, in-silico clinical trials (ISCTs) have proven effective in accelerating the development of new therapies. However, published simulators lack a realistic description of some aspects of patient lifestyle that can markedly affect glucose control. In this paper, we develop a mathematical description of meal carbohydrate (CHO) amounts and timing, with the aim of improving the meal generation module in the T1D Patient Decision Simulator (T1D-PDS) published in Vettoretti et al.

METHODS : Data from 32 T1D subjects under free-living conditions for 4874 days were used. Univariate probability density function (PDF) parametric models with different candidate shapes were fitted, individually, against sample distributions of: CHO amounts of breakfast (CHOB), lunch (CHOL), dinner (CHOD), and snacks (CHOS); breakfast timing (TB); and the times between breakfast and lunch (TBL) and between lunch and dinner (TLD). Furthermore, a support vector machine (SVM) classifier was developed to predict the occurrence of a snack in future fixed-length time windows. Once the models were embedded inside the T1D-PDS, an ISCT was performed.

RESULTS : The resulting PDF models were: gamma (CHOB, CHOS), lognormal (CHOL, TB), loglogistic (CHOD), and generalized extreme value (TBL, TLD). The SVM showed a classification accuracy of 0.8 on the test set. The distributions of simulated meal data were not statistically different from the distributions of the real data used to develop the models (α = 0.05).

CONCLUSIONS : The models of meal amount and timing variability developed are suitable for describing real data. Their inclusion in modules that describe patient behavior in the T1D-PDS can permit investigators to perform more realistic, reliable, and insightful ISCTs.
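The core modeling step, fitting a parametric PDF to meal data and then sampling from it to generate simulated meals, can be sketched with SciPy. This is a minimal sketch assuming synthetic breakfast CHO data; the gamma shape is the one the paper selected for CHOB, but the parameter values below are invented for illustration, not taken from the study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical breakfast CHO amounts in grams; real data would come from
# the 32 patients' free-living records.
chob = rng.gamma(shape=4.0, scale=10.0, size=500)

# Fit a gamma PDF (the shape selected for CHOB), with location fixed at 0
shape, loc, scale = stats.gamma.fit(chob, floc=0)

# Draw simulated breakfasts from the fitted model, as a meal generator would
simulated = stats.gamma.rvs(shape, loc=loc, scale=scale, size=500, random_state=3)

# Two-sample Kolmogorov-Smirnov test: do simulated and "real" data agree?
ks = stats.ks_2samp(chob, simulated)
```

The same pattern (fit candidate shapes, compare goodness of fit, sample from the winner) extends to the lognormal, loglogistic, and generalized extreme value distributions chosen for the other meal variables.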

Camerlingo Nunzio, Vettoretti Martina, Del Favero Simone, Facchinetti Andrea, Sparacino Giovanni


in-silico clinical trials, machine learning, maximum absolute difference, parametric modelling, support vector machine

General

Automatic diagnosis of macular diseases from OCT volume based on its two-dimensional feature map and convolutional neural network with attention mechanism.

In Journal of biomedical optics

SIGNIFICANCE : Automatic and accurate classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images is essential for assisting ophthalmologists in the diagnosis and grading of macular diseases. Therefore, more effective OCT volume classification for automatic recognition of macular diseases is needed.

AIM : For OCT volumes in which only volume-level labels are known, OCT volume classifiers based on global features and deep learning are designed, validated, and compared with other methods.

APPROACH : We present a general framework to classify OCT volumes for automatic recognition of macular diseases. The architecture of the framework consists of three modules: a B-scan feature extractor, two-dimensional (2-D) feature map generation, and a volume-level classifier. Our architecture can address OCT volume classification using 2-D image machine learning classification algorithms. Specifically, a convolutional neural network (CNN) model is trained and used as a B-scan feature extractor to construct a 2-D feature map of an OCT volume, and volume-level classifiers for the 2-D feature maps, such as a support vector machine and a CNN with/without an attention mechanism, are described.

RESULTS : Our proposed methods are validated on the publicly available Duke dataset, which consists of 269 intermediate age-related macular degeneration (AMD) volumes and 115 normal volumes. Fivefold cross-validation was performed, and average accuracy, sensitivity, and specificity of 98.17%, 99.26%, and 95.65%, respectively, were achieved. The experiments show that our methods outperform the state-of-the-art methods. Our methods are also validated on our private clinical OCT volume dataset, consisting of 448 AMD volumes and 462 diabetic macular edema volumes.

CONCLUSIONS : We present a general framework of OCT volume classification based on its 2-D feature map and CNN with attention mechanism and describe its implementation schemes. Our proposed methods could classify OCT volumes automatically and effectively with high accuracy, and they are a potential practical tool for screening of ophthalmic diseases from OCT volume.
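The pipeline the abstract outlines, per-B-scan feature extraction, stacking into a 2-D feature map, then attention-weighted pooling into a volume-level descriptor, can be sketched in NumPy. This is a minimal sketch, not the authors' CNN: a fixed random projection stands in for the trained B-scan feature extractor, and the attention scoring vector `w` stands in for learned attention weights.

```python
import numpy as np

rng = np.random.default_rng(4)

def bscan_features(bscan):
    # Stand-in for a trained CNN feature extractor: a fixed random
    # projection of the flattened B-scan to a 64-dim feature vector.
    proj = np.random.default_rng(0).normal(size=(bscan.size, 64))
    return bscan.ravel() @ proj / bscan.size

def volume_feature_map(volume):
    # One feature row per B-scan -> a (num_bscans, 64) 2-D feature map
    return np.stack([bscan_features(b) for b in volume])

def attention_pool(feature_map, w):
    # Attention: softmax over per-B-scan scores, so diagnostically
    # relevant slices contribute more to the volume-level descriptor.
    scores = feature_map @ w
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ feature_map

volume = rng.normal(size=(100, 32, 32))   # 100 synthetic B-scans of 32x32 pixels
fmap = volume_feature_map(volume)
w = rng.normal(size=64)
descriptor = attention_pool(fmap, w)      # volume-level feature vector
```

The pooled descriptor is what a volume-level classifier (SVM or CNN head) would consume; replacing the softmax attention with a plain mean recovers the no-attention variant the paper compares against.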

Sun Yankui, Zhang Haoran, Yao Xianlin


attention mechanism, convolutional neural network, image classification, optical coherence tomography, transfer learning

General

Deep learning in proteomics.

In Proteomics

Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures have been comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction, and it thrives in data-rich scientific domains. Here, we provide a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding affinity prediction, and protein structure prediction. We also discuss the limitations and future directions of deep learning in proteomics. We hope this review will provide readers with an overview of deep learning and how it can be used to analyze proteomics data.

Wen Bo, Zeng Wenfeng, Liao Yuxing, Shi Zhiao, Savage Sara R, Jiang Wen, Zhang Bing


bioinformatics, deep learning, proteomics