Oncology

Improving prognostic performance in resectable pancreatic ductal adenocarcinoma using radiomics and deep learning features fusion in CT images.

In Scientific reports ; h5-index 158.0

As an analytic pipeline for quantitative imaging feature extraction and analysis, radiomics has grown rapidly in the past decade. Meanwhile, recent advances in deep learning and transfer learning have shown significant potential in quantitative medical imaging, raising the research question of whether deep transfer learning features carry predictive information beyond that of radiomics features. In this study, using CT images from Pancreatic Ductal Adenocarcinoma (PDAC) patients recruited in two independent hospitals, we found that most transfer learning features have weak linear relationships with radiomics features, suggesting a potential complementary relationship between these two feature sets. We also tested the prognostic performance for overall survival using four feature fusion and reduction methods for combining radiomics and transfer learning features, and compared the results with our proposed risk score-based feature fusion method. The risk score-based feature fusion method significantly improved prognostic performance for predicting overall survival in PDAC patients compared to other traditional feature reduction methods used in previous radiomics studies (a 40% increase in area under the ROC curve (AUC), yielding an AUC of 0.84).

Zhang Yucheng, Lobo-Mueller Edrise M, Karanicolas Paul, Gallinger Steven, Haider Masoom A, Khalvati Farzad
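
The abstract above reports that most transfer learning features have weak linear relationships with radiomics features. A minimal sketch of how such a check could look is given below, assuming Pearson correlation as the measure of linear relationship and a cutoff of |r| < 0.3 for "weak"; neither the measure nor the cutoff is stated in the abstract, and the function names and toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / math.sqrt(var_x * var_y)


def weakly_correlated_fraction(deep_features, radiomics_features, threshold=0.3):
    """Fraction of deep features whose largest |r| against any radiomics
    feature stays below `threshold` (the 0.3 default is an assumption)."""
    weak = 0
    for deep in deep_features:
        max_abs_r = max(abs(pearson_r(deep, rad)) for rad in radiomics_features)
        if max_abs_r < threshold:
            weak += 1
    return weak / len(deep_features)


# Toy values, one list per feature, one entry per patient (hypothetical data).
radiomics = [[1, 2, 3, 4, 5, 6]]
deep = [
    [2, 4, 6, 8, 10, 12],   # perfectly linear in the radiomics feature
    [1, -1, 0, 0, -1, 1],   # no linear relationship with it
]
print(weakly_correlated_fraction(deep, radiomics))
```

A weak Pearson correlation only rules out a linear relationship, which is why the abstract speaks of a potential, not a proven, complementary relationship.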


General

G-computation and machine learning for estimating the causal effects of binary exposure statuses on binary outcomes.

In Scientific reports ; h5-index 158.0

In clinical research, there is a growing interest in the use of propensity score-based methods to estimate causal effects. G-computation (GC) is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary, and that can handle small samples. We evaluated the performance of several methods through simulations, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner. We proposed six different scenarios characterised by various sample sizes, numbers of covariates and relationships between covariates, exposure statuses, and outcomes. We also illustrated the application of these methods by estimating the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of GC, for estimating the individual outcome probabilities in two counterfactual worlds, we found that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation combined with the super learner performed well for drawing causal inferences, even from small sample sizes.

Le Borgne Florent, Chatton Arthur, Léger Maxime, Lenain Rémi, Foucher Yohann
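
The G-computation step described in the abstract above (predicting each individual's outcome probability in two counterfactual worlds, then averaging) can be sketched as follows. This is an illustration under strong simplifying assumptions, not the authors' pipeline: the outcome model here is a saturated frequency table over a single binary covariate rather than a super learner or any other machine learning model, and the data and function names are hypothetical.

```python
from collections import defaultdict


def fit_saturated_outcome_model(rows):
    """Estimate P(Y=1 | A=a, L=l) by the observed frequency in each stratum.

    `rows` holds (l, a, y) tuples: binary covariate, exposure, outcome.
    """
    counts = defaultdict(lambda: [0, 0])  # (a, l) -> [events, stratum size]
    for l, a, y in rows:
        counts[(a, l)][0] += y
        counts[(a, l)][1] += 1
    return {key: events / n for key, (events, n) in counts.items()}


def g_computation(rows):
    """Return (P[Y under A=1], P[Y under A=0], marginal risk difference).

    Assumes every (a, l) stratum is observed; otherwise a KeyError is raised.
    """
    q = fit_saturated_outcome_model(rows)
    n = len(rows)
    # Predict for everyone as if exposed, then as if unexposed, and average.
    p1 = sum(q[(1, l)] for l, _, _ in rows) / n
    p0 = sum(q[(0, l)] for l, _, _ in rows) / n
    return p1, p0, p1 - p0


def crude_risk_difference(rows):
    """Unadjusted comparison of exposed and unexposed groups, for contrast."""
    exposed = [y for _, a, y in rows if a == 1]
    unexposed = [y for _, a, y in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


# Hypothetical confounded data: exposure is more common when l == 1.
rows = (
    [(0, 1, 1)] * 1 + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 6
    + [(1, 1, 1)] * 6 + [(1, 1, 0)] * 2 + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 1
)
p1, p0, effect = g_computation(rows)
print(p1, p0, effect, crude_risk_difference(rows))
```

On this toy data the crude difference exceeds the G-computation estimate because the covariate drives both the exposure and the outcome. Replacing the frequency table with a fitted classifier's predicted probabilities gives the machine learning variant that the abstract evaluates.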


Ophthalmology

Inferred retinal sensitivity in recessive Stargardt disease using machine learning.

In Scientific reports ; h5-index 158.0

Spatially resolved retinal function can be measured by psychophysical testing such as fundus-controlled perimetry (FCP, or 'microperimetry'). It may serve as a performance outcome measure in emerging interventional clinical trials for macular diseases, as requested by regulatory agencies. Because FCP is a laborious examination, we evaluated a machine-learning-based approach to predict spatially resolved retinal function ('inferred sensitivity') from microstructural imaging (obtained by spectral domain optical coherence tomography) and patient data in recessive Stargardt disease. Using nested cross-validation, a prediction accuracy (mean absolute error, MAE [95% CI]) of 4.74 dB [4.48-4.99] was achieved. After additional inclusion of limited FCP data, the MAE reached 3.89 dB [3.67-4.10], comparable to the test-retest MAE estimate of 3.51 dB [3.11-3.91]. Analysis of the permutation importance revealed that the IS&OS and RPE thickness were the most important features for the prediction of retinal sensitivity. 'Inferred sensitivity' thus enables accurate estimation of differential effects of retinal microstructure on spatially resolved function in Stargardt disease, and might be used as a quasi-functional surrogate marker for a refined and time-efficient investigation of possible functionally relevant treatment effects or disease progression.

Müller Philipp L, Odainic Alexandru, Treis Tim, Herrmann Philipp, Tufail Adnan, Holz Frank G, Pfau Maximilian
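
The abstract above relies on two generic tools, mean absolute error (MAE) and permutation importance. The sketch below shows how both could be computed, assuming a stand-in prediction function and made-up feature columns; the toy model, its coefficients, and the data are hypothetical and are not taken from the study.

```python
import random


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, feature_index, n_repeats=5, seed=0):
    """Mean increase in MAE after shuffling one feature column.

    `predict` maps one feature row (a list) to a prediction. A larger
    increase means the model leans more heavily on that feature.
    """
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    increases = []
    for _ in range(n_repeats):
        column = [row[feature_index] for row in X]
        rng.shuffle(column)
        X_perm = [
            row[:feature_index] + [value] + row[feature_index + 1:]
            for row, value in zip(X, column)
        ]
        permuted = mean_absolute_error(y, [predict(row) for row in X_perm])
        increases.append(permuted - baseline)
    return sum(increases) / n_repeats


def toy_model(row):
    """Hypothetical stand-in for a fitted model: uses column 0, ignores column 1."""
    return 0.4 * row[0] + 5.0


# Column 0: a layer thickness (made-up values); column 1: an unrelated column.
X = [[10, 3], [20, 1], [30, 4], [40, 1], [50, 5], [60, 9]]
y = [toy_model(row) for row in X]

for index in (0, 1):
    print(index, permutation_importance(toy_model, X, y, index))
```

Shuffling the ignored column leaves every prediction unchanged, so its importance is zero, while shuffling the used column raises the MAE. In practice the importance would be measured on held-out data, for example within the outer folds of a nested cross-validation.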


General

Predicting bacteriophage hosts based on sequences of annotated receptor-binding proteins.

In Scientific reports ; h5-index 158.0

Nowadays, bacteriophages are increasingly considered as an alternative treatment for a variety of bacterial infections in cases where classical antibiotics have become ineffective. However, characterizing the host specificity of phages remains a labor- and time-intensive process. In order to alleviate this burden, we have developed a new machine-learning-based pipeline to predict bacteriophage hosts based on annotated receptor-binding protein (RBP) sequence data. We focus on predicting bacterial hosts from the ESKAPE group, Escherichia coli, Salmonella enterica and Clostridium difficile. We compare the performance of our predictive model with that of the widely used Basic Local Alignment Search Tool (BLAST). Our best-performing predictive model reaches Precision-Recall Area Under the Curve (PR-AUC) scores between 73.6 and 93.8% for different levels of sequence similarity in the collected data. Our model reaches a performance comparable to that of BLASTp when sequence similarity in the data is high and starts outperforming BLASTp when sequence similarity drops below 75%. Therefore, our machine learning methods can be especially useful in settings in which sequence similarity to other known sequences is low. Predicting the hosts of novel metagenomic RBP sequences could extend our toolbox to tune the host spectrum of phages or phage tail-like bacteriocins by swapping RBPs.

Boeckaerts Dimitri, Stock Michiel, Criel Bjorn, Gerstmans Hans, De Baets Bernard, Briers Yves
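
The PR-AUC scores quoted in the abstract above summarise the precision-recall curve as a single number. One common estimator of that area is average precision, sketched below; the abstract does not say which estimator the authors used, so this choice is an assumption, and the labels and scores are made up.

```python
def average_precision(labels, scores):
    """Average precision for binary labels (1 = positive) and model scores.

    Items are ranked by descending score; the precision at each positive's
    rank is averaged over all positives. Assumes scores are distinct.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical example: does each RBP sequence belong to a given host class?
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
print(average_precision(labels, scores))
```

Unlike ROC AUC, this metric depends on the share of positives in the data, which matters when some host classes are rare.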


Public Health

Does Last Year's Cost Predict the Present Cost? An Application of Machine Learning for the Japanese Area-Basis Public Health Insurance Database.

In International journal of environmental research and public health ; h5-index 73.0

The increasing healthcare cost imposes a large economic burden on the Japanese government. Predicting the healthcare cost may be useful for policy making. A database of the area-basis public health insurance of one city was analyzed with a machine learning strategy to predict the medical healthcare cost from the dental healthcare cost. The 30,340 subjects who were continuously registered in the area-basis public health insurance of Ebina city from April 2017 to September 2018 were analyzed. The total healthcare cost was JPY 13,548,831,930. The per capita healthcare cost was JPY 446,567. Medical healthcare cost, medication cost, and dental healthcare cost accounted for 78%, 15%, and 7% of the total, respectively. According to the neural network model, the medical healthcare cost depended proportionally on the medical healthcare cost of the previous year. The dental healthcare cost of the previous year had a reducing effect on the medical healthcare cost; however, the effect was very small. Oral health may be a risk factor for chronic diseases. However, when evaluated by the healthcare cost, its effect was very small during the observation period.

Nomura Yoshiaki, Ishii Yoshimasa, Chiba Yota, Suzuki Shunsuke, Suzuki Akira, Suzuki Senichi, Morita Kenji, Tanabe Joji, Yamakawa Koji, Ishiwata Yasuo, Ishikawa Meu, Sogabe Kaoru, Kakuta Erika, Okada Ayako, Otsuka Ryoko, Hanada Nobuhiro


dental healthcare cost, healthcare cost, medical healthcare cost, neural network, zero-inflated model

General

Learning the language of viral evolution and escape.

In Science (New York, N.Y.)

The ability of viruses to mutate and evade the human immune system and cause infection, called viral escape, remains an obstacle to antiviral and vaccine development. Understanding the complex rules that govern escape could inform therapeutic design. We modeled viral escape with machine learning algorithms originally developed for human natural language. We identified escape mutations as those that preserve viral infectivity but cause a virus to look different to the immune system, akin to word changes that preserve a sentence's grammaticality but change its meaning. With this approach, language models of influenza hemagglutinin, HIV-1 envelope glycoprotein (HIV Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike viral proteins can accurately predict structural escape patterns using sequence data alone. Our study represents a promising conceptual bridge between natural language and viral evolution.

Hie Brian, Zhong Ellen D, Berger Bonnie, Bryson Bryan