General

Interpreting a Recurrent Neural Network's Predictions of ICU Mortality Risk.

In Journal of biomedical informatics ; h5-index 55.0

Deep learning has demonstrated success in many applications; however, its use in healthcare has been limited by a lack of transparency into how models generate predictions. Algorithms such as Recurrent Neural Networks (RNNs), when applied to Electronic Medical Records (EMR), introduce additional barriers to transparency because of the sequential processing of the RNN and the multi-modal nature of EMR data. This work seeks to improve transparency by: 1) introducing Learned Binary Masks (LBM) as a method for identifying which EMR variables contributed to an RNN model's risk of mortality (ROM) predictions for critically ill children; and 2) applying KernelSHAP for the same purpose. Given an individual patient, LBM and KernelSHAP both generate an attribution matrix that shows the contribution of each input feature to the RNN's sequence of predictions for that patient. Attribution matrices can be aggregated in many ways to facilitate different levels of analysis of the RNN model and its predictions. Three methods of aggregation and analysis are presented: 1) over volatile time periods within individual patient predictions, 2) over populations of ICU patients sharing specific diagnoses, and 3) across the general population of critically ill children.
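The aggregation idea described above can be illustrated with a small sketch. It does not implement LBM or KernelSHAP. It assumes attribution matrices (time steps by features) are already available, and it uses mean absolute attribution as the aggregation rule, which is an assumption for illustration only; the abstract does not state the authors' exact rule. Feature names and values are made up.

```python
from statistics import mean

# Hypothetical attribution matrices: rows are time steps, columns are features.
# All names and numbers here are illustrative, not taken from the paper.
FEATURES = ["heart_rate", "lactate", "gcs"]
patient_a = [
    [0.10, -0.20, 0.00],
    [0.30, -0.10, 0.05],
    [0.20, 0.40, -0.05],
]
patient_b = [
    [-0.10, 0.20, 0.10],
    [0.00, 0.60, 0.10],
]


def window_importance(matrix, start, stop):
    """Mean absolute attribution per feature over time steps [start, stop)."""
    rows = matrix[start:stop]
    return [mean(abs(row[j]) for row in rows) for j in range(len(rows[0]))]


def population_importance(matrices):
    """Average per-patient importances so that each patient counts equally."""
    per_patient = [window_importance(m, 0, len(m)) for m in matrices]
    return [mean(column) for column in zip(*per_patient)]


def rank_features(importance, names=FEATURES):
    """Return (feature, importance) pairs, most important first."""
    return sorted(zip(names, importance), key=lambda pair: pair[1], reverse=True)


# Aggregate over a time window for one patient, then over a small cohort.
print(rank_features(window_importance(patient_a, 1, 3)))
print(rank_features(population_importance([patient_a, patient_b])))
```

Restricting `matrices` to patients who share a diagnosis would give a cohort-level view, while passing every patient would give a population-level view.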

Ho Long V, Aczon Melissa, Ledbetter David, Wetzel Randall

2021-Jan-07

Deep Learning, Electronic Medical Records, Feature Attribution, Feature Importance, Model Interpretation, Recurrent Neural Networks

Public Health

Identifying emerging predictors for adolescent electronic nicotine delivery systems use: A machine learning analysis of the Population Assessment of Tobacco and Health Study.

In Preventive medicine ; h5-index 62.0

Intervention strategies to prevent adolescents from using electronic nicotine delivery systems (ENDS) should be based on robust predictors of ENDS use that may differ from predictors of conventional cigarette use. The literature points to the need for uncovering emerging predictors of ENDS use. This study identified emerging predictors of adolescent ENDS use using machine learning (ML) techniques. We analyzed nationally representative multi-wave longitudinal survey data (2013-2018) drawn from the Population Assessment of Tobacco and Health Study. A sample of adolescents (12-17 years) who had never used any tobacco products at baseline and completed Wave 2 (n = 7958), Wave 3 (n = 6260) and Wave 4 (n = 4544) was analyzed. We developed a supervised ML prediction model using penalized logistic regression to assess self-reported past-month ENDS use (i.e., current use) at Waves 2-4 based on the variables measured at the previous wave. We then extracted important predictors from each model. The penalized logistic regression models showed suitable capability to discriminate between ENDS use and non-use at each wave based on the area under the receiver operating characteristic curve and the area under the precision-recall curve. Interestingly, social media use emerged as an important variable in predicting adolescent ENDS use. ML models appear to be a promising method to identify unique population-level predictors for U.S. adolescent ENDS use behaviors. More research is warranted to investigate emerging predictors of ENDS use and experimentally examine the mechanism by which these emerging predictors affect ENDS use behavior across different populations.
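The two discrimination metrics named above can be computed from predicted scores as in the following minimal sketch. It assumes binary labels (1 = current ENDS use) with both classes present, and it uses average precision as one common estimator of the area under the precision-recall curve; the abstract does not say which estimator the authors used. The labels and scores are invented for illustration and would, in practice, come from a fitted model.

```python
def auroc(labels, scores):
    """Area under the ROC curve as the probability that a random positive
    outranks a random negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average of the precision values at the rank of each positive example.
    Tied scores are ordered arbitrarily in this simplified version."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Illustrative data only: 1 = self-reported past-month use at the next wave.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
print(auroc(labels, scores), average_precision(labels, scores))
```

When positives are rare, as is typical for past-month use among baseline never-users, the precision-recall view tends to be the more informative of the two, which may be why the abstract reports both.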

Han Dae-Hee, Lee Shin Hyung, Lee Shieun, Seo Dong-Chul

2021-Jan-07

Adolescence, Digital media use, Electronic nicotine delivery systems, Machine learning

Public Health

Sleep classification from wrist-worn accelerometer data using random forests.

In Scientific reports ; h5-index 158.0

Accurate and low-cost sleep measurement tools are needed in both clinical and epidemiological research. To this end, wearable accelerometers are widely used as they are both low in price and provide reasonably accurate estimates of movement. Techniques to classify sleep from the high-resolution accelerometer data primarily rely on heuristic algorithms. In this paper, we explore the potential of detecting sleep using random forests. Models were trained using data from three different studies where 134 adult participants (70 with a sleep disorder and 64 healthy good sleepers) wore an accelerometer on their wrist during a one-night polysomnography recording in the clinic. The random forests were able to distinguish sleep-wake states with an F1 score of 73.93% on a previously unseen test set of 24 participants. Detecting when the accelerometer is not worn was also successful using machine learning ([Formula: see text]), and, when combined with our sleep detection models on day-time data, provided a sleep estimate that is correlated with self-reported habitual nap behaviour ([Formula: see text]). These random forest models have been made open-source to aid further research. In line with the literature, sleep stage classification turned out to be difficult using only accelerometer data.
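For readers unfamiliar with the F1 score reported above, the sketch below computes it from per-epoch sleep/wake labels. It assumes "sleep" is treated as the positive class and that a single binary F1 is reported; the abstract does not state which class is positive or whether any averaging was applied. The epoch labels are invented and do not come from the study.

```python
def f1_score(truth, predicted, positive="sleep"):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative epochs: reference labels (e.g. from polysomnography) versus
# labels predicted by a classifier from accelerometer features.
truth = ["sleep", "sleep", "sleep", "wake", "wake", "sleep"]
predicted = ["sleep", "sleep", "wake", "wake", "sleep", "sleep"]
print(f1_score(truth, predicted))
```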

Sundararajan Kalaivani, Georgievska Sonja, Te Lindert Bart H W, Gehrman Philip R, Ramautar Jennifer, Mazzotti Diego R, Sabia Séverine, Weedon Michael N, van Someren Eus J W, Ridder Lars, Wang Jian, van Hees Vincent T

2021-Jan-08

General

The fecal mycobiome in patients with Irritable Bowel Syndrome.

In Scientific reports ; h5-index 158.0

Alterations of the gut microbiota have been reported in various gastrointestinal disorders, but knowledge of the mycobiome is limited. We investigated the gut mycobiome of 80 patients with Irritable Bowel Syndrome (IBS) in comparison with 64 control subjects. The fungal-specific internal transcribed spacer 1 (ITS-1) amplicon was sequenced, and mycobiome zero-radius operational taxonomic units (zOTUs) were defined representing known and unknown species and strains. The fungal community was sparse and individual-specific in all (both IBS and control) subjects. Although beta-diversity differed significantly between IBS and controls, no difference was found among clinical subtypes of IBS or in comparison with the mycobiome of subjects with bile acid malabsorption (BAM), a condition which may overlap with IBS with diarrhoea. The mycobiome alterations co-varied significantly with the bacteriome and metabolome but were not linked with dietary habits. As a putative biomarker of IBS, the predictive power of the fecal mycobiome in machine learning models was significantly better than random but insufficient for clinical diagnosis. The mycobiome presents limited therapeutic and diagnostic potential for IBS, despite co-variation with bacterial components which do offer such potential.

Das A, O’Herlihy E, Shanahan F, O’Toole P W, Jeffery I B

2021-Jan-08

Radiology

A 3D-CNN model with CT-based parametric response mapping for classifying COPD subjects.

In Scientific reports ; h5-index 158.0

Chronic obstructive pulmonary disease (COPD) is a respiratory disorder involving abnormalities of lung parenchymal morphology with different severities. COPD is assessed by pulmonary-function tests and computed tomography-based approaches. We introduce a new classification method for COPD grouping based on deep learning and a parametric-response mapping (PRM) method. Using an image registration technique, we extracted two parenchymal functional variables, functional small airway disease percentage (fSAD%) and emphysema percentage (Emph%), which were provided as input parameters to a 3D convolutional neural network (CNN). The integrated 3D-CNN and PRM (3D-cPRM) achieved a classification accuracy of 89.3% and a sensitivity of 88.3% in five-fold cross-validation. The prediction accuracy of the proposed 3D-cPRM exceeded those of the 2D model and traditional 3D CNNs with the same neural network, and was comparable to that of 2D pretrained PRM models. We then applied a gradient-weighted class activation mapping (Grad-CAM) that highlights the key features in the CNN learning process. Most of the class-discriminative regions appeared in the upper and middle lobes of the lung, consistent with the regions of elevated fSAD% and Emph% in COPD subjects. The 3D-cPRM successfully represented the parenchymal abnormalities in COPD and matched the CT-based diagnosis of COPD.
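As a rough illustration of how fSAD% and Emph% can be derived from registered inspiratory and expiratory CT, the sketch below classifies voxel pairs by attenuation thresholds. The thresholds (-950 HU on inspiration, -856 HU on expiration) follow a commonly cited PRM convention and are an assumption here; the abstract does not give the values the authors used. The voxel values are invented, and the registration step itself is not shown.

```python
from collections import Counter

# Assumed thresholds in Hounsfield units (HU); not stated in the abstract.
INSP_THRESHOLD_HU = -950
EXP_THRESHOLD_HU = -856


def classify_voxel(insp_hu, exp_hu):
    """Label one registered voxel pair from its inspiratory/expiratory HU."""
    if exp_hu >= EXP_THRESHOLD_HU:
        # Low inspiratory but normal expiratory density is left unclassified.
        return "normal" if insp_hu >= INSP_THRESHOLD_HU else "unclassified"
    return "emph" if insp_hu < INSP_THRESHOLD_HU else "fsad"


def prm_percentages(voxel_pairs):
    """Percentage of voxels in each PRM class for a list of (insp, exp) pairs."""
    counts = Counter(classify_voxel(i, e) for i, e in voxel_pairs)
    total = len(voxel_pairs)
    return {label: 100 * counts[label] / total
            for label in ("normal", "fsad", "emph", "unclassified")}


# Illustrative registered voxel pairs: (inspiratory HU, expiratory HU).
voxels = [(-900, -800), (-900, -870), (-960, -900), (-920, -860), (-880, -700)]
print(prm_percentages(voxels))
```

In the pipeline described above, maps of these voxel labels (rather than the two summary percentages alone) would presumably be what a 3D CNN consumes, but that detail is not confirmed by the abstract.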

Ho Thao Thi, Kim Taewoo, Kim Woo Jin, Lee Chang Hyun, Chae Kum Ju, Bak So Hyeon, Kwon Sung Ok, Jin Gong Yong, Park Eun-Kee, Choi Sanghun

2021-Jan-08

General

Microbial production of multiple short-chain primary amines via retrobiosynthesis.

In Nature communications ; h5-index 260.0

Bio-based production of many chemicals is not yet possible because their biosynthetic pathways are unknown. Here, we report a strategy combining retrobiosynthesis and a precursor selection step to design biosynthetic pathways for multiple short-chain primary amines (SCPAs) that have a wide range of applications in chemical industries. Using direct precursors of 15 target SCPAs determined by the above strategy, Streptomyces viridifaciens vlmD encoding valine decarboxylase is examined as a proof-of-concept promiscuous enzyme both in vitro and in vivo for generating SCPAs from their precursors. Escherichia coli expressing the heterologous vlmD produces 10 SCPAs when fed their direct precursors. Furthermore, metabolically engineered E. coli strains are developed to produce representative SCPAs from glucose, including one producing 10.67 g/L of iso-butylamine by fed-batch culture. This study presents a strategy for systematically designing biosynthetic pathways for the production of a group of related chemicals, demonstrated here with multiple SCPAs as examples.

Kim Dong In, Chae Tong Un, Kim Hyun Uk, Jang Woo Dae, Lee Sang Yup

2021-Jan-08