General General

LightCUD: a program for diagnosing IBD based on human gut microbiome data.

In BioData mining

BACKGROUND : The diagnosis of inflammatory bowel disease (IBD) and discrimination between the types of IBD are clinically important. IBD is associated with marked changes in the intestinal microbiota. Advances in next-generation sequencing (NGS) technology and improved bioinformatics analysis capabilities in hospitals motivated us to develop a diagnostic method based on the gut microbiome.

RESULTS : Using a set of whole-genome sequencing (WGS) data from 349 human gut microbiota samples with two types of IBD and healthy controls, we assembled and aligned WGS short reads to obtain feature profiles of strains and genera. The genus and strain profiles were used to construct the 16S-based and WGS-based diagnostic modules, respectively. We designed a novel feature selection procedure to select case-specific features. With these features, we built discrimination models using different machine learning algorithms. LightGBM outperformed the other algorithms in this study and was therefore chosen as the core algorithm. Specifically, we identified two small sets of biomarkers (strains), one for the WGS-based healthy vs IBD module and one for the ulcerative colitis vs Crohn's disease module, which contributed to the optimization of model performance during pre-training. We released LightCUD as an IBD diagnostic program built with LightGBM. Its high performance was validated through five-fold cross-validation and on an independent test data set. LightCUD was implemented in Python and packaged free for installation with customized databases. With WGS data or 16S rRNA sequencing data of gut microbiome samples as the input, LightCUD can discriminate IBD from healthy controls with high accuracy and further identify the specific type of IBD. The executable program LightCUD was released as open source with instructions at the webpage . The identified strain biomarkers could be used to study the critical factors for disease development and recommend treatments regarding changes in the gut microbial community.

CONCLUSIONS : As the first released human gut microbiome-based IBD diagnostic tool, LightCUD demonstrates high performance on both WGS and 16S sequencing data. The strains that either separate healthy controls from IBD patients or distinguish the specific type of IBD are expected to be clinically important as biomarkers.
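The five-fold cross-validation mentioned in the results can be illustrated with a minimal sketch of stratified fold assignment. This is not the LightCUD implementation: the function name `stratified_k_fold`, the toy labels, and the class sizes are all hypothetical, and model fitting (LightGBM in the paper) is deliberately left out.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread across k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of one class round-robin into the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx

# Hypothetical toy labels (class sizes are made up for illustration).
labels = ["healthy"] * 10 + ["UC"] * 10 + ["CD"] * 10
splits = list(stratified_k_fold(labels, k=5))
```

Each of the five folds serves once as the held-out set, so every sample is tested exactly once while the class proportions stay roughly constant across folds.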

Xu Congmin, Zhou Man, Xie Zhongjie, Li Mo, Zhu Xi, Zhu Huaiqiu


Biomarker, Human gut microbiome, IBD, Machine learning algorithm

General General

Covid-19 Automated Diagnosis and Risk Assessment through Metabolomics and Machine Learning.

In Analytical chemistry

COVID-19 is still placing a heavy health and financial burden worldwide. Impairment in patient screening and risk management plays a fundamental role in how governments and authorities are directing resources, planning reopening, and implementing sanitary countermeasures, especially in regions where poverty is a major component in the equation. An efficient diagnostic method must be highly accurate, while having a cost-effective profile. We combined a machine learning-based algorithm with mass spectrometry to create an expeditious platform that discriminates COVID-19 in plasma samples within minutes, while also providing tools for risk assessment, to assist healthcare professionals in patient management and decision-making. A cross-sectional study enrolled 815 patients (442 COVID-19, 350 controls and 23 suspected COVID-19) from three Brazilian epicenters from April to July 2020. We were able to elect and identify 19 molecules related to the disease's pathophysiology and several features that discriminate patients' health-related outcomes. The method applied for COVID-19 diagnosis showed specificity >96% and sensitivity >83%, and specificity >80% and sensitivity >85% during risk assessment, both from blinded data. Our method introduced a new approach for COVID-19 screening, providing the indirect detection of infection through metabolites and contextualizing the findings with the disease's pathophysiology. The pairwise analysis of biomarkers brought robustness to the model developed using machine learning algorithms, transforming this screening approach into a tool with great potential for real-world application.

Delafiori Jeany, Navarro Luiz Cláudio, Siciliano Rinaldo Focaccia, de Melo Gisely Cardoso, Busanello Estela Natacha Brandt, Nicolau José Carlos, Sales Geovana Manzan, de Oliveira Arthur Noin, Val Fernando Fonseca Almeida, de Oliveira Diogo Noin, Eguti Adriana, Dos Santos Luiz Augusto, Dalçóquio Talia Falcão, Bertolin Adriadne Justi, Abreu-Netto Rebeca Linhares, Salsoso Rocio, Baía-da-Silva Djane, Marcondes-Braga Fabiana G, Sampaio Vanderson Souza, Judice Carla Cristina, Costa Fabio Trindade Maranhão, Durán Nelson, Perroud Mauricio Wesley, Sabino Ester Cerdeira, Lacerda Marcus Vinicius Guimarães, Reis Leonardo Oliveira, Fávaro Wagner José, Monteiro Wuelton Marcelo, Rocha Anderson Rezende, Catharino Rodrigo Ramos


General General

JAK inhibitors in immune-mediated rheumatic diseases: From a molecular perspective to clinical studies.

In Journal of molecular graphics & modelling

The Janus Kinase signalling pathway is implicated in the pathogenesis of immune-related diseases. The potency of small-molecule Janus Kinase inhibitors in the treatment of inflammatory diseases demonstrates that this pathway can be successfully targeted for therapeutic purposes. The outstanding questions concerning the efficacy and toxicity of these drugs challenge researchers to enhance their selectivity. The promising results of computational techniques, such as Molecular Dynamics and Molecular Docking, coupled with experimental studies, can improve the understanding of the molecular mechanism of the Janus Kinase pathway and thus enable the rational design of new, more selective inhibitor molecules.

Sperti Michela, Malavolta Marta, Ciniero Gloria, Borrelli Simone, Cavaglià Marco, Muscat Stefano, Tuszynski Jack Adam, Afeltra Antonella, Margiotta Domenico Paolo Emanuele, Navarini Luca


Computational drug discovery, Immune-mediated diseases, Inflammatory diseases, JAK Inhibitors drugs, JAK-STAT Pathway, JAK1, JAK2, Molecular docking, Molecular dynamics, Molecular mechanics, Virtual screening

General General

Data-driven experimental design and model development using Gaussian process with active learning.

In Cognitive psychology

Interest in computational modeling of cognition and behavior continues to grow. To be most productive, modelers should be equipped with tools that ensure optimal efficiency in data collection and in the integrity of inference about the phenomenon of interest. Traditionally, models in cognitive science have been parametric, and such models are particularly susceptible to misspecification because their strong assumptions (e.g. parameterization, functional form) may introduce unjustified biases in data collection and inference. To address this issue, we propose a data-driven nonparametric framework for model development, one that also includes optimal experimental design as a goal. It combines Gaussian Processes, a stochastic process often used for regression and classification, with active learning, from machine learning, to iteratively fit the model and use it to optimize the design selection throughout the experiment. The approach, dubbed Gaussian process with active learning (GPAL), is an extension of the parametric, adaptive design optimization (ADO) framework (Cavagnaro, Myung, Pitt, & Kujala, 2010). We demonstrate the application and features of GPAL in a delay discounting task and compare its performance to ADO in two experiments. The results show that GPAL is a viable modeling framework that is noteworthy for its high sensitivity to individual differences, identifying novel patterns in the data that were missed by the model-constrained ADO. This investigation represents a first step towards the development of a data-driven cognitive modeling framework that serves as a middle ground between raw data, which can be difficult to interpret, and parametric models, which rely on strong assumptions.
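The loop of fitting a Gaussian process and then using it to choose the next design can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the GPAL method itself: the abstract does not specify the kernel or the selection criterion, so the RBF kernel, the noise level, the maximum-posterior-variance rule (uncertainty sampling), and the simulated response curve `toy_response` are all hypothetical choices.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar designs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gp_posterior(x_train, y_train, x_query, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at one query point."""
    n = len(x_train)
    gram = [
        [rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    k_star = [rbf(x, x_query) for x in x_train]
    alpha = solve(gram, y_train)   # gram^-1 y
    weights = solve(gram, k_star)  # gram^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_query, x_query) - sum(k * w for k, w in zip(k_star, weights))
    return mean, max(var, 0.0)

def next_design(x_train, y_train, candidates):
    """Uncertainty sampling: choose the candidate with the largest posterior variance."""
    return max(candidates, key=lambda x: gp_posterior(x_train, y_train, x)[1])

def toy_response(delay):
    # Hypothetical ground truth, used only to simulate a participant's responses.
    return 1.0 / (1.0 + 0.5 * delay)

candidates = [float(d) for d in range(11)]  # hypothetical delays
x_obs, y_obs = [0.0], [toy_response(0.0)]
for _ in range(4):
    remaining = [c for c in candidates if c not in x_obs]
    x_new = next_design(x_obs, y_obs, remaining)
    x_obs.append(x_new)
    y_obs.append(toy_response(x_new))
```

Because the model is refit after every observation, each new design is placed where the current fit is least certain, which is the general idea behind combining a nonparametric model with adaptive design selection.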

Chang Jorge, Kim Jiseob, Zhang Byoung-Tak, Pitt Mark A, Myung Jay I


Active learning, Computational cognition, Data-driven cognitive modeling, Delay discounting, Gaussian process, Nonparametric Bayesian methods, Optimal experimental design

Internal Medicine Internal Medicine

Development and validation of a nomogram to predict pulmonary function and the presence of chronic obstructive pulmonary disease in a Korean population.

In BMC pulmonary medicine ; h5-index 38.0

BACKGROUND : Early suspicion followed by assessing lung function with spirometry could decrease the underdiagnosis of chronic obstructive pulmonary disease (COPD) in primary care. We aimed to develop a nomogram to predict the FEV1/FVC ratio and the presence of COPD.

METHODS : We retrospectively reviewed the data of 4241 adult patients who underwent spirometry between 2013 and 2019. By linear regression analysis, variables associated with FEV1/FVC were identified in the training cohort (n = 2969). Using the variables as predictors, a nomogram was created to predict the FEV1/FVC ratio and validated in the test cohort (n = 1272).

RESULTS : Older age (β coefficient [95% CI], -0.153 [-0.183, -0.122]), male sex (-1.904 [-2.749, -1.056]), current or past smoking history (-3.324 [-4.200, -2.453]), and the presence of dyspnea (-2.453 [-3.612, -1.291]) or overweight (0.894 [0.191, 1.598]) were significantly associated with the FEV1/FVC ratio. In the final testing, the developed nomogram showed a mean absolute error of 8.2% between the predicted and actual FEV1/FVC ratios. The overall performance was best when FEV1/FVC < 70% was used as a diagnostic criterion for COPD; the sensitivity, specificity, and balanced accuracy were 82.3%, 68.6%, and 75.5%, respectively.

CONCLUSION : The developed nomogram could be used to identify potential patients at risk of COPD who may need further evaluation, especially in the primary care setting where spirometry is not available.
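The reported figures can be cross-checked with a short calculation. The sketch below assumes the usual definition of balanced accuracy as the mean of sensitivity and specificity, and it assumes the mean absolute error is measured in percentage points of the FEV1/FVC ratio; the abstract does not state either explicitly. The toy ratios are hypothetical.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity (both given in percent)."""
    return (sensitivity + specificity) / 2.0

def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired predictions and measurements."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Values taken from the abstract.
reported = balanced_accuracy(82.3, 68.6)  # about 75.45, reported as 75.5%
total_patients = 2969 + 1272              # training cohort + test cohort
train_fraction = 2969 / total_patients    # close to a 70/30 split

# Hypothetical FEV1/FVC ratios in percent, for illustration only.
predicted = [72.0, 65.0, 80.0]
actual = [70.0, 75.0, 78.0]
toy_mae = mean_absolute_error(predicted, actual)
```

The cohort sizes add up to the 4241 patients stated in the methods, and the balanced accuracy computed from the reported sensitivity and specificity agrees with the published value up to rounding.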

Lee Sang Chul, An Chansik, Yoo Jongha, Park Sungho, Shin Donggyo, Han Chang Hoon


Chronic obstructive pulmonary disease, Machine learning, Primary care, Spirometry

General General

Artificial Intelligence-assisted chest X-ray assessment scheme for COVID-19.

In European radiology ; h5-index 62.0

OBJECTIVES : To study whether a trained convolutional neural network (CNN) can be of assistance to radiologists in differentiating Coronavirus disease (COVID)-positive from COVID-negative patients using chest X-ray (CXR) through an ambispective clinical study. To identify subgroups of patients where artificial intelligence (AI) can be of particular value and analyse what imaging features may have contributed to the performance of AI by means of visualisation techniques.

METHODS : CXR of 487 patients were classified into four categories (normal, classical COVID, indeterminate, and non-COVID) by consensus opinion of 2 radiologists. CXR classified as "normal" or "indeterminate" were then subjected to analysis by AI, and the final categorisation was guided by the prediction of the network. Precision and recall of the radiologist alone and of the radiologist assisted by AI were calculated in comparison to reverse transcriptase-polymerase chain reaction (RT-PCR) as the gold standard. Attention maps of the CNN were analysed to understand which regions of the CXR were important to the AI algorithm in making a prediction.

RESULTS : The precision of radiologists improved from 65.9 to 81.9% and recall improved from 17.5 to 71.75% when assistance with AI was provided. AI showed 92% accuracy in classifying "normal" CXR into COVID or non-COVID. Analysis of attention maps revealed attention on the cardiac shadow in these "normal" radiographs.

CONCLUSION : This study shows how deployment of an AI algorithm can complement a human expert in the determination of COVID status. Analysis of the detected features suggests possible subtle cardiac changes, laying ground for further investigative studies into possible cardiac changes.

KEY POINTS : • Through an ambispective clinical study, we show how assistance with an AI algorithm can improve recall (sensitivity) and precision (positive predictive value) of radiologists in assessing CXR for possible COVID in comparison to RT-PCR. • We show that AI achieves the best results in images classified as "normal" by radiologists. We conjecture that possible subtle cardiac changes in the CXR, imperceptible to the human eye, may have contributed to this prediction. • The reported results may pave the way for a human-computer collaboration whereby the expert with some help from the AI algorithm achieves higher accuracy in predicting COVID status on CXR than previously thought possible when considering either alone.
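The assessment scheme described in the methods, where the AI prediction is consulted only for radiographs that radiologists label "normal" or "indeterminate", can be sketched as follows. The function names and the seven toy cases are hypothetical. Treating only "classical COVID" as a positive call for the radiologist alone is also an assumption, since the abstract does not say how the four categories map to a binary call.

```python
AI_REVIEWED = {"normal", "indeterminate"}

def radiologist_call(category):
    """Assumed binary call without AI: only 'classical COVID' counts as positive."""
    return category == "classical COVID"

def assisted_call(category, ai_positive):
    """Use the AI prediction only for the categories that were sent to the network."""
    if category in AI_REVIEWED:
        return ai_positive
    return radiologist_call(category)

def precision_recall(calls, truth):
    """Precision and recall of boolean calls against boolean RT-PCR results."""
    tp = sum(c and t for c, t in zip(calls, truth))
    fp = sum(c and not t for c, t in zip(calls, truth))
    fn = sum((not c) and t for c, t in zip(calls, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical cases: (radiologist category, AI says positive, RT-PCR positive).
cases = [
    ("classical COVID", True, True),
    ("classical COVID", False, False),
    ("normal", True, True),
    ("normal", False, False),
    ("indeterminate", True, True),
    ("non-COVID", False, False),
    ("indeterminate", False, True),
]
truth = [rt_pcr for _, _, rt_pcr in cases]
alone = precision_recall([radiologist_call(cat) for cat, _, _ in cases], truth)
assisted = precision_recall([assisted_call(cat, ai) for cat, ai, _ in cases], truth)
```

In this toy data the AI recovers true positives hidden among "normal" and "indeterminate" radiographs, which raises recall; the numbers are illustrative and unrelated to the study's results.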

Rangarajan Krithika, Muku Sumanyu, Garg Amit Kumar, Gabra Pavan, Shankar Sujay Halkur, Nischal Neeraj, Soni Kapil Dev, Bhalla Ashu Seith, Mohan Anant, Tiwari Pawan, Bhatnagar Sushma, Bansal Raghav, Kumar Atin, Gamanagati Shivanand, Aggarwal Richa, Baitha Upendra, Biswas Ashutosh, Kumar Arvind, Jorwal Pankaj, Shalimar Shariff, A Wig, Naveet Subramanium, Rajeshwari Trikha, Anjan Malhotra, Rajesh Guleria, Randeep Namboodiri, Vinay Banerjee, Subhashis Arora


Artificial intelligence, COVID, Radiograph