Surgery

AI-doscopist: a real-time deep-learning-based algorithm for localising polyps in colonoscopy videos with edge computing devices.

In NPJ digital medicine

We have designed a deep-learning model, an "Artificial Intelligent Endoscopist (a.k.a. AI-doscopist)", to localise colonic neoplasia during colonoscopy. This study aims to evaluate the agreement between endoscopists and AI-doscopist for colorectal neoplasm localisation. AI-doscopist was pre-trained on 1.2 million non-medical images and fine-tuned on 291,090 colonoscopy and non-medical images. The colonoscopy images were obtained from six databases, in which the images were classified into 13 categories and the polyps' locations were marked image-by-image with the smallest bounding boxes. Seven categories of non-medical images, which were believed to share some common features with colorectal polyps, were downloaded from an online search engine. Written informed consent was obtained from 144 patients who underwent colonoscopy, and their full colonoscopy videos were prospectively recorded for evaluation. A total of 128 suspicious lesions were resected or biopsied for histological confirmation. When evaluated image-by-image on the 144 full colonoscopies, the specificity of AI-doscopist was 93.3%. AI-doscopist was able to localise 124 out of 128 polyps (polyp-based sensitivity = 96.9%). Furthermore, after reviewing the suspected regions highlighted by AI-doscopist in a 102-patient cohort, an endoscopist had high confidence in recognizing four missed polyps in three patients who were not diagnosed with any lesion during their original colonoscopies. In summary, AI-doscopist can localise 96.9% of the polyps resected by the endoscopists. If AI-doscopist were used in real time, it could potentially help endoscopists detect one more patient with a polyp in every 20-33 colonoscopies.

Poon Carmen C Y, Jiang Yuqi, Zhang Ruikai, Lo Winnie W Y, Cheung Maggie S H, Yu Ruoxi, Zheng Yali, Wong John C T, Liu Qing, Wong Sunny H, Mak Tony W C, Lau James Y W


Cancer, Translational research
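
As a worked check of the headline figure in the abstract above, the polyp-based sensitivity is simply 124 localised polyps out of 128 resected or biopsied lesions. The sketch below reproduces that ratio and, as an addition that the abstract does not report, attaches a Wilson score interval to indicate how much uncertainty a sample of 128 lesions carries. The function name and the choice of interval are illustrative assumptions, not the authors' method.

```python
import math

def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# Counts taken from the abstract: 124 of 128 polyps localised.
detected, resected = 124, 128
sensitivity = detected / resected
low, high = wilson_interval(detected, resected)

print(f"polyp-based sensitivity = {sensitivity:.1%}")
print(f"approx. 95% Wilson interval = {low:.1%} to {high:.1%}")
```

The interval is only a rough guide: it treats each polyp as an independent trial, which may not hold when several polyps come from the same patient.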

General

Cognitive plausibility in voice-based AI health counselors.

In NPJ digital medicine

Voice-based personal assistants using artificial intelligence (AI) have been widely adopted and used in home-based settings. Their success has created considerable interest in their use in healthcare applications; one area of prolific growth in AI is that of voice-based virtual counselors for mental health and well-being. However, despite their promise, building realistic virtual counselors that achieve higher-order maturity levels beyond task-based interactions presents considerable conceptual and pragmatic challenges. We describe one such conceptual challenge: cognitive plausibility, defined as the ability of virtual counselors to emulate the human cognitive system by simulating how a skill or function is accomplished. An important cognitive plausibility consideration for voice-based agents is their ability to engage in meaningful and seamless interactive communication. Drawing on a broad interdisciplinary research literature and on our experiences with developing two voice-based (voice-only) prototypes that are in the early phases of testing, we articulate two conceptual considerations for their design and use: conceptualizing voice-based virtual counselors as communicative agents, and establishing virtual co-presence. We discuss why these conceptual considerations are important and how they can lead to the development of voice-based counselors for real-world use.

Kannampallil Thomas, Smyth Joshua M, Jones Steve, Payne Philip R O, Ma Jun


Health services, Translational research

General

A Machine Learning Approach to Identification of Unhealthy Drinking.

In Journal of the American Board of Family Medicine : JABFM

INTRODUCTION : Unhealthy drinking is prevalent in the United States, and yet it is underidentified and undertreated. Identifying unhealthy drinkers can be time-consuming and uncomfortable for primary care providers. An automated rule for identification would focus attention on patients most likely to need care and, therefore, increase efficiency and effectiveness. The objective of this study was to build a clinical prediction tool for unhealthy drinking based on routinely available demographic and laboratory data.

METHODS : We obtained 38 demographic and laboratory variables from the National Health and Nutrition Examination Survey (1999 to 2016) on 43,545 nationally representative adults who had information on alcohol use available as a reference standard. Logistic regression, support vector machines, k-nearest neighbor, neural networks, decision trees, and random forests were used to build clinical prediction models. The model with the largest area under the receiver operating characteristic curve was selected to build the prediction tool.
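
The selection step described in the methods can be illustrated with a small, dependency-free sketch: compute the area under the curve (AUC) for each candidate model on held-out data and keep the model with the largest value. The AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one). The labels, scores, and model names are toy values invented for illustration, not NHANES data or the authors' code.

```python
def auc(labels, scores):
    """AUC as P(positive outscores negative); tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy held-out labels and hypothetical model outputs.
y_test = [0, 0, 1, 1, 0, 1]
model_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8, 0.2, 0.7],
    "model_b": [0.5, 0.6, 0.4, 0.9, 0.3, 0.2],
}

# Score every candidate, then keep the one with the largest AUC.
aucs = {name: auc(y_test, s) for name, s in model_scores.items()}
best_model = max(aucs, key=aucs.get)
print(aucs, best_model)
```

The pairwise comparison is quadratic in the number of examples, which is fine for a sketch; a library routine based on sorting would be the practical choice for a survey-sized dataset.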

RESULTS : A random forest model with 15 variables produced the largest area under the receiver operating characteristic curve (0.78) in the test set. The most influential predictors were age, current smoker, hemoglobin, sex, and high-density lipoprotein. The optimum operating point had a sensitivity of 0.50, specificity of 0.86, positive predictive value of 0.55, and negative predictive value of 0.83. Applying the tool reduced the target sample by 75%.

CONCLUSION : Using commonly available data, a decision tool can identify a subset of patients who seem to warrant clinical attention for unhealthy drinking, potentially increasing the efficiency and reach of screening.

Bonnell Levi N, Littenberg Benjamin, Wshah Safwan R, Rose Gail L

Alcohol Drinking, Alcoholism, Area Under Curve, Clinical Decision Rules, Decision Trees, Logistic Models, Machine Learning, Neural Networks (Computer), Nutrition Surveys, Support Vector Machine
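
The operating-point figures in the results above are linked through Bayes' rule: given sensitivity, specificity, and the prevalence of unhealthy drinking, the predictive values and the share of patients flagged follow directly. The sketch below shows that relationship. The prevalence of 0.255 is an assumption back-calculated here from the reported sensitivity, specificity, and positive predictive value; the abstract does not state it.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV, fraction flagged positive) for a binary screening rule."""
    tp = sensitivity * prevalence              # true positives per patient screened
    fn = (1 - sensitivity) * prevalence        # false negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    flagged = tp + fp
    return ppv, npv, flagged

# Sensitivity and specificity are from the abstract; prevalence is an assumption.
assumed_prevalence = 0.255
ppv, npv, flagged = predictive_values(0.50, 0.86, assumed_prevalence)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}, flagged = {flagged:.0%}")
```

Under this assumed prevalence, the rule flags roughly a quarter of patients, which is broadly in line with the reported reduction of the target sample.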

General

QTG-Finder2: A Generalized Machine-Learning Algorithm for Prioritizing QTL Causal Genes in Plants.

In G3 (Bethesda, Md.)

Linkage mapping has been widely used to identify quantitative trait loci (QTL) in many plants and usually requires a time-consuming and labor-intensive fine mapping process to find the causal gene underlying the QTL. Previously, we described QTG-Finder, a machine-learning algorithm to rationally prioritize candidate causal genes in QTLs. While it showed good performance, QTG-Finder could only be used in Arabidopsis and rice because of the limited number of known causal genes in other species. Here we tested the feasibility of enabling QTG-Finder to work on species that have few or no known causal genes by using orthologs of known causal genes as the training set. The model trained with orthologs could recall about 64% of Arabidopsis and 83% of rice causal genes when the top 20% ranked genes were considered, which is similar to the performance of models trained with known causal genes. The average precision was 0.027 for Arabidopsis and 0.029 for rice. We further extended the algorithm to include polymorphisms in conserved non-coding sequences and gene presence/absence variation as additional features. Using this algorithm, QTG-Finder2, we trained and cross-validated Sorghum bicolor and Setaria viridis models. The S. bicolor model was validated by causal genes curated from the literature and could recall 70% of causal genes when the top 20% ranked genes were considered. In addition, we applied the S. viridis model and public transcriptome data to prioritize a plant height QTL and identified 13 candidate genes. QTG-Finder2 can accelerate the discovery of causal genes in any plant species and facilitate agricultural trait improvement.

Lin Fan, Lazarus Elena Z, Rhee Seung Y


Setaria viridis, Sorghum bicolor, causal genes, machine learning, quantitative trait loci
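
The "recall when the top 20% ranked genes were considered" metric quoted in the abstract above can be sketched as follows: rank candidate genes by model score, keep the top fifth, and count what share of the known causal genes falls inside that set. This is a minimal illustration that ranks a single pooled list; the published evaluation may rank genes within each QTL interval instead. The gene names, scores, and function name are hypothetical.

```python
import math

def recall_at_top_fraction(scores, causal, fraction=0.2):
    """Share of known causal genes found among the top-ranked fraction of genes."""
    if not causal:
        raise ValueError("need at least one known causal gene")
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
    cutoff = math.ceil(fraction * len(ranked))             # size of the top set
    top = set(ranked[:cutoff])
    return len(top & set(causal)) / len(causal)

# Toy example: ten candidate genes with descending scores, two known causal genes.
toy_scores = {f"gene{i}": 1.0 - 0.1 * i for i in range(10)}
toy_causal = {"gene0", "gene5"}
recall = recall_at_top_fraction(toy_scores, toy_causal)
print(recall)
```

In the toy data only one of the two causal genes ranks in the top two positions, so half of the causal genes are recalled.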

General

Predictors of remission from body dysmorphic disorder after internet-delivered cognitive behavior therapy: a machine learning approach.

In BMC psychiatry

BACKGROUND : Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models.

METHODS : This study used a random forests machine learning approach to test whether it is possible to reliably predict remission from BDD in a sample of 88 individuals who had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses.

RESULTS : Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower in subsequent follow-ups (68, 66 and 61% correctly classified at 3-, 12- and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD.

CONCLUSIONS : The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD.


Flygare Oskar, Enander Jesper, Andersson Erik, Ljótsson Brjánn, Ivanov Volen Z, Mataix-Cols David, Rück Christian


Body dysmorphic disorder, Cognitive behaviour therapy, Internet, Machine learning, Predictor

Radiology

Longitudinal functional and imaging outcome measures in FKRP limb-girdle muscular dystrophy.

In BMC neurology

BACKGROUND : Pathogenic variants in the FKRP gene cause impaired glycosylation of α-dystroglycan in muscle, producing a limb-girdle muscular dystrophy with cardiomyopathy. Despite advances in understanding the pathophysiology of FKRP-associated myopathies, clinical research in the limb-girdle muscular dystrophies has been limited by the lack of normative biomarker data to gauge disease progression.

METHODS : Participants in a phase 2 clinical trial were followed over a 4-month, untreated lead-in period to evaluate repeatability and to obtain normative data for timed function tests, strength tests, pulmonary function, and body composition using DEXA and whole-body MRI. Novel deep learning algorithms were used to analyze MRI scans and quantify muscle, fat, and intramuscular fat infiltration in the thighs. T-tests and signed rank tests were used to assess changes in these outcome measures.

RESULTS : Nineteen participants were observed during the lead-in period for this trial. No significant changes were noted in the strength, pulmonary function, or body composition outcome measures over the 4-month observation period. One timed function measure, the 4-stair climb, showed a statistically significant difference over the observation period. Quantitative estimates of muscle, fat, and intramuscular fat infiltration from whole-body MRI corresponded significantly with DEXA estimates of body composition, strength, and timed function measures.

CONCLUSIONS : We describe normative data and repeatability performance for multiple physical function measures in an adult FKRP muscular dystrophy population. Our analysis indicates that deep learning algorithms can be used to quantify healthy and dystrophic muscle seen on whole-body imaging.

TRIAL REGISTRATION : This study was retrospectively registered at ClinicalTrials.gov (NCT02841267) on July 22, 2016, and data supporting this study have been submitted to this registry.

Leung Doris G, Bocchieri Alex E, Ahlawat Shivani, Jacobs Michael A, Parekh Vishwa S, Braverman Vladimir, Summerton Katherine, Mansour Jennifer, Bibat Genila, Morris Carl, Marraffino Shannon, Wagner Kathryn R


Biomarkers, Convolutional neural network, Deep learning, FKRP, Limb-girdle muscular dystrophy, Tissue signatures, Whole-body MRI