Receive a weekly summary and discussion of the week's top papers, curated by leading researchers in the field.

General

Proximal Methods for Plant Stress Detection Using Optical Sensors and Machine Learning.

In Biosensors

Plant stresses have been monitored using the imaging or spectrometry of plant leaves in the visible (red-green-blue or RGB), near-infrared (NIR), infrared (IR), and ultraviolet (UV) wavebands, often augmented by fluorescence imaging or fluorescence spectrometry. Imaging at multiple specific wavelengths (multi-spectral imaging) or across a wide range of wavelengths (hyperspectral imaging) can provide exceptional information on plant stress and subsequent diseases. Digital cameras, thermal cameras, and optical filters have become available at a low cost in recent years, while hyperspectral cameras have become increasingly more compact and portable. Furthermore, smartphone cameras have dramatically improved in quality, making them a viable option for rapid, on-site stress detection. Due to these developments in imaging technology, plant stresses can be monitored more easily using handheld and field-deployable methods. Recent advances in machine learning algorithms have allowed for images and spectra to be analyzed and classified in a fully automated and reproducible manner, without the need for complicated image or spectrum analysis methods. This review will highlight recent advances in portable (including smartphone-based) detection methods for biotic and abiotic stresses, discuss data processing and machine learning techniques that can produce results for stress identification and classification, and suggest future directions towards the successful translation of these methods into practical use.

Zubler Alanna V, Yoon Jeong-Yeol


RGB imaging, abiotic stress, artificial neural network (ANN), fluorescence, hyperspectral imaging, machine learning, plant disease, smartphone imaging, support vector machine (SVM), thermography
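The classifiers named in the keywords (SVMs, ANNs) operate on spectra or image features in exactly the automated fashion the abstract describes. As a hedged illustration only — synthetic reflectance spectra and scikit-learn, not code from the review — a minimal spectral stress classifier might look like:

```python
# Minimal sketch: classifying leaf reflectance spectra as "healthy" vs
# "stressed" with a support vector machine. All data here are simulated;
# band positions and effect sizes are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_bands = 50                          # e.g. 50 wavelengths from a spectrometer
healthy = rng.normal(0.4, 0.05, size=(100, n_bands))
stressed = rng.normal(0.4, 0.05, size=(100, n_bands))
stressed[:, 30:40] += 0.1             # simulated stress signature in a band range

X = np.vstack([healthy, stressed])
y = np.array([0] * 100 + [1] * 100)   # 0 = healthy, 1 = stressed
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Standardize each band, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

A real application would replace the simulated spectra with calibrated measurements and validate across cultivars, growth stages, and lighting conditions.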

General

Interpretability and Explainability: A Machine Learning Zoo Mini-tour

ArXiv Preprint

In this review, we examine the problem of designing interpretable and explainable machine learning models. Interpretability and explainability lie at the core of many machine learning and statistical applications in medicine, economics, law, and the natural sciences. Although interpretability and explainability have escaped a clear universal definition, many techniques motivated by these properties have been developed over the past 30 years, with the focus currently shifting towards deep learning methods. In this review, we emphasise the divide between interpretability and explainability and illustrate these two different research directions with concrete examples of the state of the art. The review is intended for a general machine learning audience interested in exploring the problems of interpretation and explanation beyond logistic regression or random forest variable importance. This work is not an exhaustive literature survey, but rather a primer focusing selectively on certain lines of research which the authors found interesting or informative.

Ričards Marcinkevičs, Julia E. Vogt


General

Tackling the challenges of bioimage analysis.

In eLife

Using multiple human annotators and ensembles of trained networks can improve the performance of deep-learning methods in research.

Pelt Daniël M


bioimage informatics, computational biology, deep learning, fluorescence microscopy, mouse, neuroscience, objectivity, reproducibility, systems biology, validity, zebrafish
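The ensemble idea the article highlights — combining several independently trained networks — can be sketched in a few lines. This toy example (synthetic data, with small scikit-learn MLPs standing in for deep networks) averages the soft outputs of five models trained from different random initializations:

```python
# Toy sketch of network ensembling: average the predicted class
# probabilities of several models that differ only in random init.
# Data and model sizes are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Train an ensemble of small networks from different random seeds.
probs = []
for seed in range(5):
    net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=seed)
    net.fit(X_tr, y_tr)
    probs.append(net.predict_proba(X_te))

# Average the soft outputs, then take the argmax as the ensemble vote.
ensemble_pred = np.mean(probs, axis=0).argmax(axis=1)
ensemble_acc = (ensemble_pred == y_te).mean()
```

Averaging probabilities (rather than hard votes) lets confident members outweigh uncertain ones, which is one reason ensembles tend to be more robust than any single network.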

General

Practical Workflow from High-Throughput Genotyping to Genomic Estimated Breeding Values (GEBVs).

In Methods in Molecular Biology (Clifton, N.J.)

The global climate is changing, resulting in significant economic losses to agriculture worldwide. It is thus necessary to speed up the plant selection process, especially for complex traits such as biotic and abiotic stresses. Nowadays, genomic selection (GS) is paving new ways to boost plant breeding, facilitating the rapid selection of superior genotypes based on the genomic estimated breeding value (GEBV). GEBVs consider all markers positioned throughout the genome, including those with minor effects. Indeed, although the effect of each marker may be very small, a large number of genome-wide markers retrieved by high-throughput genotyping (HTG) systems (mainly genotyping-by-sequencing, GBS) have the potential to explain all the genetic variance for a particular trait under selection. Although several workflows for GBS and GS data have been described, it is still hard for researchers without a bioinformatics background to carry out these analyses. This chapter outlines some of the recently available bioinformatics resources that enable researchers to establish GBS applications for GS analysis in their laboratories. Moreover, we provide useful scripts that can be used for this purpose and describe key factors that need to be considered in these approaches.

Contaldi Felice, Cappetta Elisa, Esposito Salvatore


Genomic estimated breeding values (GEBVs), Machine learning, Next-generation breeding, Single-nucleotide polymorphisms (SNPs), Stacks, rrBLUP
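The chapter's own scripts are not reproduced here, but the core of GEBV estimation — ridge regression of phenotypes on genome-wide SNP markers, the model family behind the rrBLUP package listed in the keywords (its `mixed.solve` function) — can be sketched on simulated data. Population size, marker count, effect sizes, and the ridge penalty below are all illustrative assumptions:

```python
# Sketch of GEBV estimation via ridge regression on SNP dosages.
# Simulated genotypes/phenotypes; not the chapter's workflow.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_ind, n_snp = 300, 500
X = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)  # SNP dosages 0/1/2
beta = rng.normal(0.0, 0.05, size=n_snp)    # many markers, each a small effect
tbv = X @ beta                               # true breeding values
y = tbv + rng.normal(0.0, 0.7 * tbv.std(), size=n_ind)  # noisy phenotypes

# Ridge regression shrinks all marker effects towards zero at once,
# which is how GEBVs capture many loci of minor effect.
X_tr, X_te, _, tbv_te, y_tr, _ = train_test_split(X, tbv, y, random_state=0)
model = Ridge(alpha=100.0).fit(X_tr, y_tr)
gebv = model.predict(X_te)                   # GEBVs for selection candidates
r = np.corrcoef(gebv, tbv_te)[0, 1]          # predictive ability
```

In a breeding program, candidates would then be ranked by `gebv` and the top fraction advanced — selection without phenotyping every individual.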

Pathology

[Technical, operational, and regulatory considerations for the adoption of digital and computational pathology].

In Der Pathologe

BACKGROUND: Innovative information technologies open new possibilities for diagnostics and promise to improve patient care. However, the integration of data- and computing-intensive procedures into diagnostic workflows also poses risks and considerable challenges for pathologists.

OBJECTIVES: Considering technical, operational, and regulatory aspects, we present a holistic and systematic approach for the adoption of digital and computational pathology.

MATERIAL AND METHODS: We discuss challenges for the implementation of computational diagnostic procedures and analyze regulatory frameworks for risk-based assessment and monitoring of software as an in vitro diagnostic device. Applying regulatory science, we develop an approach to streamline the adoption of digital workflows in pathology.

RESULTS: Data- and computing-intensive workflows in digital pathology are complex and underscore the need for computational and regulatory science as a central part of pathological diagnostics. To promote the adoption of computational diagnostics, we have founded an interdisciplinary initiative (the Alliance) that focuses on regulatory research in the field of digital pathology and works closely with a number of expert and interest groups on the precompetitive development of standards for computational workflows.

DISCUSSION: The inclusion of different stakeholder groups and the coordination of technical, operational, and regulatory aspects are necessary to maintain the balance between progress and safety in diagnostics and to make innovations quickly and safely available for patient care.

Herrmann Markus D, Lennerz Jochen K


Artificial intelligence, In vitro diagnostics, Machine learning, Quality assurance, Regulatory science

General

Borderline personality disorder classification based on brain network measures during emotion regulation.

In European Archives of Psychiatry and Clinical Neuroscience

Borderline Personality Disorder (BPD) is characterized by increased emotional sensitivity and a dysfunctional capacity to regulate emotions. While amygdala and prefrontal cortex interactions are regarded as the critical neural mechanisms underlying these problems, the empirical evidence for this is inconsistent. In the current study, we aimed to systematically test different properties of brain connectivity and evaluate their predictive power to detect borderline personality disorder. Patients with borderline personality disorder (n = 51), patients with cluster C personality disorder (n = 26), and non-patient controls (n = 44) performed an fMRI emotion regulation task. Brain network analyses focused on two properties of task-related connectivity: phasic refers to task-event-dependent changes in connectivity, while tonic was defined as task-stable background connectivity. Three different network measures were estimated (strength, local efficiency, and participation coefficient) and entered as separate models in a nested cross-validated linear support vector machine classification analysis. Classification of borderline personality disorder vs. non-patient controls showed a balanced accuracy of 55%, which was not significant under a permutation null-model, p = 0.23. Exploratory analyses did indicate that the tonic strength model performed best (balanced accuracy 62%), with the amygdala among the most important features. Despite this being one of the largest datasets in the field of BPD fMRI research, the sample size may have been limited for this type of classification analysis. The results and analytic procedures nevertheless provide starting points for future research on network measures of tonic connectivity, potentially in subgroups of BPD.

Cremers Henk, van Zutphen Linda, Duken Sascha, Domes Gregor, Sprenger Andreas, Waldorp Lourens, Arntz Arnoud


Borderline personality disorder, Classification, Machine learning, Network measures, Networks analysis, Phasic vs. tonic brain connectivity
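The evaluation strategy described in the abstract — cross-validated balanced accuracy of a linear SVM tested against a permutation null-model — maps directly onto standard tooling. A hedged sketch on synthetic data (matching only the group sizes from the abstract; the features and effect size are invented):

```python
# Sketch of permutation-tested SVM classification, as in the abstract's
# design. Feature count and group difference are illustrative assumptions.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import StratifiedKFold, permutation_test_score

rng = np.random.default_rng(2)
n_features = 90                       # e.g. one network measure per brain region
patients = rng.normal(0.3, 1.0, size=(51, n_features))   # simulated BPD group
controls = rng.normal(0.0, 1.0, size=(44, n_features))   # simulated controls
X = np.vstack([patients, controls])
y = np.array([1] * 51 + [0] * 44)

# Balanced accuracy under stratified cross-validation, compared against
# a null distribution built by refitting on label-permuted data.
score, perm_scores, p_value = permutation_test_score(
    LinearSVC(dual=False), X, y,
    scoring="balanced_accuracy",
    cv=StratifiedKFold(5, shuffle=True, random_state=0),
    n_permutations=100, random_state=0,
)
```

`permutation_test_score` reruns the full cross-validation loop on every label permutation, which is what makes the resulting p-value a valid test of the whole pipeline rather than of a single fitted model.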