General

Application of machine learning algorithms to identify cryptic reproductive habitats using diverse information sources.

In Oecologia

Information on ecological systems often comes from diverse sources with varied levels of complexity, bias, and uncertainty. Accordingly, analytical techniques continue to evolve that address these challenges to reveal the characteristics of ecological systems and inform conservation actions. We applied multiple statistical learning algorithms (i.e., machine learning) with a range of information sources including fish tracking data, environmental data, and visual surveys to identify potential spawning aggregation sites for a marine fish species, permit (Trachinotus falcatus), in the Florida Keys. Recognizing the potential complementarity and some level of uncertainty in each information source, we applied supervised (classic and conditional random forests; RF) and unsupervised (fuzzy k-means; FKM) algorithms. The two RF models had similar predictive performance, but generated different predictor variable importance structures and spawning site predictions. Unsupervised clustering using FKM identified unique site groupings that were similar to the likely spawning sites identified with RF. The conservation of aggregate spawning fish species depends heavily on the protection of key spawning sites; many of these potential sites were identified here for permit in the Florida Keys, and they consisted of relatively deep-water natural and artificial reefs with high mean permit residency periods. The application of multiple machine learning algorithms enabled the integration of diverse information sources to develop models of an ecological system. Faced with increasingly complex and diverse data sources, ecologists and conservation practitioners should find increasing value in machine learning algorithms, which we discuss here along with resources to increase their accessibility.

Brownscombe Jacob W, Griffin Lucas P, Morley Danielle, Acosta Alejandro, Hunt John, Lowerre-Barbieri Susan K, Adams Aaron J, Danylchuk Andy J, Cooke Steven J


Conservation, Ecology, Machine learning, Marine biology, Spawning aggregations
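
The fuzzy k-means clustering named in the abstract above assigns each site a degree of membership in every cluster instead of a single hard label. The following is a minimal sketch of the standard fuzzy c-means membership formula, not the authors' implementation; the 2D points, the cluster centers, and the fuzzifier `m = 2` are hypothetical values chosen for illustration.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Return the membership of `point` in each cluster.

    Uses the standard fuzzy c-means rule
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the distance from the point to center i.
    """
    dists = [math.dist(point, c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # A point lying exactly on a center belongs fully to it
        # (shared equally if several centers coincide).
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


# Hypothetical site at (1, 0) with two hypothetical cluster centers.
memberships = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

Memberships always sum to 1, so a site that sits between two groupings keeps a visible share in both, which is what makes the method useful when class boundaries are uncertain.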

Radiology

How to read and review papers on machine learning and artificial intelligence in radiology: a survival guide to key methodological concepts.

In European radiology ; h5-index 62.0

In recent years, there has been a dramatic increase in research papers about machine learning (ML) and artificial intelligence in radiology. With so many papers around, it is of paramount importance to make a proper scientific quality assessment as to their validity, reliability, effectiveness, and clinical applicability. Due to methodological complexity, the papers on ML in radiology are often hard to evaluate, requiring a good understanding of key methodological issues. In this review, we aimed to guide the radiology community about key methodological aspects of ML to improve their academic reading and peer-review experience. Key aspects of ML pipeline were presented within four broad categories: study design, data handling, modelling, and reporting. Sixteen key methodological items and related common pitfalls were reviewed with a fresh perspective: database size, robustness of reference standard, information leakage, feature scaling, reliability of features, high dimensionality, perturbations in feature selection, class balance, bias-variance trade-off, hyperparameter tuning, performance metrics, generalisability, clinical utility, comparison with traditional tools, data sharing, and transparent reporting.

Key Points

• Machine learning is new and rather complex for the radiology community.
• Validity, reliability, effectiveness, and clinical applicability of studies on machine learning can be evaluated with a proper understanding of key methodological concepts about study design, data handling, modelling, and reporting.
• Understanding key methodological concepts will provide a better academic reading and peer-review experience for the radiology community.

Kocak Burak, Kus Ece Ates, Kilickesmez Ozgur


Artificial intelligence, Deep learning, Machine learning, Peer-review, Radiology
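
Two of the items listed in the abstract above, information leakage and feature scaling, often interact: if scaling statistics are computed on the full dataset, the test set influences the training features. The sketch below illustrates one common way to avoid this, under the assumption of simple z-score scaling; the function names and the numeric values are hypothetical and do not come from the review.

```python
import statistics


def fit_scaler(train_values):
    """Estimate mean and standard deviation from the training split only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    # Guard against a constant feature, which would give a zero divisor.
    return mean, (std if std > 0 else 1.0)


def transform(values, mean, std):
    """Apply z-score scaling with previously fitted statistics."""
    return [(v - mean) / std for v in values]


# Hypothetical feature values for a training and a test split.
train = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [10.0]

# Correct: statistics come from the training split and are reused on the test split.
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

# Leaky: fitting on train + test lets the test value shift the statistics.
leaky_mean, leaky_std = fit_scaler(train + test)
```

The same rule applies to any data-dependent preprocessing step, such as feature selection: fit on the training data, then apply unchanged to held-out data.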

Surgery

Machine Learning Outcome Prediction in Dilated Cardiomyopathy Using Regional Left Ventricular Multiparametric Strain.

In Annals of biomedical engineering ; h5-index 52.0

The clinical presentation of idiopathic dilated cardiomyopathy (IDCM) heart failure (HF) patients who will respond to medical therapy (responders) and those who will not (non-responders) is often similar. A machine learning (ML)-based clinical tool to identify responders would prevent unnecessary surgery, while targeting non-responders for early intervention. We used regional left ventricular (LV) contractile injury patterns in ML models to identify IDCM HF non-responders. MRI-based multiparametric strain analysis was performed in 178 test subjects (140 normal subjects and 38 IDCM patients), calculating longitudinal, circumferential, and radial strain over 18 LV sub-regions for inclusion in ML analyses. Patients were identified as responders based upon symptomatic and contractile improvement on medical therapy. We tested the predictive accuracy of support vector machines (SVM), logistic regression (LR), random forest (RF), and deep neural networks (DNN). The DNN model outperformed other models, predicting response to medical therapy with an area under the receiver operating characteristic curve (AUC) of 0.94. The top features were longitudinal strain in (1) basal: anterior, posterolateral and (2) mid: posterior, anterolateral, and anteroseptal sub-regions. Regional contractile injury patterns predict response to medical therapy in IDCM HF patients, and have potential application in ML-based HF patient care.

MacGregor Robert M, Guo Aixia, Masood Muhammad F, Cupps Brian P, Ewald Gregory A, Pasque Michael K, Foraker Randi


Deep learning, Heart failure, Machine learning, Magnetic resonance imaging, Myocardial strain, Regional contractile injury
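
The abstract above compares models by the area under the receiver operating characteristic curve (AUC). As a minimal sketch of what that number measures, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores below are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Compute AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = responder) and model scores.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

This pairwise form is quadratic in the number of cases, which is fine for small cohorts; rank-based formulas give the same value more efficiently on large datasets.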

General

Data-Driven Models for Objective Grading Improvement of Parkinson's Disease.

In Annals of biomedical engineering ; h5-index 52.0

Parkinson's disease (PD) is a progressive disorder of the central nervous system that causes motor dysfunctions in affected patients. Objective assessment of symptoms can support neurologists in fine evaluations, improving patients' quality of care. This study aimed to develop data-driven models based on regression algorithms to investigate the potential of kinematic features to predict PD severity levels. Sixty-four patients with PD (PwPD) and 50 healthy control subjects (HC) were asked to perform 13 motor tasks from the MDS-UPDRS III while wearing inertial sensors. Simultaneously, the clinician evaluated the tasks based on the MDS-UPDRS scores. One hundred ninety kinematic features were extracted from the inertial motor data. Data processing and statistical analysis identified a set of parameters able to distinguish between HC and PwPD. Then, multiple feature selection methods were used to select the subset of parameters that gave the greatest accuracy when used as input for several regression algorithms. The maximum correlation coefficient, equal to 0.814, was obtained with the adaptive neuro-fuzzy inference system (ANFIS). Therefore, this predictive model could be useful as a decision support system for a reliable objective assessment of PD severity levels based on motion performance, improving patient monitoring over time.

Butt Abdul Haleem, Rovini Erika, Fujita Hamido, Maremmani Carlo, Cavallo Filippo


ANFIS, Artificial intelligence, Parkinson disease severity, Predictive methods, Regression models
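
The abstract above reports model quality as a correlation coefficient between predicted and clinician-assigned severity. The abstract does not say which coefficient was used; the sketch below assumes the Pearson coefficient, and the example score lists are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical clinician scores and model predictions.
clinician = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(clinician, predicted)
```

A high correlation shows that predictions rise and fall with the clinical scores, but it does not by itself show that the predicted values are close to the true ones, so it is usually read alongside an error measure.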

Pathology

Towards the Probabilistic Analysis of Small Bowel Capsule Endoscopy Features to Predict Severity of Duodenal Histology in Patients with Villous Atrophy.

In Journal of medical systems ; h5-index 48.0

Small bowel capsule endoscopy (SBCE) can be complementary to histological assessment of celiac disease (CD) and serology negative villous atrophy (SNVA). Determining the severity of disease on SBCE using statistical machine learning methods can be useful in the follow-up of patients. SBCE can play an additional role in differentiating between CD and SNVA. De-identified SBCEs of patients with CD and SNVA were included. Probabilistic analysis of features on SBCE was used to predict severity of duodenal histology and to distinguish between CD and SNVA. Patients with higher Marsh scores were more likely to have a positive SBCE and a continuous distribution of macroscopic features of disease than those with lower Marsh scores. The same pattern was also true for patients with CD when compared to patients with SNVA. The validation accuracy when predicting the severity of Marsh scores and when distinguishing between CD and SNVA was 69.1% in both cases. When the proportions of each SBCE class group within the dataset were included in the classification model, to distinguish between the two pathologies, the validation accuracy increased to 75.3%. The findings of this work suggest that by using features of CD and SNVA on SBCE, predictions can be made of the type of pathology and the severity of disease.

Chetcuti Zammit Stefania, Bull Lawrence A, Sanders David S, Galvin Jessica, Dervilis Nikolaos, Sidhu Reena, Worden Keith


Celiac disease, Duodenal histology, Probabilistic analysis, Seronegative villous atrophy, Small bowel capsule endoscopy
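
The abstract above reports that accuracy improved once class proportions were included in the classifier. The abstract does not describe the model itself, so the following is only a generic sketch of one way class proportions can enter a probabilistic classifier: as priors in Bayes' rule. All likelihood and prior values are hypothetical.

```python
def posterior(likelihoods, priors):
    """Combine per-class likelihoods with class priors using Bayes' rule."""
    joint = {c: likelihoods[c] * priors[c] for c in likelihoods}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("all classes have zero joint probability")
    return {c: j / total for c, j in joint.items()}


# Hypothetical likelihood of the observed SBCE features under each class.
likelihoods = {"CD": 0.4, "SNVA": 0.6}

# With equal priors, the class with the larger likelihood wins.
uniform = posterior(likelihoods, {"CD": 0.5, "SNVA": 0.5})

# With priors set to hypothetical class proportions, the decision can change.
weighted = posterior(likelihoods, {"CD": 0.75, "SNVA": 0.25})
```

In this toy case the uniform prior favours SNVA while the proportion-based prior favours CD, which shows how class frequencies alone can move a borderline prediction.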

General

A survey on deep learning in DNA/RNA motif mining.

In Briefings in bioinformatics

DNA/RNA motif mining is the foundation of gene function research. It plays an extremely important role in identifying DNA- or RNA-protein binding sites, which helps to understand the mechanism of gene regulation and management. For the past few decades, researchers have been working on designing new efficient and accurate algorithms for mining motifs. These algorithms can be roughly divided into two categories: the enumeration approach and the probabilistic method. In recent years, machine learning methods have made great progress, and deep learning algorithms in particular have achieved good performance. Existing deep learning methods in motif mining can be roughly divided into three types of models: convolutional neural network (CNN) based models, recurrent neural network (RNN) based models, and hybrid CNN-RNN based models. We introduce the application of deep learning in the field of motif mining in terms of data preprocessing, the features of existing deep learning architectures, and the differences between the basic deep learning models. Through the analysis and comparison of existing deep learning methods, we found that the more complex models tend to perform better than simple ones when data are sufficient, and that the current methods are relatively simple compared with those in other fields such as computer vision, natural language processing (NLP), computer games, etc. Therefore, a summary of deep learning in motif mining is needed to help researchers understand this field.

He Ying, Shen Zhen, Zhang Qinhu, Wang Siguo, Huang De-Shuang


convolutional neural network, deep learning, motif mining, protein binding site, recurrent neural networks
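
The CNN-based models named in the abstract above typically start from a one-hot encoding of the sequence and slide convolution filters along it, so that each filter acts as a motif detector. The sketch below shows that first step only, with a single hand-set filter instead of learned weights; the `TATA` motif and the example sequence are hypothetical illustrations, not taken from the survey.

```python
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == a else 0.0 for a in ALPHABET] for base in seq.upper()]


def scan(seq, kernel):
    """Slide `kernel` along the one-hot sequence and return the score at each offset.

    This is a 1D convolution with stride 1 and no padding; `kernel` is a list
    of rows, each holding one weight per letter of ALPHABET.
    """
    x = one_hot(seq)
    k = len(kernel)
    return [
        sum(kernel[j][c] * x[i + j][c] for j in range(k) for c in range(len(ALPHABET)))
        for i in range(len(x) - k + 1)
    ]


# Hand-set filter that matches the hypothetical motif TATA exactly.
kernel = one_hot("TATA")
scores = scan("GGTATAGG", kernel)
best_offset = scores.index(max(scores))
```

In a trained network the filter weights are learned from data and many filters run in parallel, followed by pooling and further layers; the scoring step itself is the same sliding dot product shown here.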