Public Health

Evaluation of the health impacts of the 1990 Clean Air Act Amendments using causal inference and machine learning.

In Journal of the American Statistical Association

We develop a causal inference approach to estimate the number of adverse health events that were prevented due to changes in exposure to multiple pollutants attributable to a large-scale air quality intervention/regulation, with a focus on the 1990 Clean Air Act Amendments (CAAA). We introduce a causal estimand called the Total Events Avoided (TEA) by the regulation, defined as the difference in the number of health events expected under the no-regulation pollution exposures and the number observed with-regulation. We propose matching and machine learning methods that leverage population-level pollution and health data to estimate the TEA. Our approach improves upon traditional methods for regulation health impact analyses by formalizing causal identifying assumptions, utilizing population-level data, minimizing parametric assumptions, and collectively analyzing multiple pollutants. To reduce model-dependence, our approach estimates cumulative health impacts in the subset of regions with projected no-regulation features lying within the support of the observed with-regulation data, thereby providing a conservative but data-driven assessment to complement traditional parametric approaches. We analyze the health impacts of the CAAA in the US Medicare population in the year 2000, and our estimates suggest that large numbers of cardiovascular and dementia-related hospitalizations were avoided due to CAAA-attributable changes in pollution exposure.
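In symbols, the TEA definition above can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $i$ indexes the analysed units (for example, regions), $Y_i^{\text{obs}}$ is the number of health events observed with the regulation in place, and $Y_i(\text{no-reg})$ is the number that would occur under the projected no-regulation exposures.

```latex
\[
\mathrm{TEA} \;=\; \sum_{i} \Bigl( \mathbb{E}\bigl[\,Y_i(\text{no-reg})\,\bigr] \;-\; Y_i^{\text{obs}} \Bigr)
\]
```

Under this reading, a positive TEA means that more events would be expected without the regulation than were observed with it.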

Nethery Rachel C, Mealli Fabrizia, Sacks Jason D, Dominici Francesca


1990 Clean Air Act Amendments, Bayesian Additive Regression Trees, Counterfactual Pollution Exposures, Matching

Radiology

Modeling the progression of COVID-19 deaths using Kalman Filter and AutoML.

In Soft computing

The COVID-19 pandemic continues to have a destructive effect on the health and well-being of the global population. A vital step in the battle against it is the successful screening of infected patients, and one effective screening method is radiological examination using chest radiography. Recognizing epidemic growth patterns across temporal and social factors can improve our ability to design epidemic transmission models, including the critical task of predicting the eventual morbidity or mortality impact of the outbreak. The study's primary motivation is to estimate, with a certain level of accuracy, the number of deaths due to COVID-19 and thereby model the progression of the pandemic. Predicting the number of possible deaths from COVID-19 can provide governments and decision-makers with indicators for purchasing respirators and for setting pandemic prevention policies. Thus, this work presents itself as an essential contribution to combating the pandemic. The Kalman Filter is a widely used method for tracking, navigation, filtering, and time series analysis. Designing and tuning machine learning methods is a labor- and time-intensive task that requires extensive experience. The field of automated machine learning (AutoML) aims to automate this task. AutoML tools enable novice users to create useful machine learning models, while experts can use them to free up valuable time for other tasks. This paper presents an objective method of forecasting the COVID-19 outbreak using a Kalman Filter and AutoML. We use a COVID-19 dataset from Ceará, one of the 27 federative units of Brazil. Ceará has more than 235,222 confirmed cases of COVID-19 and 8850 deaths due to the disease. The TPOT AutoML model showed the best result, with an R² score of 0.99.
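To make the filtering idea concrete, here is a minimal sketch of a one-dimensional Kalman filter in plain Python. It is not the authors' implementation: the abstract does not describe their state model, so the random-walk state, the variance parameters, the function name `kalman_filter_1d`, and the sample counts are all assumptions chosen for illustration.

```python
def kalman_filter_1d(observations, process_var=1.0, obs_var=4.0,
                     init_mean=0.0, init_var=1e6):
    """Filter a noisy series with a scalar random-walk Kalman filter.

    Assumed model (illustrative only):
        state_t = state_{t-1} + process noise  (variance process_var)
        obs_t   = state_t     + sensor noise   (variance obs_var)
    Returns the filtered state estimate after each observation.
    """
    mean, var = init_mean, init_var
    estimates = []
    for z in observations:
        # Predict: a random walk keeps the mean, but uncertainty grows.
        var = var + process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + obs_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append(mean)
    return estimates


# Made-up daily counts, for illustration only (not data from the paper).
noisy_counts = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0, 20.0, 24.0]
smoothed = kalman_filter_1d(noisy_counts)
print([round(v, 2) for v in smoothed])
```

A large `init_var` tells the filter to trust the first observation almost completely; raising `obs_var` relative to `process_var` produces a smoother, slower-reacting estimate.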

Han Tao, Gois Francisco Nauber Bernardo, Oliveira Ramsés, Prates Luan Rocha, Porto Magda Moura de Almeida


AutoML, COVID-19, Forecast, Kalman Filter

General

Pupillary Responses for Cognitive Load Measurement to Classify Difficulty Levels in an Educational Video Game: Empirical Study.

In JMIR serious games

BACKGROUND : A learning task recurrently perceived as easy (or hard) may cause poor learning results. Gamer data such as errors, attempts, or time to finish a challenge are widely used to estimate the perceived difficulty level. In other contexts, pupillometry is widely used to measure cognitive load (mental effort); hence, this may describe the perceived task difficulty.

OBJECTIVE : This study aims to assess the use of task-evoked pupillary responses to measure cognitive load as a way of describing the difficulty levels in a video game. In addition, it proposes an image filter to better estimate baseline pupil size and to reduce the screen luminescence effect.

METHODS : We conducted an experiment that compares the baseline estimated from our filter against that estimated from common approaches. Then, a classifier with different pupil features was used to classify the difficulty of a data set containing information from students playing a video game for practicing math fractions.

RESULTS : We observed that the proposed filter better estimates a baseline. Mauchly's test of sphericity indicated that the assumption of sphericity had been violated (χ²(14)=0.05; P=.001); therefore, a Greenhouse-Geisser correction was used (ε=0.47). There was a significant difference in mean pupil diameter change (MPDC) estimated from different baseline images with the scramble filter (F(5,78)=30.965; P<.001). Moreover, according to the Wilcoxon signed rank test, pupillary response features that better describe the difficulty level were MPDC (z=-2.15; P=.03) and peak dilation (z=-3.58; P<.001). A random forest classifier for easy and hard levels of difficulty showed an accuracy of 75% when the gamer data were used, but the accuracy increased to 87.5% when pupillary measurements were included.

CONCLUSIONS : The screen luminescence effect on pupil size is reduced with a scrambled filter on the background video game image. Finally, pupillary response data can improve classifier accuracy for the perceived difficulty of levels in educational video games.

Mitre-Hernandez Hugo, Covarrubias Carrillo Roberto, Lara-Alvarez Carlos


educational technology, machine learning, metacognitive monitoring, pupil, video games

General

Prediction of throwing distance in the men's and women's javelin final at the 2017 IAAF world championships.

In Journal of sports sciences ; h5-index 52.0

The purpose of this study was to use regularised regression models to identify the most important biomechanical predictors of throwing distance in elite male (M) and female (F) javelin throwers at the 2017 IAAF world championships. Biomechanical data from 13 male and 12 female javelin throwers who competed at the 2017 IAAF world championships were obtained from an official scientific IAAF report. Regularised regression models were used to investigate the associations between throwing distance and release parameters, whole-body kinematic and joint-level kinematic data. The regularised regression models identified two biomechanical predictors of throwing distances in both M and F javelin throwers: release velocity and knee flexion angle of the support leg at the moment of javelin release. In addition, the length of the delivery stride was an important predictor of throwing distance in M throwers, whereas the javelin's attitude angle and the distance between the whole-body centre of mass and the centre of mass of the back foot at the beginning of the delivery phase were important predictors of throwing distance in F throwers.
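As a rough illustration of how a regularised regression can single out a few predictors, the sketch below implements LASSO by coordinate descent in plain Python. It is not the authors' code: the objective (1/(2n))·||y − Xw||² + alpha·||w||₁, the assumption of centred features with no intercept, the function names, and the toy data are all chosen here for illustration.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values within [-t, t] become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n))*||y - Xw||^2 + alpha*||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Toy data (made up): y depends only on the first feature.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
print(lasso_coordinate_descent(X, y, alpha=0.5))
```

The L1 penalty shrinks the coefficient of the informative feature and sets the coefficient of the irrelevant one exactly to zero, which is why this family of models is convenient for ranking predictors when there are few observations.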

Krzyszkowski John, Kipp Kristof


Biomechanics, LASSO, machine learning, regularised regression, sports

Oncology

Investigating the potential of deep learning for patient-specific quality assurance of salivary gland contours using EORTC-1219-DAHANCA-29 clinical trial data.

In Acta oncologica (Stockholm, Sweden)

INTRODUCTION : Manual quality assurance (QA) of radiotherapy contours for clinical trials is time and labor intensive and subject to inter-observer variability. Therefore, we investigated whether deep-learning (DL) can provide an automated solution to salivary gland contour QA.

MATERIAL AND METHODS : DL-models were trained to generate contours for parotid (PG) and submandibular glands (SMG). Sørensen-Dice coefficient (SDC) and Hausdorff distance (HD) were used to assess agreement between DL and clinical contours and thresholds were defined to highlight cases as potentially sub-optimal. 3 types of deliberate errors (expansion, contraction and displacement) were gradually applied to a test set, to confirm that SDC and HD were suitable QA metrics. DL-based QA was performed on 62 patients from the EORTC-1219-DAHANCA-29 trial. All highlighted contours were visually inspected.

RESULTS : Increasing the magnitude of all 3 types of errors resulted in progressively severe deterioration/increase in average SDC/HD. 19/124 clinical PG contours were highlighted as potentially sub-optimal, of which 5 (26%) were actually deemed clinically sub-optimal. 2/19 non-highlighted contours were false negatives (11%). 15/69 clinical SMG contours were highlighted, with 7 (47%) deemed clinically sub-optimal and 2/15 non-highlighted contours were false negatives (13%). For most incorrectly highlighted contours causes for low agreement could be identified.

CONCLUSION : Automated DL-based contour QA is feasible but some visual inspection remains essential. The substantial number of false positives was caused by sub-optimal performance of the DL-model. Improvements to the model will increase the extent of automation and reliability, facilitating the adoption of DL-based contour QA in clinical trials and routine practice.
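The two agreement metrics named in the methods, the Sørensen-Dice coefficient and the Hausdorff distance, can be sketched in plain Python as below. This is an illustrative simplification, not the study's pipeline: contours are represented here as sets of pixel coordinates (an assumption), the function names are hypothetical, and the example masks are made up.

```python
import math


def dice_coefficient(a, b):
    """Sørensen-Dice overlap of two point sets: 2|A∩B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty masks agree perfectly by convention
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    a, b = list(a), list(b)
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Made-up 2x2 masks, the second shifted one pixel along x.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
shifted = {(1, 0), (1, 1), (2, 0), (2, 1)}
print(dice_coefficient(reference, shifted))    # 0.5
print(hausdorff_distance(reference, shifted))  # 1.0
```

A displacement error lowers the Dice score and raises the Hausdorff distance at the same time, which matches the direction of change the results describe for deliberately perturbed contours.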

Nijhuis Hanne, van Rooij Ward, Gregoire Vincent, Overgaard Jens, Slotman Berend J, Verbakel Wilko F, Dahele Max


Clinical trial, Deep learning, Quality assurance, Radiotherapy, Salivary glands, Segmentation

General

Using machine learning method to identify MYLK as a novel marker to predict biochemical recurrence in prostate cancer.

In Biomarkers in medicine

Aim: This study aims to identify a novel marker to predict biochemical recurrence (BCR) in prostate cancer patients after radical prostatectomy with a negative surgical margin. Materials & methods: The Cancer Genome Atlas database, Gene Expression Omnibus database and Cancer Cell Line Encyclopedia database were employed. The ensemble support vector machine-recursive feature elimination method was used to select crucial genes for BCR. Results: We identified MYLK as a novel and independent biomarker for BCR in The Cancer Genome Atlas training cohort and confirmed it in four independent Gene Expression Omnibus validation cohorts. Multi-omic analysis suggested that MYLK was a DNA methylation-driven gene. Additionally, MYLK had significant positive correlations with immune infiltrations. Conclusion: MYLK was identified and validated as a novel, robust and independent biomarker for BCR in prostate cancer.

Qiao Peng, Zhang Di, Zeng Song, Wang Yicun, Wang Biao, Hu Xiaopeng


biochemical recurrence, bioinformatics analysis, biomarker, prostate cancer