
In Multivariate Behavioral Research

Gradient tree boosting is a powerful machine learning technique that has shown good performance in predicting a variety of outcomes. However, when applied to hierarchical (e.g., longitudinal or clustered) data, the predictive performance of gradient tree boosting may be harmed by ignoring the hierarchical structure, and may be improved by accounting for it. Tree-based methods such as regression trees and random forests have already been extended to hierarchical data settings by combining them with the linear mixed effects model (MEM). In the present article, we add to this literature by proposing two algorithms to estimate a combination of the MEM and gradient tree boosting. We report on two simulation studies that (i) investigate the predictive performance of the two MEM boosting algorithms and (ii) compare them to standard gradient tree boosting, standard random forest, and other existing methods for hierarchical data (MEM, MEM random forests, model-based boosting, Bayesian additive regression trees [BART]). We found substantial improvements in the predictive performance of our MEM boosting algorithms over standard boosting when the random effects were non-negligible. MEM boosting as well as BART showed a predictive performance similar to the correctly specified MEM (i.e., the benchmark model), and overall outperformed the model-based boosting and random forest approaches.
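The abstract does not spell out the two proposed algorithms. Purely as an illustration of the general idea of combining a mixed effects model with gradient tree boosting, the sketch below alternates between fitting scikit-learn's GradientBoostingRegressor on outcomes adjusted for the current random-intercept estimates and updating those intercepts by shrinkage, in the spirit of earlier MEM/tree hybrids. The function name `mem_boost`, the random-intercept-only structure, and the variance-component updates are assumptions made for this sketch, not the authors' method.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def mem_boost(X, y, cluster, n_iter=10):
    """Hypothetical EM-style alternation between a boosted fixed-effects
    function and random-intercept estimates (illustrative sketch only,
    not the algorithms proposed in the paper).

    X: (n, p) feature matrix, y: (n,) outcome, cluster: (n,) group labels.
    """
    cluster = np.asarray(cluster)
    groups = np.unique(cluster)
    b = {g: 0.0 for g in groups}            # random intercepts per cluster
    sigma2_e, sigma2_b = 1.0, 1.0           # residual / intercept variances
    booster = GradientBoostingRegressor()

    for _ in range(n_iter):
        # Step 1: fit the boosted fixed part on y adjusted for current random effects
        offset = np.array([b[g] for g in cluster])
        booster.fit(X, y - offset)
        f_hat = booster.predict(X)

        # Step 2: update random intercepts by shrinking cluster-mean residuals
        # (BLUP for a random-intercept-only model)
        resid = y - f_hat
        for g in groups:
            idx = cluster == g
            n_g = idx.sum()
            shrink = sigma2_b / (sigma2_b + sigma2_e / n_g)
            b[g] = shrink * resid[idx].mean()

        # Step 3: crude update of the variance components from current residuals
        offset = np.array([b[g] for g in cluster])
        sigma2_e = np.mean((resid - offset) ** 2)
        sigma2_b = np.mean([b[g] ** 2 for g in groups])

    return booster, b, (sigma2_e, sigma2_b)
```

Under this sketch, a prediction for a new observation from cluster g would be booster.predict(x) plus the estimated intercept b[g], falling back to the boosted prediction alone for clusters not seen during training.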

Marie Salditt, Sarah Humberg, Steffen Nestler

2023-Jan-05

Mixed effects models, atypical observations, gradient boosting, longitudinal data, regression trees