In Cancer Epidemiology, Biomarkers & Prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology

Background: Machine learning (ML) approaches facilitate risk prediction model development using high-dimensional predictors and higher-order interactions, at the cost of model interpretability and transparency. We compared the relative predictive performance of statistical and ML models to guide modeling strategy selection for surveillance mammography outcomes in women with a personal history of breast cancer (PHBC).

Methods: We cross-validated 7 risk prediction models for two surveillance outcomes: failure (breast cancer within 12 months of a negative surveillance mammogram) and benefit (surveillance-detected breast cancer). We included 9447 mammograms (495 failures, 1414 benefits, and 7538 non-events) from 1996-2017, using 1:4 matched case-control samples of women with PHBC in the Breast Cancer Surveillance Consortium. We assessed the performance of conventional regression, regularized regression (LASSO and elastic-net), and ML methods (random forests and gradient boosting machines) by evaluating their calibration and, among well-calibrated models, comparing the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CI).

Results: LASSO and elastic-net consistently provided well-calibrated predicted risks for surveillance failure and benefit. The AUCs of LASSO and elastic-net were both 0.63 (95% CI 0.60-0.66) for surveillance failure and 0.66 (95% CI 0.64-0.68) for surveillance benefit, the highest among well-calibrated models.

Conclusions: For predicting breast cancer surveillance mammography outcomes, regularized regression outperformed other modeling approaches and balanced the trade-off between model flexibility and interpretability.

Impact: Regularized regression may be preferred for developing risk prediction models in other contexts with rare outcomes, similar training sample sizes, and low-dimensional features.
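
The two evaluation criteria named in the Methods, calibration and AUC, can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say which calibration measure was used, so the expected-to-observed ratio below is an assumption, the function names are hypothetical, and the toy data are invented purely for illustration. It also ignores the reweighting that matched case-control sampling would normally require.

```python
from typing import Sequence


def auc(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the Mann-Whitney statistic.

    This equals the probability that a randomly chosen event has a higher
    predicted risk than a randomly chosen non-event (ties count as 0.5).
    """
    events = [r for y, r in zip(y_true, risk) if y == 1]
    non_events = [r for y, r in zip(y_true, risk) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def expected_observed_ratio(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Calibration-in-the-large (assumed measure): predicted events / observed events.

    A value near 1 suggests the predicted risks are, on average, neither
    too high nor too low.
    """
    observed = sum(y_true)
    if observed == 0:
        raise ValueError("need at least one observed event")
    return sum(risk) / observed


# Hypothetical toy data: 1 = event, 0 = non-event, with made-up predicted risks.
y = [0, 0, 1, 0, 1, 0, 0, 1]
p = [0.1, 0.2, 0.6, 0.3, 0.4, 0.4, 0.1, 0.8]

print(f"AUC: {auc(y, p):.3f}")
print(f"Expected/observed ratio: {expected_observed_ratio(y, p):.3f}")
```

In a workflow like the one described, these quantities would be computed on held-out cross-validation folds for each candidate model, and AUCs would be compared only among models whose calibration is acceptable.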

Su Yu-Ru, Buist Diana SM, Lee Janie M, Ichikawa Laura, Miglioretti Diana L, Aiello Bowles Erin J, Wernli Karen J, Kerlikowske Karla, Tosteson Anna, Lowry Kathryn P, Henderson Louise M, Sprague Brian L, Hubbard Rebecca A

2023-Jan-25