Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

In IEEE journal of biomedical and health informatics

Over 34 million people in the US have diabetes, a major cause of blindness, renal failure, and amputations. Machine learning (ML) models can predict high-risk patients to help prevent adverse outcomes. Selecting the 'best' prediction model for a given disease, population, and clinical application is challenging due to the hundreds of health-related ML models in the literature and the increasing availability of ML methodologies. To support this decision process, we developed the Selection of Machine-learning Algorithms with ReplicaTions (SMART) Framework that integrates building and selecting ML models with decision theory. We build ML models and estimate performance for multiple plausible future populations with a replicated nested cross-validation technique. We rank ML models by simulating decision-maker priorities, using a range of accuracy measures (e.g., AUC) and robustness metrics from decision theory (e.g., minimax Regret). We present the SMART Framework through a case study on the microvascular complications of diabetes using data from the ACCORD clinical trial. We compare selections made by risk-averse, -neutral, and -seeking decision-makers, finding agreement in 80% of the risk-averse and risk-neutral selections, with the risk-averse selections showing consistency for a given complication. We also found that the models that best predicted outcomes in the validation set were those with low performance variance on the testing set, indicating a risk-averse approach in model selection is ideal when there is a potential for high population feature variability. The SMART Framework is a powerful, interactive tool that incorporates various ML algorithms and stakeholder preferences, generalizable to new data and technological advancements.

Swan Breanna Patrice, Mayorga Maria E, Ivy Julie