In International Journal of Medical Informatics; h5-index 49.0
OBJECTIVE : We aimed to identify machine learning (ML) models for type 2 diabetes (T2DM) prediction in community settings and determine their predictive performance.
METHOD : A systematic review of ML predictive modelling studies published since 2009 was conducted across 13 databases. Primary outcomes included metrics of discrimination, calibration, and classification. Secondary outcomes included important variables, level of validation, and intended use of models. Meta-analysis of c-indices, subgroup analyses, meta-regression, publication bias assessments, and sensitivity analyses were conducted.
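As a rough illustration of how the meta-analysis of c-indices mentioned above might be carried out, the sketch below pools study-level c-indices with a DerSimonian-Laird random-effects model on the logit scale. The abstract does not state which pooling model or transformation was used, so both are assumptions; the function name `pool_c_indices` and the input values in the usage note are hypothetical and are not data from the review.

```python
import math


def pool_c_indices(c_values, se_values):
    """Pool c-indices with a DerSimonian-Laird random-effects model.

    Assumption: pooling is done on the logit scale, with standard errors
    converted by the delta method. The review may have used another approach.
    """
    # Logit-transform each c-index; delta method: se_logit = se / (c * (1 - c)).
    y = [math.log(c / (1 - c)) for c in c_values]
    v = [(se / (c * (1 - c))) ** 2 for c, se in zip(c_values, se_values)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    k = len(y)
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / scale) if scale > 0 else 0.0

    # Random-effects weights, pooled estimate, and its standard error.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return {
        "pooled": inv_logit(y_re),
        "ci_low": inv_logit(y_re - 1.96 * se_re),
        "ci_high": inv_logit(y_re + 1.96 * se_re),
        "tau2": tau2,
        "i2": i2,
    }
```

A call such as `pool_c_indices([0.75, 0.82, 0.88], [0.02, 0.03, 0.015])`, with made-up inputs, returns the pooled c-index, an approximate 95% confidence interval, and the heterogeneity statistics tau² and I². High I² values would motivate the kind of subgroup analyses and meta-regression the review describes.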
RESULTS : Twenty-three studies (40 prediction models) were included. Three, 14, and 6 studies were at high, moderate, and low risk of bias, respectively. All studies conducted internal validation, whereas none conducted external validation of their models. Twenty studies provided classification metrics to varying extents, whereas only 7 studies performed model calibration. Eighteen studies reported information on both the variables used for model development and the feature importance. Twelve studies highlighted the potential applicability of their models for T2DM screening. Meta-analysis produced a good pooled c-index (0.812). Sources of heterogeneity were identified through subgroup analyses and meta-regression. Issues pertaining to methodological quality and reporting were observed.
CONCLUSIONS : We found evidence of good performance of ML models for T2DM prediction in the community. Improvements to methodology, reporting, and validation are needed before these models can be used at scale.
Silva Kushan De, Lee Wai Kit, Forbes Andrew, Demmer Ryan T, Barton Christopher, Enticott Joanne
Diabetes mellitus, Type 2; Diagnosis; Machine learning; Meta-analysis; Prognosis