*In Journal of General Internal Medicine; h5-index 57.0*

**INTRODUCTION** : Sodium-glucose co-transporter-2 (SGLT2) inhibitors are commonly prescribed to patients with type 2 diabetes mellitus but can increase the risk of diabetic ketoacidosis. Identifying patients prone to diabetic ketoacidosis may help mitigate this risk.

**METHODS** : We conducted a population-based cohort study of adults initiating SGLT2 inhibitor use from 2013 through 2017. The primary objective was to identify potential predictors of diabetic ketoacidosis. Two machine-learning methods were applied to model high-dimensional pre-exposure data: gradient boosted trees and least absolute shrinkage and selection operator (LASSO) regression. We rank-ordered the variables selected by LASSO by the size of their estimated coefficients (largest to smallest). Gradient boosted trees provide a relative importance measure for each variable rather than a coefficient. The "top variables" were identified by reviewing the distributions of the effect estimates from LASSO and of the importance measures from gradient boosted trees to find the point at which variable importance decreased substantially. The identified predictors were then assessed in a logistic regression model and reported as odds ratios (ORs) with 95% confidence intervals (CIs).
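
The ranking and cut-off step described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract says the cut-off was chosen by reviewing the distributions, whereas this sketch substitutes a simple "largest drop between neighbours" rule as an assumption. Ranking by absolute value is also an assumption, and the function names, variable names, and scores are all hypothetical.

```python
def rank_variables(scores):
    """Sort (name, score) pairs by absolute score, largest first."""
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


def top_variables(scores):
    """Keep the variables ranked above the largest drop between neighbours.

    Assumption: the 'substantial decrease in variable importance' is
    approximated by the single largest gap in the ranked scores.
    """
    ranked = rank_variables(scores)
    if len(ranked) < 2:
        return [name for name, _ in ranked]
    # Drop in absolute score from each variable to the next one in the ranking.
    drops = [abs(ranked[i][1]) - abs(ranked[i + 1][1])
             for i in range(len(ranked) - 1)]
    cut = drops.index(max(drops))  # index of the last variable to keep
    return [name for name, _ in ranked[: cut + 1]]


# Hypothetical scores for illustration only; these are not values from the study.
example_scores = {
    "prior_dka": 0.90,
    "dementia_medication": 0.80,
    "digoxin": 0.75,
    "statin": 0.10,
    "age": 0.05,
}
selected = top_variables(example_scores)
```

The same function could be applied separately to LASSO coefficients and to gradient boosted tree importance values, since both reduce to one score per variable.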

**RESULTS** : We identified 111,442 adults who started SGLT2 inhibitor use. The mean age was 57 years, 44% were female, the mean hemoglobin A1C was 8.7%, and the mean creatinine was 0.89 mg/dL. During a mean follow-up of 180 days, 192 patients (0.2%, i.e., 2 per 1000) were diagnosed and hospitalized with diabetic ketoacidosis (DKA), and 475 (0.4%, i.e., 4 per 1000) were diagnosed in either an inpatient or outpatient setting. Using gradient boosted trees, the strongest predictors were prior DKA, baseline hemoglobin A1C level, baseline creatinine level, use of medications for dementia, and baseline bicarbonate level. Using LASSO regression, which excluded laboratory test results because of missing data, the strongest predictors were prior DKA, digoxin use, use of medications for dementia, and recent hypoglycemia. The logistic regression model incorporating the variables identified from gradient boosted trees and LASSO regression suggested that the following pre-exposure characteristics had the strongest association with a hospitalization for DKA: use of dementia medications (OR = 7.76, 95% CI 2.60, 23.1), prior intracranial hemorrhage (OR = 11.5, 95% CI 1.46, 91.1), a prior diagnosis of hypoglycemia (OR = 5.41, 95% CI 1.92, 15.3), prior DKA (OR = 2.45, 95% CI 0.33, 18.0), digoxin use (OR = 4.00, 95% CI 1.21, 13.2), a baseline hemoglobin A1C above 10% (OR = 3.14, 95% CI 1.95, 5.06), and baseline bicarbonate below 18 mmol/L (OR = 5.09, 95% CI 1.58, 16.4).
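
As a reading aid, the reported event rates and one of the odds ratios can be checked with a few lines of arithmetic. The sketch below uses only counts and intervals quoted above. The helper names are hypothetical, and it assumes the intervals are Wald-type (symmetric on the log-odds scale), which the abstract does not state.

```python
import math


def per_1000(events, population):
    """Crude cumulative incidence expressed per 1000 patients."""
    return 1000 * events / population


def ci_geometric_midpoint(lower, upper):
    """Geometric mean of the CI bounds.

    Assumption: if the CI is symmetric on the log scale (Wald-type),
    this midpoint should be close to the reported OR.
    """
    return math.sqrt(lower * upper)


def log_or_standard_error(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a 95% CI, under the same assumption."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Counts quoted in the abstract.
hospitalized_rate = per_1000(192, 111_442)   # roughly 1.7 per 1000
any_setting_rate = per_1000(475, 111_442)    # roughly 4.3 per 1000

# Dementia medications: reported OR = 7.76, 95% CI 2.60, 23.1.
dementia_midpoint = ci_geometric_midpoint(2.60, 23.1)
dementia_se = log_or_standard_error(2.60, 23.1)
```

Under that assumption, the midpoint for dementia medications lands close to the reported OR of 7.76, and the unrounded rates are consistent with the "2 per 1000" and "4 per 1000" figures.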

**CONCLUSION** : Diabetic ketoacidosis affected approximately 2 per 1000 patients starting an SGLT2 inhibitor. We identified both anticipated risk factors (e.g., low baseline serum bicarbonate) and unanticipated risk factors (e.g., use of digoxin or dementia medications) for SGLT2 inhibitor-induced DKA.

*Fralick Michael, Redelmeier Donald A, Patorno Elisabetta, Franklin Jessica M, Razak Fahad, Gomes Tara, Schneeweiss Sebastian*

*2021-Feb-09*