
In American Journal of Epidemiology; h5-index 65.0

We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve the performance of automated healthcare claims-based algorithms for identifying anaphylaxis events. We used data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in two integrated healthcare institutions in the Northwest United States. One site's manually reviewed gold standard outcomes data were used for model development and the other's for external validation, with performance assessed by cross-validated (cv) area under the receiver operating characteristic curve (cv-AUC), positive predictive value (PPV), and sensitivity. In the development site, 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis, compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cv-AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cv-AUC to 0.62 (95% CI: 0.58, 0.66); incorporating NLP-derived covariates further increased cv-AUC to 0.70 (95% CI: 0.66, 0.75) in development data and 0.67 (95% CI: 0.63, 0.71) in external validation data. A classification threshold with cv-PPV of 79% and cv-sensitivity of 66% in development data had cv-PPV of 78% and cv-sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.
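For readers less familiar with the metrics reported above, the following is a minimal sketch of how PPV, sensitivity, and AUC can be computed from model scores and adjudicated labels. It is not the authors' code: the function names and the toy scores and labels are hypothetical, invented purely for illustration, and the sketch computes plain (not cross-validated) estimates without confidence intervals.

```python
from typing import Sequence, Tuple


def ppv_and_sensitivity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (PPV, sensitivity) when scores >= threshold are classified as events."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_event = score >= threshold
        if predicted_event and label == 1:
            tp += 1  # flagged and confirmed by adjudication
        elif predicted_event and label == 0:
            fp += 1  # flagged but not confirmed
        elif not predicted_event and label == 1:
            fn += 1  # confirmed event that the threshold missed
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random event outscores a random non-event."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): model scores and adjudicated labels.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]

toy_ppv, toy_sens = ppv_and_sensitivity(toy_scores, toy_labels, threshold=0.5)
toy_auc = auc(toy_scores, toy_labels)
print(f"PPV={toy_ppv:.2f}, sensitivity={toy_sens:.2f}, AUC={toy_auc:.3f}")
```

Raising the threshold in a sketch like this typically trades sensitivity for PPV, which is the kind of trade-off the reported threshold (PPV near 79%, sensitivity 56% to 66%) reflects.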

Carrell David S, Gruber Susan, Floyd James S, Bann Maralyssa A, Cushing-Haugen Kara L, Johnson Ron L, Graham Vina, Cronkite David J, Hazlehurst Brian L, Felcher Andrew H, Bejan Cosmin A, Kennedy Adee, Shinde Mayura, Karami Sara, Ma Yong, Stojanovic Danijela, Zhao Yueqin, Ball Robert, Nelson Jennifer

2022-Nov-04

Anaphylaxis, Electronic Health Records, Health Outcome Identification, Machine Learning, Supervised, Postmarketing Product Surveillance, Predictive Modeling