ArXiv Preprint
Identifying suicidality, including suicidal ideation, attempts, and risk
factors, in the clinical notes of electronic health records is difficult. A
major obstacle is the lack of training samples, given the small number of true
positive instances among the increasingly large number of patients being
screened. This paper describes a novel methodology that identifies suicidality
in clinical notes by addressing this data sparsity issue through zero-shot
learning. U.S. Veterans Affairs clinical notes served as data. The training
dataset labels were determined from diagnostic codes for suicide attempt and
self-harm. A base string associated with the target label of suicidality
provided auxiliary information by narrowing the positive training cases
to those containing the base string. A deep neural network was trained by
mapping the contents of the training documents to a semantic space. For comparison, we
trained another deep neural network using the identical training dataset labels
and bag-of-words features. The zero-shot learning model outperformed the
baseline model in terms of AUC, sensitivity, specificity, and positive
predictive value at multiple probability thresholds. At a 0.90
probability threshold, the methodology identified notes that documented
suicidality but were not associated with a relevant ICD-10-CM code, with 94
percent accuracy.
This new method can effectively identify suicidality without requiring manual
annotation.
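
The labeling and evaluation steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the base string "suicid", the code prefixes, the `Note` type, and the function names are all hypothetical, and the sketch assumes that coded notes lacking the base string are simply dropped from training, which the abstract does not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical values: the abstract names neither the base string nor the code list.
BASE_STRING = "suicid"
TARGET_CODE_PREFIXES = ("T14.91", "X78")  # illustrative ICD-10-CM prefixes only


@dataclass
class Note:
    text: str
    codes: Tuple[str, ...]  # diagnostic codes linked to the note


def has_target_code(note: Note) -> bool:
    """True if any linked code starts with one of the target prefixes."""
    return any(code.startswith(TARGET_CODE_PREFIXES) for code in note.codes)


def build_training_set(notes: List[Note]) -> List[Tuple[str, int]]:
    """Assign labels from codes, keeping positives only if they contain the base string."""
    examples = []
    for note in notes:
        if has_target_code(note):
            if BASE_STRING in note.text.lower():
                examples.append((note.text, 1))
            # Assumption: coded notes without the base string are excluded.
        else:
            examples.append((note.text, 0))
    return examples


def threshold_metrics(probs: List[float], labels: List[int],
                      threshold: float) -> Dict[str, float]:
    """Sensitivity, specificity, and PPV at a single probability threshold."""
    tp = fp = tn = fn = 0
    for prob, label in zip(probs, labels):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
    }
```

Calling `threshold_metrics` once per threshold (for example 0.50, 0.75, and 0.90) would give the kind of multi-threshold comparison the abstract reports, for whichever classifier produced the probabilities.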
Terri Elizabeth Workman, Joseph L. Goulet, Cynthia A. Brandt, Allison R. Warren, Jacob Eleazer, Melissa Skanderson, Luke Lindemann, John R. Blosnich, John O'Leary, Qing Zeng-Treitler
2023-01-09