In Psychological Medicine

BACKGROUND : Investigation of personality traits and pathology in large, generalizable clinical cohorts has been hindered by inconsistent assessment and failure to consider a range of personality disorders (PDs) simultaneously.

METHODS : We applied natural language processing (NLP) to electronic health record notes to characterize a psychiatric inpatient cohort. A set of terms reflecting personality trait domains was derived, expanded, and then refined based on expert consensus. Latent Dirichlet allocation was used to score each note, estimating the extent to which it reflected PD topics. Regression models were used to examine the relationship of these estimates with sociodemographic features and length of stay.
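
As a rough illustration of note-level scoring against domain term lists, the sketch below computes, for each trait domain, the fraction of a note's tokens that match that domain's terms. This is a deliberately simpler dictionary-count proxy, not the latent Dirichlet allocation scoring used in the study. The term lists, the `DOMAIN_TERMS` name, the `score_note` function, and the sample note are all hypothetical, since the curated terms are not given here.

```python
import re
from collections import Counter

# Hypothetical term lists for illustration only; the study's
# expert-refined terms are not reproduced in this abstract.
DOMAIN_TERMS = {
    "negative_affectivity": {"anxious", "tearful", "worried"},
    "detachment": {"withdrawn", "isolative", "guarded"},
    "disinhibition": {"impulsive", "reckless", "agitated"},
}


def score_note(text):
    """Return, per domain, the fraction of the note's tokens matching that domain's terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        # An empty note carries no evidence for any domain.
        return {domain: 0.0 for domain in DOMAIN_TERMS}
    counts = Counter(tokens)
    return {
        domain: sum(counts[term] for term in terms) / total
        for domain, terms in DOMAIN_TERMS.items()
    }


# Made-up example note, not drawn from any real record.
note = "Patient withdrawn and guarded on the unit, anxious about discharge."
scores = score_note(note)
for domain, value in sorted(scores.items()):
    print(f"{domain}: {value:.2f}")
```

Per-note scores of this kind could then serve as predictors in a regression on length of stay. A topic model such as the one named above would instead estimate, for each note, a distribution over learned topics rather than counting fixed terms.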

RESULTS : Among 3623 patients with 4702 admissions, being male, non-white, having a low burden of medical comorbidity, being admitted through the emergency department, and having public insurance were independently associated with greater levels of disinhibition, detachment, and psychoticism. Being female, white, and having private insurance were independently associated with greater levels of negative affectivity. Disinhibition, psychoticism, and negative affectivity were each significantly associated with a longer stay, whereas detachment was associated with a shorter stay.

CONCLUSIONS : Personality features can be systematically and scalably measured using NLP in the inpatient setting, and some of these features are associated with length of stay. Developing treatment strategies for patients scoring high in certain personality dimensions may facilitate more efficient, targeted interventions and may help reduce the impact of personality features on mental health service utilization.

Barroilhet Sergio A, Pellegrini Amelia M, McCoy Thomas H, Perlis Roy H


Electronic health record, length of stay, machine learning, natural language processing, personality disorder