Patient representation learning refers to learning a dense mathematical
representation of a patient that encodes meaningful information from Electronic
Health Records (EHRs). This is generally performed using advanced deep learning
methods. This study presents a systematic review of this field and provides
both qualitative and quantitative analyses from a methodological perspective.
We identified studies that develop patient representations from EHRs with deep
learning methods in MEDLINE, EMBASE, Scopus, the Association for Computing
Machinery (ACM) Digital Library, and the Institute of Electrical and Electronics
Engineers (IEEE) Xplore Digital Library. After screening 362 articles, we
included 48 papers for comprehensive data collection. We observed a typical
workflow that starts with raw data as input, applies deep learning models, and
ends with clinical outcome predictions that serve as evaluations of the learned
representations. Specifically, learning representations from structured EHR
data was dominant (36 out of 48 studies). Recurrent Neural Networks were widely
applied as the deep learning architecture (LSTM: 13 studies, GRU: 11 studies).
Disease prediction was the most common application and evaluation (30 studies).
Benchmark datasets were mostly unavailable (28 studies) because of privacy
concerns about EHR data, and code was available in 20 studies. We show the
importance and feasibility of learning comprehensive representations of patient
EHR data through a systematic review. Advances in patient representation
learning techniques will be essential for powering patient-level EHR analyses.
Future work will continue to leverage the richness and potential of
available EHR data. Knowledge distillation and advanced learning techniques
will be exploited to improve the ability to learn patient representations.

Yuqi Si, Jingcheng Du, Zhao Li, Xiaoqian Jiang, Timothy Miller, Fei Wang, W. Jim Zheng, Kirk Roberts