In Computers in Biology and Medicine
This study proposes a framework for mining temporal patterns from Electronic Medical Records. A new scoring scheme based on the Wilson interval is introduced to select patterns that are both frequent and predictive, and to accelerate mining by reducing the number of patterns mined. The framework is applied in a case study using data from general practices in the Netherlands to identify children at risk of mental disorders. To develop an accurate model, feature engineering methods such as one-hot encoding and frequency transformation are proposed, and the pattern selection is tailored to this type of clinical data. Six machine learning models are trained on five age groups, with XGBoost achieving the highest AUC values (0.75-0.79), with sensitivity above 0.7 and specificity above 0.6. Models that learn from temporal patterns in addition to non-temporal features show improved performance.
Półchłopek Olga, Koning Nynke R, Büchner Frederike L, Crone Mathilde R, Numans Mattijs E, Hoogendoorn Mark
Electronic medical records, General practice, Mental health classification, Pattern recognition, Temporal pattern mining
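The abstract does not spell out the Wilson-interval scoring scheme, so the following is only a minimal sketch of how such a score could work, not the authors' method. It assumes that a pattern is scored by the lower bound of the Wilson interval on the fraction of patients with the pattern who belong to the positive class; the function names `wilson_interval` and `pattern_score` and the choice of the lower bound are hypothetical. The lower bound rewards patterns that are both predictive (high proportion) and frequent (large support), since small samples widen the interval.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of positive outcomes, n: number of trials,
    z: standard normal quantile (1.96 gives roughly a 95% interval).
    """
    if n == 0:
        # No observations: the proportion is completely undetermined.
        return (0.0, 1.0)
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return (max(0.0, centre - half_width), min(1.0, centre + half_width))


def pattern_score(positives_with_pattern: int, total_with_pattern: int, z: float = 1.96) -> float:
    """Hypothetical score (an assumption, not taken from the paper):
    the Wilson lower bound on the share of positive cases among
    patients whose record contains the pattern."""
    return wilson_interval(positives_with_pattern, total_with_pattern, z)[0]


if __name__ == "__main__":
    # Same observed proportion (0.8), different support.
    print(pattern_score(8, 10))
    print(pattern_score(80, 100))
```

Under this assumption, two patterns with the same observed proportion are ranked by support: the pattern seen in 100 patients scores higher than the one seen in 10, which gives a natural threshold for discarding rare patterns early in mining.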