ArXiv Preprint
Type 2 diabetes mellitus (T2DM) is one of the most common diseases and a
leading cause of death. Early diagnosis of T2DM is challenging but
necessary to prevent serious complications. This study proposes a novel
neural network architecture for early T2DM prediction using multi-headed
self-attention and dense layers to extract features from historic diagnoses,
patient vitals, and demographics. The proposed technique, called the
Self-Attention for Comorbid Disease Net (SACDNet), achieves an accuracy of
89.3% and an F1-score of 89.1%, improving on the baseline techniques by 1.6%
in accuracy and 1.3% in F1-score. Monte Carlo (MC)
Dropout is applied to SACDNet to obtain a Bayesian approximation. A T2DM
prediction framework based on the MC Dropout SACDNet is proposed to quantify
the uncertainty associated with the predictions. A T2DM prediction dataset is
also built as part of this study from real-world routine
Electronic Health Record (EHR) data comprising 4,124 diabetic and 181,767
non-diabetic examples, collected from 295 different EHR systems running in
different parts of the United States of America. This dataset is further used
to evaluate 7 different machine learning and 3 deep learning-based models.
Finally, a detailed analysis of the fairness of every technique against
different patient demographic groups is performed to validate the unbiased
generalization of the techniques and the diversity of the data.
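
The idea behind MC Dropout uncertainty estimation can be illustrated with a
minimal sketch. This is not SACDNet: it assumes a toy single-layer logistic
model, and the names (stochastic_forward, mc_dropout_predict), the weights,
the feature values, and the dropout rate are all hypothetical. Dropout is
left active at prediction time, the same input is passed through the model
many times, and the spread of the sampled probabilities serves as an
uncertainty estimate.

```python
import math
import random
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_forward(features, weights, bias, drop_rate, rng):
    """One forward pass of a toy logistic model with dropout left ON."""
    keep = 1.0 - drop_rate
    z = bias
    for x, w in zip(features, weights):
        if rng.random() < keep:      # this input unit survives the pass
            z += (x * w) / keep      # inverted-dropout rescaling
    return sigmoid(z)


def mc_dropout_predict(features, weights, bias,
                       drop_rate=0.5, passes=100, seed=0):
    """Return (mean probability, standard deviation) over stochastic passes."""
    rng = random.Random(seed)
    samples = [
        stochastic_forward(features, weights, bias, drop_rate, rng)
        for _ in range(passes)
    ]
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical feature vector and weights, for illustration only.
features = [0.8, -1.2, 0.5, 2.0]
weights = [1.5, -0.7, 0.3, 0.9]

mean_prob, spread = mc_dropout_predict(features, weights, bias=-1.0)
print(f"mean probability: {mean_prob:.3f}, spread: {spread:.3f}")
```

A larger spread indicates that the prediction changes a lot depending on
which units are dropped, so it would be treated as less certain. With
drop_rate set to 0.0 every pass is identical and the spread collapses to
zero, which recovers an ordinary deterministic prediction.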
Tayyab Nasir, Muhammad Kamran Malik
2023-01-12