ArXiv Preprint
Synthetic data generation is a promising solution to address privacy issues
with the distribution of sensitive health data. Recently, diffusion models have
set new standards for generative models across different data modalities. Also
very recently, structured state space models have emerged as a powerful modeling
paradigm to capture long-term dependencies in time series. We put forward
SSSD-ECG, a combination of these two technologies, for the generation of
synthetic 12-lead electrocardiograms conditioned on more than 70 ECG
statements. Due to a lack of reliable baselines, we also propose conditional
variants of two state-of-the-art unconditional generative models. We thoroughly
evaluate the quality of the generated samples by applying pretrained
classifiers to the generated data and by measuring the performance of a
classifier trained only on synthetic data, where SSSD-ECG clearly outperforms
its GAN-based competitors. We demonstrate the soundness of our approach through
further experiments, including conditional class interpolation and a clinical
Turing test that attests to the high quality of the SSSD-ECG samples across a
wide range of conditions.
Juan Miguel Lopez Alcaraz, Nils Strodthoff
2023-01-19