ArXiv Preprint
Medical datasets often face the problem of data scarcity, as ground-truth
labels must be generated by medical professionals. One mitigation strategy is
to pretrain deep learning models on large, unlabelled datasets with
self-supervised learning (SSL). Data augmentations are essential for improving
the generalizability of SSL-trained models, but they are typically handcrafted
and tuned manually. We use an adversarial model to generate masks as
augmentations for 12-lead electrocardiogram (ECG) data, where masks learn to
occlude diagnostically relevant regions of the ECGs. Compared to random
augmentations, adversarial masking reaches better accuracy when transferring to
two diverse downstream objectives: arrhythmia classification and gender
classification. Compared to a state-of-the-art ECG augmentation method, 3KG,
adversarial masking performs better in data-scarce regimes, demonstrating the
generalizability of our model.
Jessica Y. Bo, Hen-Wei Huang, Alvin Chan, Giovanni Traverso
2022-11-15