arXiv preprint
Artificial intelligence (AI) models are increasingly used in the medical
domain. However, as medical data is highly sensitive, special precautions to
ensure the protection of said data are required. The gold standard for privacy
preservation is the introduction of differential privacy (DP) to model
training. However, prior work has shown that DP has negative implications on
model accuracy and fairness. Therefore, the purpose of this study is to
demonstrate that the privacy-preserving training of AI models for chest
radiograph diagnosis is possible with high accuracy and fairness compared to
non-private training. N=193,311 high-quality clinical chest radiographs were
retrospectively collected and manually labeled by experienced radiologists, who
assigned one or more of the following diagnoses to each side (where applicable):
cardiomegaly, congestion, pleural effusion, pneumonic infiltration, and
atelectasis. The non-private AI models were compared with privacy-preserving
(DP) models with respect to privacy-utility trade-offs (measured as area under
the receiver operating characteristic curve (AUROC)) and privacy-fairness
trade-offs (measured as Pearson's r or the statistical parity difference). The
non-private AI model achieved an average AUROC of 0.90 across all labels,
whereas the DP AI model with a privacy budget of epsilon=7.89 achieved an
AUROC of 0.87, i.e., a mere 2.6% performance decrease compared to non-private
training. The privacy-preserving training of diagnostic AI models can achieve
high performance with only a small accuracy penalty and does not amplify
discrimination based on age, sex, or co-morbidity. We thus encourage
practitioners to integrate state-of-the-art privacy-preserving techniques into
medical AI model development.
Soroosh Tayebi Arasteh, Alexander Ziller, Christiane Kuhl, Marcus Makowski, Sven Nebelung, Rickmer Braren, Daniel Rueckert, Daniel Truhn, Georgios Kaissis
2023-02-03
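
The privacy-preserving (DP) training discussed in the abstract is typically realized with DP-SGD: each example's gradient is clipped to a fixed L2 norm, calibrated Gaussian noise is added to the clipped sum, and the privacy budget epsilon is then tracked by an accountant over training. The paper's actual architecture, framework, and hyperparameters are not specified here; the following is a minimal NumPy sketch of the DP-SGD update for a toy logistic-regression model, with hypothetical values for the clipping norm, noise multiplier, and learning rate (epsilon accounting itself is omitted).

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_mult=1.0, rng=None):
    """One DP-SGD step for logistic regression: per-example gradient
    clipping followed by Gaussian noise. All hyperparameter values here
    are illustrative, not the paper's settings."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(y)
    preds = 1.0 / (1.0 + np.exp(-X @ w))            # sigmoid predictions
    # Per-example gradients of the logistic loss, shape (n, d)
    per_ex_grads = (preds - y)[:, None] * X
    # Clip each example's gradient to L2 norm <= clip_norm
    norms = np.linalg.norm(per_ex_grads, axis=1, keepdims=True)
    per_ex_grads *= np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Sum the clipped gradients, add Gaussian noise scaled to the
    # clipping norm, then average over the batch and take a step
    noisy_sum = per_ex_grads.sum(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm, size=w.shape)
    return w - lr * noisy_sum / n

# Toy usage on synthetic data (stand-in for the radiograph labels)
rng = np.random.default_rng(42)
X = rng.normal(size=(64, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)
for _ in range(50):
    w = dp_sgd_step(w, X, y, rng=rng)
```

In a real pipeline the noise multiplier and number of steps would be fed to a privacy accountant to report a budget such as the epsilon=7.89 quoted in the abstract; the clipping bounds per-example influence on the update, which is what makes the noise calibration meaningful.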