BACKGROUND : The National COVID-19 Chest Imaging Database (NCCID) is a centralized database containing mainly chest X-rays and computed tomography scans from patients across the UK. The objective of the initiative is to support a better understanding of the disease caused by the SARS-CoV-2 coronavirus (COVID-19) and the development of machine learning technologies that will improve care for patients hospitalized with a severe COVID-19 infection. This article introduces the training dataset, including a snapshot analysis covering the completeness of clinical data and the availability of image data for the various use cases (diagnosis, prognosis, longitudinal risk). An additional cohort analysis measures how well the NCCID represents the wider COVID-19-affected UK population in terms of geographic, demographic, and temporal coverage.
FINDINGS : The NCCID offers high-quality DICOM images acquired on a variety of imaging equipment; multiple time points, including historical images, are available for a subset of patients. This volume and variety make the database well suited to the development of diagnostic and prognostic models for COVID-associated respiratory conditions. Historical images and clinical data may aid long-term risk stratification, particularly as the availability of comorbidity data increases through linkage to other resources. The cohort analysis revealed good alignment with general UK COVID-19 statistics for some categories, e.g., sex, whilst identifying areas where data collection methods could be improved, particularly geographic coverage.
CONCLUSION : The NCCID is a growing resource that provides researchers with a large, high-quality database that can be leveraged both to support the response to the COVID-19 pandemic and as a test bed for building clinically viable medical imaging models.
Cushnan Dominic, Bennett Oscar, Berka Rosalind, Bertolli Ottavia, Chopra Ashwin, Dorgham Samie, Favaro Alberto, Ganepola Tara, Halling-Brown Mark, Imreh Gergely, Jacob Joseph, Jefferson Emily, Lemarchand François, Schofield Daniel, Wyatt Jeremy C
COVID-19, SARS-CoV-2, machine learning, medical imaging, thoracic imaging