With the increasing size of datasets used in medical imaging research, the need for automated data curation is growing. One important data curation task is the structured organization of a dataset to preserve its integrity and ensure its reusability. We therefore investigated whether this data organization step can be automated. To this end, we designed a convolutional neural network (CNN) that automatically recognizes eight different brain magnetic resonance imaging (MRI) scan types based on their visual appearance. Thus, our method is unaffected by inconsistent or missing scan metadata. The scan types it can recognize include pre-contrast T1-weighted (T1w), post-contrast T1-weighted (T1wC), T2-weighted (T2w), proton density-weighted (PDw) and derived maps (e.g. apparent diffusion coefficient and cerebral blood flow). In a first experiment, we used scans of subjects with brain tumors: 11065 scans of 719 subjects for training, and 2369 scans of 192 subjects for testing. The CNN achieved an overall accuracy of 98.7%. In a second experiment, we trained the CNN on all 13434 scans from the first experiment and tested it on 7227 scans of 1318 Alzheimer's subjects. Here, the CNN achieved an overall accuracy of 98.5%. In conclusion, our method can accurately predict scan type, and can quickly and automatically sort a brain MRI dataset with virtually no need for manual verification. In this way, our method can assist with properly organizing a dataset, which maximizes the shareability and integrity of the data.
van der Voort Sebastian R, Smits Marion, Klein Stefan
BIDS, Brain imaging, DICOM, Data curation, Machine learning, Magnetic resonance imaging