Discriminative convolutional neural networks (CNNs), for which a voxel-wise
conditional Multinoulli distribution is assumed, have performed well in many
brain lesion segmentation tasks. When a trained discriminative CNN is used in
clinical practice, the patient's radiological features are fed into the
model, which then produces a conditional distribution over segmentations.
Capturing the uncertainty of the predictions can be useful in deciding whether
to abandon a model or to choose among competing models. In practice, however,
we never know the ground-truth segmentation, and therefore can never know the
true model variance. In this work, segmentation sampling on discriminative CNNs
is used to assess a trained model's robustness by analyzing the inter-sample
Dice distribution for a new patient based solely on their magnetic resonance
(MR) images. Furthermore, by demonstrating that the inter-sample Dice observations
are independent and identically distributed with a finite mean and variance
under certain conditions, a rigorous confidence-based decision rule is proposed
to decide whether to reject or accept a CNN model for a particular patient.
Applied to the ISLES 2015 (SISS) dataset, the proposed approach identified 7
predictions as non-robust, and the average Dice coefficient calculated on the
remaining brains improved by 12 percent.
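
The idea of segmentation sampling followed by a confidence-based accept/reject
decision can be illustrated with a minimal sketch. It is not the authors'
implementation: the function names, the use of disjoint sample pairs to obtain
independent Dice observations, the normal-approximation lower confidence bound,
and the values `n_pairs=50`, `threshold=0.7` and `confidence=0.95` are all
assumptions chosen for illustration. The input `probs` stands for the voxel-wise
class probabilities that a trained CNN would output for one patient.

```python
from statistics import NormalDist

import numpy as np


def sample_segmentations(probs, n_samples, rng):
    """Draw label maps from voxel-wise categorical (Multinoulli) distributions.

    probs: array of shape (V, K), each row summing to 1.
    Returns an integer array of shape (n_samples, V).
    """
    # Inverse-CDF sampling: count how many cumulative thresholds u exceeds.
    cdf = np.cumsum(probs, axis=1)[:, :-1]
    u = rng.random((n_samples, probs.shape[0], 1))
    return (u > cdf[None, :, :]).sum(axis=2)


def dice(a, b, label=1):
    """Dice coefficient between two label maps for one class."""
    a, b = (a == label), (b == label)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both maps empty for this class: treat as full agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


def robustness_decision(probs, n_pairs=50, threshold=0.7,
                        confidence=0.95, seed=0):
    """Accept the model for this patient if a one-sided lower confidence
    bound on the mean inter-sample Dice reaches `threshold` (hypothetical rule).
    """
    rng = np.random.default_rng(seed)
    samples = sample_segmentations(probs, 2 * n_pairs, rng)
    # Disjoint pairs (0,1), (2,3), ... so that no sample is reused across
    # Dice observations.
    scores = np.array([dice(samples[2 * i], samples[2 * i + 1])
                       for i in range(n_pairs)])
    z = NormalDist().inv_cdf(confidence)
    lower = scores.mean() - z * scores.std(ddof=1) / np.sqrt(n_pairs)
    return bool(lower >= threshold), float(scores.mean()), float(lower)


if __name__ == "__main__":
    # Toy "patient": 2000 voxels, 2 classes, maximally uncertain predictions.
    uncertain = np.full((2000, 2), 0.5)
    print(robustness_decision(uncertain))
```

No ground-truth segmentation appears anywhere in the sketch: the decision
depends only on how much the sampled segmentations agree with one another,
which is what allows such a rule to be evaluated on a new patient.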