In PLoS neglected tropical diseases ; h5-index 79.0

BACKGROUND : Though significant progress in disease elimination has been made over the past decades, trachoma remains the leading infectious cause of blindness globally. Further efforts in trachoma elimination are paradoxically limited by the relative rarity of the disease, which makes clinical training for monitoring surveys difficult. In this work, we evaluate the feasibility of using an Artificial Intelligence model to augment or replace human image graders in the evaluation/diagnosis of trachomatous inflammation-follicular (TF).

METHODS : We utilized a dataset consisting of 2300 images with a 5% positivity rate for TF. We developed classifiers by implementing two state-of-the-art Convolutional Neural Network architectures, ResNet101 and VGG16, and applying a suite of data augmentation/oversampling techniques to the positive images. We then augmented our data set with additional images from independent research groups and evaluated performance.
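As an illustration of the basic oversampling and augmentation idea described above, the following is a minimal sketch, not the authors' pipeline. It assumes images are held in a NumPy array of shape (N, height, width, channels) with binary labels; the function name `oversample_positives`, the `target_ratio` parameter, and the choice of random horizontal flips as the only augmentation are all hypothetical choices made for this example.

```python
import numpy as np

def oversample_positives(images, labels, target_ratio=0.5, seed=0):
    """Resample positive images with replacement, randomly flipping the
    copies, until positives make up roughly `target_ratio` of the data.

    Assumes `images` has shape (N, H, W, C) and `labels` holds 0/1 values.
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(labels == 1)
    neg_idx = np.flatnonzero(labels == 0)
    if len(pos_idx) == 0:
        raise ValueError("no positive images to oversample")

    # Positives needed so that pos / (pos + neg) is about target_ratio.
    n_pos_target = int(round(target_ratio * len(neg_idx) / (1.0 - target_ratio)))
    n_extra = max(0, n_pos_target - len(pos_idx))

    # Draw positive images with replacement and copy them.
    drawn = rng.choice(pos_idx, size=n_extra, replace=True)
    extra = images[drawn].copy()

    # Flip about half of the copies left-right (axis 2 is width).
    flip = rng.random(n_extra) < 0.5
    extra[flip] = extra[flip][:, :, ::-1]

    new_images = np.concatenate([images, extra], axis=0)
    new_labels = np.concatenate([labels, np.ones(n_extra, dtype=labels.dtype)])
    return new_images, new_labels

# Placeholder data matching the stated data set size: 2300 images, 5% positive.
images = np.zeros((2300, 4, 4, 3), dtype=np.uint8)
labels = np.zeros(2300, dtype=np.int64)
labels[:115] = 1
balanced_images, balanced_labels = oversample_positives(images, labels)
```

In practice the oversampling would be applied to the training split only, so that duplicated positives do not leak into the evaluation data.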

RESULTS : Models performed well in minimizing the number of false negatives, given the constraint of the low numbers of images in which TF was present. The best performing models achieved a sensitivity of 95% and positive predictive value of 50-70% while reducing the number of images requiring skilled grading by 66-75%. Basic oversampling and data augmentation techniques were most successful at improving model performance, while techniques that are grounded in clinical experience, such as highlighting follicles, were less successful.
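To make the reported quantities concrete, the sketch below computes sensitivity, positive predictive value, and the share of images that would not need skilled grading from a 2x2 confusion matrix. It assumes that only images the model flags as positive are passed on to a grader, which is one plausible reading of the workload reduction and not something the abstract states. The counts are hypothetical, chosen only to fall within the reported ranges, and the function name `screening_metrics` is invented for this example.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, PPV, and workload reduction for a screening model.

    Workload reduction is taken here as the fraction of images the model
    labels negative (TN + FN), under the assumption that only
    model-positive images are forwarded to a skilled grader.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "workload_reduction": (tn + fn) / total,
    }

# Hypothetical counts, not taken from the paper.
metrics = screening_metrics(tp=19, fp=13, fn=1, tn=67)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```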

DISCUSSION : The developed models perform well and significantly reduce the burden on graders by minimizing the number of false negative identifications. Further improvements in model skill will benefit from data sets with more TF-positive images as well as a wider range of image quality and image capture techniques. While these models approach/meet the community-accepted standard for skilled field graders (i.e., Cohen's Kappa >0.7), they are insufficient to be deployed independently/clinically at this time; rather, they can be utilized to significantly reduce the burden on skilled image graders.
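For reference, Cohen's Kappa measures agreement between two raters after correcting for the agreement expected by chance. The sketch below computes it for a model and a reference grader from a 2x2 table using the standard formula kappa = (p_o - p_e) / (1 - p_e). The function name `cohens_kappa` and the counts are hypothetical and are not results from the paper.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's Kappa between model and reference grader from 2x2 counts.

    tp: both positive, tn: both negative,
    fp: model positive / grader negative, fn: model negative / grader positive.
    """
    n = tp + fp + fn + tn
    # Observed agreement: fraction of images where both raters agree.
    p_o = (tp + tn) / n
    # Chance agreement from each rater's marginal positive/negative rates.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((tn + fn) / n) * ((tn + fp) / n)
    p_e = p_pos + p_neg
    if p_e == 1.0:
        # Both raters always give the same single label; Kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical counts, not taken from the paper.
kappa = cohens_kappa(tp=19, fp=13, fn=1, tn=67)
print(f"kappa = {kappa:.3f}")
```

Because Kappa depends on the marginal rates, a model with high sensitivity can still score below 0.7 when false positives are common relative to the number of true positives, which is typical at low prevalence.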

Socia Damien, Brady Christopher J, West Sheila K, Cockrell R Chase

2022-Dec-07