In Journal of the Chinese Medical Association : JCMA
BACKGROUND : The worldwide prevalence of nonalcoholic fatty liver disease is increasing over time, with trends similar to those of diabetes and obesity. Liver biopsy, the gold standard for diagnosis, is not favored because of its invasiveness. Meanwhile, noninvasive methods for evaluating fatty liver are still either very expensive or show poor diagnostic performance, which limits their application. We developed neural network-based models to assess fatty liver and classify its severity using B-mode ultrasound (US) images.
METHODS : We followed the STARD guidelines to report this study. In this retrospective study, we used B-mode US images from a consecutive series of patients to develop four-class, two-class, and three-class diagnostic prediction models. Images were eligible if their classification was confirmed by at least two gastroenterologists. We compared pretrained convolutional neural network models, namely VGG19, ResNet-50 v2, MobileNet v2, Xception, and Inception v2. For validation, we used 20% of the dataset, resulting in >100 images for each severity category.
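The 20% validation hold-out described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say whether the split was stratified by class or grouped by patient, so the image-level stratified split, the function name `stratified_holdout`, and the fixed seed are all assumptions. Only the four class counts come from the RESULTS section.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, holdout_fraction=0.2, seed=0):
    """Split indices so each class contributes holdout_fraction to validation.

    Hypothetical helper: an image-level stratified split is assumed here;
    a patient-level split would be needed to avoid leakage across sets.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * holdout_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Class counts as reported in the abstract.
counts = {"normal": 11307, "mild": 4467, "moderate": 3155, "severe": 2926}
labels = [name for name, n in counts.items() for _ in range(n)]

train_idx, val_idx = stratified_holdout(labels)
val_counts = {
    name: sum(1 for i in val_idx if labels[i] == name) for name in counts
}
print(val_counts)
```

Under these assumptions every severity category keeps well over 100 validation images, which is consistent with the figure stated in METHODS.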
RESULTS : There were 21,855 images from 2,070 patients classified as normal (N = 11,307), mild (N = 4,467), moderate (N = 3,155), or severe steatosis (N = 2,926). We selected ResNet-50 v2 for the final models because it performed best. The areas under the receiver operating characteristic curves were 0.974 (mild steatosis vs. others), 0.971 (moderate steatosis vs. others), 0.981 (severe steatosis vs. others), 0.985 (any severity vs. normal), and 0.996 (moderate-to-severe steatosis/clinically abnormal vs. normal-to-mild steatosis/clinically normal).
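The "one class vs. others" areas under the curve reported above can be illustrated with a small sketch. It computes the AUROC from the Mann-Whitney rank statistic in plain Python. The function names `auroc` and `one_vs_rest_auroc`, the dictionary layout of the predicted probabilities, and the toy data are assumptions for illustration; the abstract does not describe how the authors computed these values.

```python
def auroc(scores, is_positive):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(scores) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum = sum(r for r, p in zip(ranks, is_positive) if p)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def one_vs_rest_auroc(probabilities, labels, target):
    """Score one class against all others using its predicted probability."""
    scores = [p[target] for p in probabilities]
    is_positive = [label == target for label in labels]
    return auroc(scores, is_positive)


# Toy predictions (hypothetical values, not study data).
probabilities = [
    {"normal": 0.7, "mild": 0.2, "moderate": 0.1},
    {"normal": 0.2, "mild": 0.6, "moderate": 0.2},
    {"normal": 0.5, "mild": 0.3, "moderate": 0.2},
    {"normal": 0.1, "mild": 0.2, "moderate": 0.7},
]
labels = ["normal", "mild", "normal", "moderate"]
mild_auc = one_vs_rest_auroc(probabilities, labels, "mild")
print(mild_auc)
```

A binary comparison such as "any severity vs. normal" could be scored the same way by passing one minus the predicted probability of the normal class as the score.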
CONCLUSION : Our deep learning models achieved predictive performance comparable to that of the most accurate, yet expensive, noninvasive diagnostic methods for fatty liver. Because of their discriminative ability, including for mild steatosis, a significant impact on clinical applications for fatty liver is expected. However, we still need to overcome machine-dependent variation, motion artifacts, the lack of a second confirmation from any other tool, and hospital-dependent regional bias.
Chou Tsung-Hsien, Yeh Hsing-Jung, Chang Chun-Chao, Tang Jui-Hsiang, Kao Wei-Yu, Su I-Chia, Li Chien-Hung, Chang Wei-Hao, Huang Chun-Kai, Sufriyana Herdiantri, Su Emily Chia-Yu