In European radiology ; h5-index 62.0
OBJECTIVES : We aimed to develop and validate a deep learning system (DLS) with an auxiliary section that extracts and outputs specific ultrasound diagnostic features, in order to improve the explainability and clinically relevant utility of DLS for detecting nonalcoholic fatty liver disease (NAFLD).
METHODS : In a community-based study of 4144 participants who underwent abdominal ultrasound scans in Hangzhou, China, we sampled 928 participants (617 [66.5%] females, mean age: 56 years ± 13 [standard deviation]; 2 images per participant) to develop and validate the DLS, a two-section neural network (2S-NNet). Radiologists' consensus diagnosis classified hepatic steatosis as none, mild, moderate, or severe. We also explored the NAFLD detection performance of six one-section neural network models and five fatty liver indices on our data set. We further used logistic regression to evaluate the influence of participants' characteristics on the correctness of 2S-NNet.
RESULTS : The area under the receiver operating characteristic curve (AUROC) of 2S-NNet for hepatic steatosis was 0.90 for ≥ mild, 0.85 for ≥ moderate, and 0.93 for severe steatosis, and was 0.90 for NAFLD presence, 0.84 for moderate to severe NAFLD, and 0.93 for severe NAFLD. The AUROC for NAFLD severity was 0.88 for 2S-NNet and 0.79-0.86 for one-section models. The AUROC for NAFLD presence was 0.90 for 2S-NNet and 0.54-0.82 for fatty liver indices. Age, sex, body mass index, diabetes, fibrosis-4 index, android fat ratio, and skeletal muscle via dual-energy X-ray absorptiometry had no significant impact on the correctness of 2S-NNet (p > 0.05).
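To illustrate how an ordinal four-level grading can yield separate AUROCs at the ≥ mild, ≥ moderate, and severe cut-offs, the following is a minimal sketch, not the authors' evaluation code. It assumes each participant has one continuous model score (a hypothetical simplification), binarizes the consensus grade at each cut-off, and computes the AUROC as the probability that a positive case outscores a negative one, with ties counted as one half. The names `GRADES`, `auroc`, `auroc_at_threshold`, and the toy data are all hypothetical.

```python
from typing import Sequence

# Ordinal coding of the consensus steatosis grade (assumed encoding).
GRADES = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison: P(score_pos > score_neg), ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_at_threshold(grades: Sequence[str], scores: Sequence[float],
                       min_grade: str) -> float:
    """Binarize ordinal grades at `min_grade` (positive if grade >= cut-off)."""
    cut = GRADES[min_grade]
    labels = [1 if GRADES[g] >= cut else 0 for g in grades]
    return auroc(labels, scores)


# Toy data for illustration only; these are not values from the study.
toy_grades = ["none", "none", "mild", "moderate", "severe", "mild"]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.90, 0.60]

if __name__ == "__main__":
    for cutoff in ("mild", "moderate", "severe"):
        print(cutoff, auroc_at_threshold(toy_grades, toy_scores, cutoff))
```

The pairwise form is quadratic in sample size, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for much larger data sets.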
CONCLUSIONS : With its two-section design, 2S-NNet improved performance for detecting NAFLD and offered more explainable, clinically relevant utility than one-section designs.
KEY POINTS : • Based on radiologists' consensus review, our DLS (2S-NNet) achieved an AUROC of 0.88 with its two-section design and detected NAFLD better than one-section designs, with more explainable, clinically relevant utility. • The 2S-NNet outperformed five fatty liver indices, with the highest AUROCs (0.84-0.93 vs. 0.54-0.82) for screening at different NAFLD severities, suggesting that deep learning-based radiology may be a better screening tool than blood biomarker panels in epidemiological studies. • The correctness of 2S-NNet was not significantly influenced by individual characteristics, including age, sex, body mass index, diabetes, fibrosis-4 index, android fat ratio, and skeletal muscle via dual-energy X-ray absorptiometry.
Yang Yang, Liu Jing, Sun Changxuan, Shi Yuwei, Hsing Julianna C, Kamya Aya, Keller Cody Auston, Antil Neha, Rubin Daniel, Wang Hongxia, Ying Haochao, Zhao Xueyin, Wu Yi-Hsuan, Nguyen Mindie, Lu Ying, Yang Fei, Huang Pinton, Hsing Ann W, Wu Jian, Zhu Shankuan
2023-Mar-09
Convolutional neural networks, Deep learning, Fatty liver indices, Nonalcoholic fatty liver disease, Ultrasound imaging