ArXiv Preprint
Surgery is the only viable treatment for cataract patients with visual acuity
(VA) impairment. Clinically, assessing the necessity of cataract surgery
requires accurately predicting postoperative VA before surgery by analyzing
multi-view optical coherence tomography (OCT) images. Unfortunately,
due to complicated fundus conditions, determining postoperative VA remains
difficult for medical experts. Deep learning methods for this problem have been
developed in recent years. Although effective, these methods still face several
issues: they do not efficiently explore potential relations between
multi-view OCT images, they neglect the key role of clinical prior knowledge
(e.g., preoperative VA value), and they rely only on regression-based metrics,
which lack a point of reference. In this paper, we propose a novel Cross-token
Transformer Network (CTT-Net) for postoperative VA prediction by analyzing both
the multi-view OCT images and preoperative VA. To effectively fuse multi-view
features of OCT images, we develop cross-token attention that restricts
redundant/unnecessary attention flow. Further, we utilize the preoperative VA
value to provide more information for postoperative VA prediction and
facilitate fusion between views. Moreover, we design an auxiliary
classification loss to improve model performance and assess VA recovery more
comprehensively, avoiding the limitation of relying on regression metrics alone. To
evaluate CTT-Net, we build a multi-view OCT image dataset collected from our
collaborative hospital. Extensive experiments validate the
effectiveness of our model compared to existing methods across various metrics.
Code is available at: https://github.com/wjh892521292/Cataract OCT.
Jinhong Wang, Jingwen Wang, Tingting Chen, Wenhao Zheng, Zhe Xu, Xingdi Wu, Wen Xu, Haochao Ying, Danny Chen, Jian Wu
2022-12-12
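
The abstract describes pairing the VA regression objective with an auxiliary classification loss. The following is a minimal sketch of how such a joint objective could look, not the authors' implementation. Everything specific here is an assumption made for illustration: the three-way recovery label, the `margin` threshold, the squared-error regression term, the `aux_weight` factor, and a decimal VA scale on which higher values mean better vision.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recovery_class(pre_va, post_va, margin=0.1):
    """Hypothetical 3-way label: 0 = worse, 1 = unchanged, 2 = improved.

    Assumes a decimal VA scale (higher is better); `margin` is illustrative.
    """
    delta = post_va - pre_va
    if delta > margin:
        return 2
    if delta < -margin:
        return 0
    return 1


def joint_loss(pred_va, class_logits, pre_va, true_post_va, aux_weight=0.5):
    """Squared-error regression term plus a weighted cross-entropy term.

    The classification target is derived from the true VA change, so the
    auxiliary term rewards predicting the direction of recovery.
    """
    regression = (pred_va - true_post_va) ** 2
    target = recovery_class(pre_va, true_post_va)
    cross_entropy = -math.log(softmax(class_logits)[target])
    return regression + aux_weight * cross_entropy


# Example: preoperative VA 0.3, true postoperative VA 0.8 ("improved").
loss = joint_loss(pred_va=0.7, class_logits=[0.1, 0.2, 1.5],
                  pre_va=0.3, true_post_va=0.8)
```

In this sketch the class label gives the regression output a reference: a predicted VA can be read against the question of whether vision is expected to improve relative to the preoperative value, which is the kind of assessment a regression error alone does not provide.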