ArXiv Preprint
Gaze tracking is a valuable tool with a broad range of applications in
various fields, including medicine, psychology, virtual reality, marketing, and
safety. Therefore, it is essential to have gaze tracking software that is
cost-efficient and high-performing. Accurately predicting gaze remains a
difficult task, particularly in real-world situations where images are affected
by motion blur, video compression, and noise. Super-resolution has been shown
to improve image quality from a visual perspective. This work examines the
usefulness of super-resolution for improving appearance-based gaze tracking. We
show that not all super-resolution (SR) models preserve the gaze direction. We
propose a two-step framework based on the SwinIR super-resolution model. The proposed method
consistently outperforms the state-of-the-art, particularly in scenarios
involving low-resolution or degraded images. Furthermore, we examine the use of
super-resolution through the lens of self-supervised learning for gaze
prediction. Self-supervised learning aims to learn from unlabeled data to
reduce the amount of labeled data required for downstream tasks. We propose a
novel architecture called SuperVision, which fuses an SR backbone network with a
ResNet18 (with some skip connections). SuperVision uses five times less labeled
data and yet outperforms, by 15%, the state-of-the-art GazeTR method, which
uses 100% of the training data.
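
The two-step framework described above (super-resolve first, then estimate gaze) can be sketched as a simple composition of two functions. This is only an illustrative sketch, not the authors' implementation: `two_step_gaze`, `nearest_neighbor_upscale`, and the `Image`/`Gaze` types are hypothetical names, and the nearest-neighbor upscaler is a trivial stand-in for the SwinIR model used in the paper.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]   # grayscale image as rows of pixel values (assumed layout)
Gaze = Tuple[float, float]  # e.g. (pitch, yaw); units are left to the estimator


def nearest_neighbor_upscale(image: Image, scale: int = 2) -> Image:
    """Hypothetical stand-in for step 1. The paper uses SwinIR here;
    this simply repeats every pixel `scale` times in both directions."""
    return [
        [pixel for pixel in row for _ in range(scale)]
        for row in image
        for _ in range(scale)
    ]


def two_step_gaze(
    image: Image,
    super_resolve: Callable[[Image], Image],
    estimate_gaze: Callable[[Image], Gaze],
) -> Gaze:
    """Step 1: restore the low-resolution input.
    Step 2: run a gaze estimator on the restored image."""
    restored = super_resolve(image)
    return estimate_gaze(restored)
```

Passing the two stages as callables keeps the sketch agnostic about which SR model or gaze estimator is plugged in; any pair with matching input and output types could be substituted.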
Galen O’Shea, Majid Komeili
2023-03-17