Paper Title

One Eye is All You Need: Lightweight Ensembles for Gaze Estimation with Single Encoders

Paper Authors

Rishi Athavale, Lakshmi Sritan Motati, Rohan Kalahasty

Paper Abstract

Gaze estimation has grown rapidly in accuracy in recent years. However, these models often fail to take advantage of different computer vision (CV) algorithms and techniques (such as small ResNet and Inception networks and ensemble models) that have been shown to improve results for other CV problems. Additionally, most current gaze estimation models require the use of either both eyes or an entire face, whereas real-world data may not always have both eyes in high resolution. Thus, we propose a gaze estimation model that implements the ResNet and Inception model architectures and makes predictions using only one eye image. Furthermore, we propose an ensemble calibration network that uses the predictions from several individual architectures for subject-specific predictions. With the use of lightweight architectures, we achieve high performance on the GazeCapture dataset with very low model parameter counts. When using two eyes as input, we achieve a prediction error of 1.591 cm on the test set without calibration and 1.439 cm with an ensemble calibration model. With just one eye as input, we still achieve an average prediction error of 2.312 cm on the test set without calibration and 1.951 cm with an ensemble calibration model. We also notice significantly lower errors on the right eye images in the test set, which could be important in the design of future gaze estimation-based tools.
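
The following is a minimal PyTorch sketch of the two ideas the abstract describes: a lightweight single-eye encoder that predicts an on-screen gaze point, and an ensemble calibration network that combines the predictions of several base models into a subject-specific estimate. The layer sizes, module names, and the three-model ensemble are illustrative assumptions, not the paper's exact lightweight ResNet/Inception architectures.

```python
# A minimal sketch (not the paper's exact architecture) of a single-eye gaze
# encoder and an ensemble calibration network. Layer sizes are illustrative.
import torch
import torch.nn as nn

class EyeEncoder(nn.Module):
    """Lightweight CNN mapping one eye crop to a 2D gaze point (x, y) in cm."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)   # (x, y) on the screen plane

    def forward(self, eye):            # eye: (B, 3, H, W)
        h = self.features(eye).flatten(1)
        return self.head(h)            # (B, 2)

class EnsembleCalibration(nn.Module):
    """Combines predictions from several base models into a subject-specific estimate."""
    def __init__(self, num_models):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * num_models, 32), nn.ReLU(),
            nn.Linear(32, 2),
        )

    def forward(self, preds):          # preds: list of (B, 2) tensors
        return self.mlp(torch.cat(preds, dim=1))

# Usage: three hypothetical base encoders feeding the calibration network.
encoders = [EyeEncoder() for _ in range(3)]
calib = EnsembleCalibration(num_models=3)
eye_crop = torch.randn(4, 3, 64, 64)               # batch of right-eye crops
gaze = calib([enc(eye_crop) for enc in encoders])  # (4, 2) gaze points in cm
```

In the paper's setting, the calibration network would be fit per subject on a small set of calibration points, which is what allows the ensemble to outperform any single uncalibrated encoder.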
