Paper title
Keypoints Tracking via Transformer Networks
Paper authors
Paper abstract
In this thesis, we present pioneering work on tracking sparse keypoints across images using transformer networks. Although deep learning-based keypoint matching has been widely investigated using graph neural networks and, more recently, transformer networks, these methods remain too slow to operate in real time and are particularly sensitive to the poor repeatability of keypoint detectors. To address these shortcomings, we study the particular case of real-time, robust keypoint tracking. Specifically, we propose a novel architecture that provides fast and robust estimation of keypoint tracks between successive images of a video sequence. Our method builds on a recent breakthrough in computer vision, namely visual transformer networks, and consists of two successive stages: a coarse matching stage followed by a fine localization of the predicted keypoint correspondences. Through various experiments, we demonstrate that our approach achieves competitive results and is highly robust to adverse conditions such as illumination changes, occlusion, and viewpoint differences.
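The two-stage idea in the abstract (coarse matching, then fine localization) can be illustrated with a minimal NumPy sketch. This is not the architecture proposed in the paper: plain dot-product similarity stands in for the transformer networks, and every name here (`coarse_match`, `fine_localize`, `track_keypoint`, the `cell` size) is a hypothetical choice made only for illustration.

```python
import numpy as np


def coarse_match(query_desc, coarse_grid):
    """Pick the coarse cell whose descriptor is most similar to the query.

    query_desc: (D,) descriptor of the keypoint in the previous frame.
    coarse_grid: (Gh, Gw, D) pooled descriptors of the current frame.
    """
    scores = coarse_grid @ query_desc  # (Gh, Gw) similarity map
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col)


def fine_localize(query_desc, patch):
    """Soft-argmax over a fine descriptor patch, giving a sub-cell (y, x)."""
    scores = patch @ query_desc              # (h, w) similarity map
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    ys, xs = np.mgrid[0:scores.shape[0], 0:scores.shape[1]]
    return float((weights * ys).sum()), float((weights * xs).sum())


def track_keypoint(query_desc, desc_map, cell=8):
    """Coarse-to-fine localization of one keypoint in a dense descriptor map.

    desc_map: (H, W, D) per-pixel descriptors of the current frame
    (assumed to come from some feature extractor, not specified here).
    """
    H, W, D = desc_map.shape
    gh, gw = H // cell, W // cell
    # Stage 1: average-pool descriptors into cells and match coarsely.
    coarse = (
        desc_map[: gh * cell, : gw * cell]
        .reshape(gh, cell, gw, cell, D)
        .mean(axis=(1, 3))
    )
    row, col = coarse_match(query_desc, coarse)
    # Stage 2: refine the position inside the selected cell only.
    patch = desc_map[row * cell:(row + 1) * cell, col * cell:(col + 1) * cell]
    dy, dx = fine_localize(query_desc, patch)
    return row * cell + dy, col * cell + dx
```

The design point the sketch shares with the abstract is that the expensive fine search runs only inside the region selected by the coarse stage, which is what keeps a coarse-to-fine scheme cheap enough to be considered for real-time use.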