Title
Surface Vision Transformers: Attention-Based Modelling applied to Cortical Analysis
Authors
Abstract
The extension of convolutional neural networks (CNNs) to non-Euclidean geometries has led to multiple frameworks for studying manifolds. Many of these methods have shown design limitations resulting in poor modelling of long-range associations, as the generalisation of convolutions to irregular surfaces is non-trivial. Motivated by the success of attention modelling in computer vision, we translate convolution-free vision transformer approaches to surface data, introducing a domain-agnostic architecture for studying any surface data projected onto a spherical manifold. Here, surface patching is achieved by representing spherical data as a sequence of triangular patches, extracted from a subdivided icosphere. A transformer model encodes the sequence of patches via successive multi-head self-attention layers while preserving the sequence resolution. We validate the performance of the proposed Surface Vision Transformer (SiT) on the task of phenotype regression from cortical surface metrics derived from the Developing Human Connectome Project (dHCP). Experiments show that the SiT generally outperforms surface CNNs, while performing comparably on registered and unregistered data. Analysis of transformer attention maps offers strong potential to characterise subtle cognitive developmental patterns.
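The core mechanism the abstract describes, encoding a sequence of surface patches with multi-head self-attention while keeping the sequence resolution unchanged, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the patch count, embedding dimension, and head count below are illustrative assumptions, and random matrices stand in for learned projection weights.

```python
import numpy as np

def multi_head_self_attention(x, num_heads, rng):
    """One self-attention layer over a sequence of patch embeddings.

    x: (N, D) array -- N patches, each embedded to dimension D.
    Returns an (N, D) array: the sequence resolution N is preserved.
    """
    N, D = x.shape
    assert D % num_heads == 0
    d_h = D // num_heads
    # Random projections stand in for learned query/key/value/output weights.
    Wq, Wk, Wv, Wo = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(4))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Split embeddings into heads: (num_heads, N, d_h).
    split = lambda t: t.reshape(N, num_heads, d_h).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    # Scaled dot-product attention per head: (num_heads, N, N) scores.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_h)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    # Attend, then merge heads back to (N, D) and project.
    out = (weights @ v).transpose(1, 0, 2).reshape(N, D)
    return out @ Wo

# Illustrative shapes (not from the paper): 320 triangular patches from a
# subdivided icosphere, each flattened and embedded to dimension 96.
rng = np.random.default_rng(0)
patches = rng.standard_normal((320, 96))
encoded = multi_head_self_attention(patches, num_heads=4, rng=rng)
print(encoded.shape)  # (320, 96) -- same sequence length in as out
```

Stacking several such layers (with residual connections, layer normalisation, and feed-forward blocks, omitted here for brevity) gives the successive multi-head self-attention encoding the abstract refers to.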