Paper Title
Disentangled Wasserstein Autoencoder for T-Cell Receptor Engineering
Authors
Abstract
In protein biophysics, the separation between the functionally important residues (forming the active site or binding surface) and those that create the overall structure (the fold) is a well-established and fundamental concept. Identifying and modifying those functional sites is critical for protein engineering but computationally non-trivial, and requires significant domain knowledge. To automate this process from a data-driven perspective, we propose a disentangled Wasserstein autoencoder with an auxiliary classifier, which isolates the function-related patterns from the rest with theoretical guarantees. This enables one-pass protein sequence editing and improves the understanding of the resulting sequences and editing actions involved. To demonstrate its effectiveness, we apply it to T-cell receptors (TCRs), a well-studied structure-function case. We show that our method can be used to alter the function of TCRs without changing the structural backbone, outperforming several competing methods in generation quality and efficiency, and requiring only 10% of the running time needed by baseline models. To our knowledge, this is the first approach that utilizes disentangled representations for TCR engineering.