Paper Title

AntBO: Towards Real-World Automated Antibody Design with Combinatorial Bayesian Optimisation

Paper Authors

Khan, Asif; Cowen-Rivers, Alexander I.; Grosnit, Antoine; Deik, Derrick-Goh-Xin; Robert, Philippe A.; Greiff, Victor; Smorodina, Eva; Rawat, Puneet; Dreczkowski, Kamil; Akbar, Rahmad; Tutunov, Rasul; Bou-Ammar, Dany; Wang, Jun; Storkey, Amos; Bou-Ammar, Haitham

Paper Abstract

Antibodies are canonically Y-shaped multimeric proteins capable of highly specific molecular recognition. The CDRH3 region, located at the tip of the variable chains of an antibody, dominates antigen-binding specificity. Designing optimal antigen-specific CDRH3 regions is therefore a priority in the development of therapeutic antibodies. However, the combinatorial nature of the CDRH3 sequence space makes it impossible to search for an optimal binding sequence exhaustively and efficiently using computational approaches. Here, we present AntBO: a combinatorial Bayesian optimisation framework enabling efficient in silico design of the CDRH3 region. Ideally, antibodies are expected to have high target specificity and developability. To achieve this goal, we introduce a CDRH3 trust region that restricts the search to sequences with favourable developability scores. For benchmarking, AntBO uses the Absolut! software suite as a black-box oracle to score the target specificity and affinity of designed antibodies in silico in an unconstrained fashion (Robert et al., 2021). Experiments on the 159 discretised antigens used in Absolut! demonstrate the benefit of AntBO in designing CDRH3 regions with diverse biophysical properties. In under 200 calls to the black-box oracle, AntBO can suggest antibody sequences that outperform both the best binding sequence drawn from 6.9 million experimentally obtained CDRH3s and a commonly used genetic algorithm baseline. Additionally, AntBO finds very-high-affinity CDRH3 sequences in only 38 protein designs whilst requiring no domain knowledge. We conclude that AntBO brings automated antibody design methods closer to what is practically viable for in vitro experimentation.
