Paper Title

Amino Acid Classification in 2D NMR Spectra via Acoustic Signal Embeddings

Authors

Jia Qi Yip, Dianwen Ng, Bin Ma, Konstantin Pervushin, Eng Siong Chng

Abstract

Nuclear magnetic resonance (NMR) is used in structural biology to experimentally determine the structure of proteins, which is used in many areas of biology and is an important part of drug development. Unfortunately, NMR data can cost thousands of dollars per sample to collect, and it can take a specialist weeks to assign the observed resonances to specific chemical groups. There has thus been growing interest in the NMR community in using deep learning to automate NMR data annotation. Due to similarities between NMR and audio data, we propose that methods used in acoustic signal processing can be applied to NMR as well. Using a simulated amino acid dataset, we show that, by swapping out filter banks for a trainable convolutional encoder, acoustic signal embeddings from speaker verification models can be used for amino acid classification in 2D NMR spectra by treating each amino acid as a unique speaker. On an NMR dataset comparable in size to 46 hours of audio, we achieve a classification performance of 97.7% on a 20-class problem. We also achieve a 23% relative improvement by using an acoustic embedding model compared to an existing NMR-based model.