Paper title
SiteFerret: beyond simple pocket identification in proteins
Paper authors
Paper abstract
We present a novel method for the automatic detection of pockets on protein molecular surfaces. The algorithm is based on an ad hoc hierarchical clustering of virtual SES probe spheres obtained from the geometrical primitives generated by the NanoShaper software. The final ranking of putative pockets is based on the Isolation Forest method, an unsupervised learning approach originally developed for anomaly detection. A detailed importance analysis of pocket features provides insight into which geometrical (clustering) and chemical (residue) properties characterize a good binding site. The method also segments pockets into smaller subpockets. We show that subpockets are a reliable representation that pinpoints the binding site with greater precision. SiteFerret is notably versatile, accurately predicting a wide range of binding sites, from small molecules to peptides and difficult shallow sites.
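To illustrate the scoring idea behind the ranking step, the following is a minimal pure-Python sketch of a generic Isolation Forest. It is not the SiteFerret implementation: the two-dimensional feature vectors, the function names (`c_factor`, `build_tree`, `path_length`, `anomaly_scores`) and the parameter defaults are all assumptions made for illustration, and the abstract does not state which pocket features are used or how the scores are mapped to a final rank.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(rows)}
    threshold = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < threshold]
    right = [r for r in rows if r[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, row, depth=0):
    """Depth at which row lands in a leaf, corrected for unsplit leaf size."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if row[tree["feature"]] < tree["threshold"] else "right"
    return path_length(tree[branch], row, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score in (0, 1] per row; higher means easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for row in rows:
        mean_depth = sum(path_length(t, row) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Hypothetical feature vectors: a tight group of 20 candidates plus one outlier.
candidates = [(0.1 * (i % 5), 0.1 * (i // 5)) for i in range(20)]
candidates.append((5.0, 5.0))
scores = anomaly_scores(candidates)
# Candidate indices ordered from most typical to most anomalous.
ranking = sorted(range(len(candidates)), key=lambda i: scores[i])
```

In this sketch, candidates whose features resemble the bulk of the data receive low scores and appear first in `ranking`, while the isolated candidate appears last. Whether a real pipeline treats typical or atypical candidates as the better binding sites depends on what data the forest is fitted to, which this example leaves open.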