Accurate Clinical Toxicity Prediction using Multi-task Deep Neural Nets and Contrastive Molecular Explanations

Paper Title

Accurate Clinical Toxicity Prediction using Multi-task Deep Neural Nets and Contrastive Molecular Explanations

Authors

Bhanushee Sharma, Vijil Chenthamarakshan, Amit Dhurandhar, Shiranee Pereira, James A. Hendler, Jonathan S. Dordick, Payel Das

Abstract

Explainable ML for molecular toxicity prediction is a promising approach for efficient drug development and chemical safety. A predictive ML model of toxicity can reduce experimental cost and time while mitigating ethical concerns by significantly reducing animal and clinical testing. Herein, we use a deep learning framework for simultaneously modeling in vitro, in vivo, and clinical toxicity data. Two different molecular input representations are used: Morgan fingerprints and pre-trained SMILES embeddings. A multi-task deep learning model accurately predicts toxicity for all endpoints, including clinical, as indicated by AUROC and balanced accuracy. In particular, SMILES embeddings as input to the multi-task model improved clinical toxicity predictions compared to existing models in the MoleculeNet benchmark. Additionally, our multi-task approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo, and clinical platforms. Through both the multi-task model and transfer learning, we were able to indicate a minimal need for in vivo data in clinical toxicity predictions. To provide confidence and explain the model's predictions, we adapt a post-hoc contrastive explanation method that returns pertinent positive and pertinent negative features, which correspond well to known mutagenic and reactive toxicophores, such as unsubstituted bonded heteroatoms, aromatic amines, and Michael acceptors. Furthermore, toxicophore recovery by pertinent feature analysis captures more of the in vitro (53%) and in vivo (56%) endpoints than of the clinical (8%) endpoints, and indeed uncovers a preference in known toxicophore data towards in vitro and in vivo experimental data. To our knowledge, this is the first contrastive explanation, using both present and absent substructures, for predictions of clinical and in vivo molecular toxicity.
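The abstract reports AUROC and balanced accuracy for every endpoint of a multi-task model. As a rough illustration of how such per-endpoint scores can be computed, the following is a minimal sketch in plain Python, not the authors' evaluation code. It assumes binary labels, that a molecule may lack a label for some endpoints (marked `None` here), and a decision threshold of 0.5 for balanced accuracy. The function names, task names, and toy numbers are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> float:
    """Mean of sensitivity and specificity at the given (assumed) threshold."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("balanced accuracy needs both classes")
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return 0.5 * (tp / n_pos + tn / n_neg)


def per_task_metrics(label_rows: Sequence[Sequence[Optional[int]]],
                     score_rows: Sequence[Sequence[float]],
                     task_names: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Score each task separately, skipping molecules with no label for it."""
    results: Dict[str, Dict[str, float]] = {}
    for t, name in enumerate(task_names):
        ys: List[int] = []
        ss: List[float] = []
        for labels, scores in zip(label_rows, score_rows):
            if labels[t] is not None:  # unlabeled entries are masked out
                ys.append(labels[t])
                ss.append(scores[t])
        results[name] = {
            "auroc": auroc(ys, ss),
            "balanced_accuracy": balanced_accuracy(ys, ss),
        }
    return results


# Toy data: 5 molecules, 2 hypothetical endpoints; None marks a missing label.
TASKS = ["in_vitro", "clinical"]
LABELS = [[1, 0], [0, None], [1, 1], [0, 0], [None, 1]]
SCORES = [[0.9, 0.2], [0.3, 0.5], [0.6, 0.7], [0.7, 0.4], [0.1, 0.3]]

if __name__ == "__main__":
    for task, metrics in per_task_metrics(LABELS, SCORES, TASKS).items():
        print(task, metrics)
```

Masking unlabeled entries per endpoint is one common way to handle the sparse label matrices typical of combined in vitro, in vivo, and clinical datasets. Whether the paper evaluates its models this way is not stated in the abstract.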
