Paper title
Toward Human-AI Co-creation to Accelerate Material Discovery
Paper authors
Paper abstract
There is a growing need in our society for faster scientific progress to tackle urgent problems such as climate change, environmental hazards, sustainable energy systems, and pandemics. In certain domains, such as chemistry, scientific discovery carries the extra burden of assessing the risks of proposed novel solutions before moving to the experimental stage. Despite several recent advances in machine learning and AI that address some of these challenges, there is still a gap in technologies that support end-to-end discovery applications by integrating the myriad of available technologies into a coherent, orchestrated, yet flexible discovery process. Such applications need to handle complex knowledge management at scale, enabling subject matter experts (SMEs) to consume and produce knowledge in a timely and efficient way. Furthermore, the discovery of novel functional materials relies strongly on the development of exploration strategies in chemical space. For instance, generative models have gained attention within the scientific community due to their ability to generate enormous volumes of novel molecules across material domains. These models exhibit extreme creativity, which often translates into low viability of the generated candidates. In this work, we propose a workbench framework that aims to enable human-AI co-creation in order to reduce the time until the first discovery and the opportunity costs involved. The framework relies on a knowledge base containing domain and process knowledge, and on user-interaction components to acquire knowledge and advise the SMEs. Currently, the framework supports four main activities: generative modeling, dataset triage, molecule adjudication, and risk assessment.