Paper Title
The Dataset Nutrition Label (2nd Gen): Leveraging Context to Mitigate Harms in Artificial Intelligence
Authors
Abstract
As the production of and reliance on datasets to produce automated decision-making systems (ADS) increases, so does the need for processes for evaluating and interrogating the underlying data. After launching the Dataset Nutrition Label in 2018, the Data Nutrition Project has made significant updates to the design and purpose of the Label, and is launching an updated Label in late 2020, which is previewed in this paper. The new Label includes context-specific Use Cases & Alerts presented through an updated design and user interface targeted towards the data scientist profile. This paper discusses the harm and bias from underlying training data that the Label is intended to mitigate, the current state of the work including new datasets being labeled, new and existing challenges, and further directions of the work, and includes figures previewing the new Label.