Paper Title
Gender and Racial Stereotype Detection in Legal Opinion Word Embeddings
Paper Authors
Paper Abstract
Studies have shown that some Natural Language Processing (NLP) systems encode and replicate harmful biases with potential adverse ethical effects in our society. In this article, we propose an approach for identifying gender and racial stereotypes in word embeddings trained on judicial opinions from U.S. case law. Embeddings containing stereotype information may cause harm when used by downstream systems for classification, information extraction, question answering, or other machine learning systems used to build legal research tools. We first explain how previously proposed methods for identifying these biases are not well suited for use with word embeddings trained on legal opinion text. We then propose a domain adapted method for identifying gender and racial biases in the legal domain. Our analyses using these methods suggest that racial and gender biases are encoded into word embeddings trained on legal opinions. These biases are not mitigated by exclusion of historical data, and appear across multiple large topical areas of the law. Implications for downstream systems that use legal opinion word embeddings and suggestions for potential mitigation strategies based on our observations are also discussed.
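For illustration, below is a minimal sketch of the kind of embedding association test (in the style of WEAT, Caliskan et al., 2017) that this line of work builds on and that the abstract argues is not well suited to legal opinion text; it is not the paper's domain-adapted method. All vectors and word lists here are hypothetical toy stand-ins, not embeddings actually trained on case law.

```python
# Sketch of a WEAT-style association test: measures whether target word sets
# X and Y associate differently with attribute sets A and B in an embedding.
# Toy random vectors stand in for embeddings trained on judicial opinions.
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B, emb):
    # s(w, A, B): mean similarity of w to attribute set A minus mean to B.
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    # Effect size d: standardized difference in how X and Y associate with A vs. B.
    sx = [association(w, A, B, emb) for w in X]
    sy = [association(w, A, B, emb) for w in Y]
    pooled = np.std(sx + sy, ddof=1)
    return (np.mean(sx) - np.mean(sy)) / pooled

# Hypothetical tiny embedding table (random 50-dim vectors) for demonstration.
rng = np.random.default_rng(0)
vocab = ["judge", "nurse", "he", "man", "she", "woman"]
emb = {w: rng.normal(size=50) for w in vocab}

d = weat_effect_size(X=["judge"], Y=["nurse"],
                     A=["he", "man"], B=["she", "woman"], emb=emb)
print(f"WEAT-style effect size: {d:.3f}")
```

A test like this depends on word lists calibrated to general-purpose corpora; one motivation the abstract gives for a domain-adapted method is that such lists and assumptions may not transfer to the vocabulary and usage patterns of U.S. judicial opinions.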