Paper Title
Re-contextualizing Fairness in NLP: The Case of India
Paper Authors
Paper Abstract
Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In this paper, we focus on NLP fairness in the context of India. We start with a brief account of the prominent axes of social disparities in India. We build resources for fairness evaluation in the Indian context and use them to demonstrate prediction biases along some of the axes. We then delve deeper into social stereotypes for Region and Religion, demonstrating their prevalence in corpora and models. Finally, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gaps in NLP capabilities and resources, and adapting to Indian cultural values. While we focus on India, this framework can be generalized to other geo-cultural contexts.