Paper title
Dark patterns in e-commerce: a dataset and its baseline evaluations
Paper authors
Paper abstract
Dark patterns are user interface designs in online services that induce users to take unintended actions. Recently, dark patterns have been raised as an issue of privacy and fairness, so research on detecting them is in demand. In this work, we constructed a dataset for dark pattern detection and established its baseline detection performance with state-of-the-art machine learning methods. The original dataset was obtained from the 2019 study by Mathur et al. and consists of 1,818 dark pattern texts from shopping sites. We then added negative samples, i.e., non-dark pattern texts, by retrieving texts from the same websites as in Mathur et al.'s dataset. As baselines for automatic detection accuracy, we applied BERT, RoBERTa, ALBERT, and XLNet. In 5-fold cross-validation, RoBERTa achieved the highest accuracy, 0.975. The dataset and baseline source code are available at https://github.com/yamanalab/ec-darkpattern.
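The evaluation protocol named in the abstract, 5-fold cross-validation over texts labeled as dark pattern (1) or non-dark pattern (0), can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names are hypothetical, stratifying the folds by class is an assumption the abstract does not state, and `train_keyword_baseline` is a toy stand-in for a fine-tuned model such as RoBERTa.

```python
import random
from collections import defaultdict
from typing import Callable, List, Sequence

Predictor = Callable[[str], int]
Trainer = Callable[[Sequence[str], Sequence[int]], Predictor]


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds with roughly equal class ratios.

    Stratification is an assumption of this sketch, not a detail from the abstract.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    if any(not fold for fold in folds):
        raise ValueError("need at least k samples to fill every fold")
    return folds


def cross_validated_accuracy(
    texts: Sequence[str],
    labels: Sequence[int],
    train: Trainer,
    k: int = 5,
    seed: int = 0,
) -> float:
    """Train on k-1 folds, test on the held-out fold, and average the accuracies."""
    scores = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(texts)) if i not in held]
        predict = train([texts[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(texts[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


def train_keyword_baseline(texts: Sequence[str], labels: Sequence[int]) -> Predictor:
    """Hypothetical stand-in classifier: flag words seen only in positive texts."""
    positive_words, negative_words = set(), set()
    for text, label in zip(texts, labels):
        target = positive_words if label == 1 else negative_words
        target.update(text.lower().split())
    cues = positive_words - negative_words
    return lambda text: int(any(word in cues for word in text.lower().split()))
```

Any trainer with the same signature can replace `train_keyword_baseline`, for example a wrapper that fine-tunes a transformer on the training folds and returns its prediction function.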