Paper title
Doing data science with platforms crumbs: an investigation into fakes views on YouTube
Paper authors
Paper abstract
This paper contributes to the ongoing discussion on scholarly access to social media data by examining a case where such access is barred despite its value for understanding and countering online disinformation and despite the absence of privacy or copyright issues. Our study concerns YouTube's engagement metrics and, more specifically, the way in which the platform removes "fake views" (i.e., views that the platform considers artificial or illegitimate). Working with one and a half years of data extracted from a thousand French YouTube channels, we show the massive extent of this phenomenon, which concerns the large majority of the channels and more than half of the videos in our corpus. Our analysis indicates that most fake views are corrected relatively late in the life of the videos and that the final view counts of the videos are not independent of the fake views they received. We discuss the potential harm that delays in corrections could cause in content diffusion: by inflating view counts, illegitimate views could make a video appear more popular than it is and unwarrantedly encourage its human and algorithmic recommendation. Unfortunately, we cannot offer a definitive assessment of this phenomenon, because YouTube provides no information on fake views in its API or interface. This paper is, therefore, also a call for greater transparency by YouTube and other online platforms about information that can have crucial implications for the quality of online public debate.