Paper Title
An Exploratory Study to Find Motives Behind Cross-platform Forks from Software Heritage Dataset
Paper Authors
Abstract
The fork-based development mechanism gives software teams flexibility and a unified process for collaborating in a distributed setting without much coordination overhead. Currently, multiple social coding platforms support fork-based development, such as GitHub, GitLab, and Bitbucket. Although these platforms offer virtually the same features, they differ in emphasis. Because GitHub is the most popular platform and its data is publicly available, most current studies focus on GitHub-hosted projects. However, we observed anecdotal evidence that people are confused about choosing among these platforms and that some projects are migrating from one platform to another; the reasons behind these activities remain unknown. With the advances of the Software Heritage Graph Dataset (SWHGD), we have the opportunity to investigate forking activities across platforms. In this paper, we conduct an exploratory study on 10 popular open-source projects to identify cross-platform forks and investigate the motivations behind them. Preliminary results show that cross-platform forks do exist. For the 10 subject systems in this study, we found 81,357 forks in total, of which 179 are on GitLab. Based on our qualitative analysis, most of the cross-platform forks that we identified are mirrors of repositories on another platform, but we also found cases that were created because of a preference for certain functionalities (e.g., Continuous Integration (CI)) supported by different platforms. This study lays the foundation for future research directions, such as understanding the differences between platforms and supporting cross-platform collaboration.