Paper Title

Open Source Software Development Challenges: A Systematic Literature Review on GitHub

Authors

Abdulkadir Şeker, Banu Diri, Halil Arslan, Mehmet Fatih Amasyalı

Abstract

Git is the distributed version control system used by many open-source software projects. GitHub, a Git-based service, is the most common code hosting and repository service for such projects. For researchers who study software engineering, the content hosted on these platforms is a valuable source of data. GitHub data can be obtained in several ways, including GitHub Archive, the GitHub API, and GHTorrent. Among these options, GHTorrent is the most widely known and used GitHub dataset in the literature. Although some review studies cover software engineering challenges on the GitHub platform, no review of research specific to the GHTorrent dataset is available. In this study, 172 studies that use GHTorrent as a data source were categorized within the scope of open-source software development challenges, and a systematic literature review was carried out. In addition, the pros and cons of the dataset are indicated, and the issues the literature focuses on and the open challenges are noted.
