Paper Title
Avgust: Automating Usage-Based Test Generation from Videos of App Executions
Paper Authors
Abstract
Writing and maintaining UI tests for mobile apps is a time-consuming and tedious task. While decades of research have produced automated approaches for UI test generation, these approaches typically focus on testing for crashes or maximizing code coverage. By contrast, recent research has shown that developers prefer usage-based tests, which center around specific uses of app features, to help support activities such as regression testing. Very few existing techniques support the generation of such tests, as doing so requires automating the difficult task of understanding the semantics of UI screens and user inputs. In this paper, we introduce Avgust, which automates key steps of generating usage-based tests. Avgust uses neural models for image understanding to process video recordings of app uses to synthesize an app-agnostic state-machine encoding of those uses. Then, Avgust uses this encoding to synthesize test cases for a new target app. We evaluate Avgust on 374 videos of common uses of 18 popular apps and show that 69% of the tests Avgust generates successfully execute the desired usage, and that Avgust's classifiers outperform the state of the art.
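To illustrate the core idea of an app-agnostic state-machine encoding of app uses, here is a minimal, hypothetical sketch (not Avgust's actual implementation or data structures): states stand for semantic UI-screen labels, transitions for semantic user actions, and a test for a new target app is an action sequence along a path to an accepting state. All names (`UsageStateMachine`, the screen and action labels) are illustrative assumptions.

```python
# Hypothetical sketch of a usage state machine; screen/action labels
# and the class itself are illustrative, not taken from Avgust.
from collections import deque

class UsageStateMachine:
    def __init__(self):
        self.transitions = {}  # (state, action) -> next state
        self.accepting = set()  # states where the usage is complete

    def add_transition(self, state, action, next_state):
        self.transitions[(state, action)] = next_state

    def synthesize_test(self, start):
        """BFS for the shortest action sequence reaching an accepting state."""
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            state, actions = queue.popleft()
            if state in self.accepting:
                return actions
            for (src, action), nxt in self.transitions.items():
                if src == state and nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, actions + [action]))
        return None

# Example: a "sign in" usage distilled from videos of several apps.
m = UsageStateMachine()
m.add_transition("home", "tap_sign_in", "login")
m.add_transition("login", "enter_credentials", "login_filled")
m.add_transition("login_filled", "tap_submit", "signed_in")
m.accepting.add("signed_in")
print(m.synthesize_test("home"))
# -> ['tap_sign_in', 'enter_credentials', 'tap_submit']
```

In the real system, mapping the abstract actions back onto a new target app's concrete widgets is the hard part that the paper's neural UI-understanding models address; this sketch only shows the encoding and path search.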