Static Analysis for AWS Best Practices in Python Code

Paper Title

Static Analysis for AWS Best Practices in Python Code

Authors

Rajdeep Mukherjee, Omer Tripp, Ben Liblit, Michael Wilson

Abstract

Amazon Web Services (AWS) is a comprehensive and broadly adopted cloud provider, offering over 200 fully featured services, including compute, database, storage, networking and content delivery, machine learning, Internet of Things and many others. AWS SDKs provide access to AWS services through API endpoints. However, incorrect use of these APIs can lead to code defects, crashes, performance issues, and other problems. This paper presents automated static analysis rules, developed in the context of a commercial service for detection of code defects and security vulnerabilities, to identify deviations from AWS best practices in Python applications that use the AWS SDK. Such applications use the AWS SDK for Python, called "Boto3", to access AWS cloud services. However, precise static analysis of Python applications that use cloud SDKs requires robust type inference for inferring the types of cloud service clients. The dynamic style of Boto3 APIs poses unique challenges for type resolution, as does the interprocedural style in which service clients are used in practice. In support of our best-practices goal, we present a layered strategy for type inference that combines multiple type-resolution and tracking strategies in a staged manner. From our experiments across >3,000 popular Python GitHub repos that make use of the AWS SDK, our layered type inference system achieves 85% precision and 100% recall in inferring Boto3 clients in Python client code. Additionally, we present a representative sample of eight AWS best-practice rules that detect a wide range of issues including pagination, polling, and batch operations. We have assessed the efficacy of these rules based on real-world developer feedback. Developers have accepted more than 85% of the recommendations made by five out of eight Python rules, and almost 83% of all recommendations.
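To make the idea of client type inference concrete, here is a minimal sketch of a pagination-style check. It is not the paper's implementation. It assumes a hypothetical table `PAGINATED_OPS` of operations that return paged results, resolves only the simplest case (a variable assigned directly from `boto3.client("<service>")` with a string literal), and does no interprocedural tracking, which the abstract identifies as one of the hard parts in practice. The function names `infer_clients` and `find_unpaginated_calls` are illustrative only.

```python
import ast

# Hypothetical table of operations that return paged results, keyed by service.
PAGINATED_OPS = {"s3": {"list_objects_v2"}, "dynamodb": {"scan", "query"}}


def infer_clients(tree):
    """Map variable names to service names for `x = boto3.client("svc")`."""
    clients = {}
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)):
            continue
        call = node.value
        func = call.func
        is_boto3_client = (
            isinstance(func, ast.Attribute)
            and func.attr == "client"
            and isinstance(func.value, ast.Name)
            and func.value.id == "boto3"
        )
        if not is_boto3_client or not call.args:
            continue
        arg = call.args[0]
        # Only a string literal service name can be resolved at this layer.
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    clients[target.id] = arg.value
    return clients


def find_unpaginated_calls(source):
    """Return (line, service, operation) for direct calls to paginated operations."""
    tree = ast.parse(source)
    clients = infer_clients(tree)
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            receiver = node.func.value
            if isinstance(receiver, ast.Name) and receiver.id in clients:
                service = clients[receiver.id]
                if node.func.attr in PAGINATED_OPS.get(service, set()):
                    findings.append((node.lineno, service, node.func.attr))
    return sorted(findings)


SAMPLE = '''
import boto3
s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket="example-bucket")
pages = s3.get_paginator("list_objects_v2").paginate(Bucket="example-bucket")
'''

if __name__ == "__main__":
    print(find_unpaginated_calls(SAMPLE))
```

On the sample, the direct `list_objects_v2` call is reported while the paginator-based call is not. A direct call is not necessarily a defect, since the caller may handle continuation tokens manually, so a real rule would need further checks before making a recommendation.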
