Paper Title

The network footprint of replication in popular DBMSs

Authors

Shehzad, Muhammad Karam; Yousif, Jam Muhammad; Ilyas, Muhammad Saqib; Iqbal, Adnan

Abstract

Database replication is an important component of reliable, disaster-tolerant, and highly available distributed systems. However, data replication also causes communication and processing overhead. Quantifying these overheads is crucial for choosing a suitable DBMS from the several available options and for capacity planning. In this paper, we present results from a comparative empirical analysis of the replication activities of three commonly used DBMSs (MySQL, PostgreSQL, and Cassandra) under text as well as image traffic. In our experiments, the total traffic with two replicas (which is the norm) was as much as 300% higher than the total traffic with no replica. Furthermore, activating the compression option for replication traffic, built into MySQL, reduced the total network traffic by as much as 20%. We also found that average CPU utilization and memory utilization were not impacted by the number of replicas or the dataset.
