Authors: Huansheng Chen (Testing Engineer at PingCAP), Yuying Song (Testing Engineer at PingCAP)
Many organizations today are looking for scale-out infrastructure to support their growing businesses. A popular option is the Arm® architecture, which is known for delivering strong performance at a good price-performance ratio. In 2020, Amazon Elastic Kubernetes Service (EKS) on AWS Graviton2 became generally available. Amazon EKS provides the flexibility to start, run, and scale Kubernetes applications on AWS. Earlier this year, Microsoft announced Azure VMs with Ampere Altra Arm-based processors. More recently, on July 13, 2022, Google Cloud announced the preview of its first VM family based on the Arm architecture, the Tau T2A, for scale-out, cloud-native workloads.
As the team behind TiDB, a cloud-native distributed SQL database that is ACID-compliant and strongly consistent, we are especially interested in the price-performance ratio of running TiDB on Google Cloud Arm64 and AWS Arm64 instances. Therefore, we conducted a quick benchmarking test.
Benchmarking
Methodology
Our test used sysbench, an industry-standard OLTP benchmark. Sysbench is a well-established tool that runs synthetic benchmarks against MySQL and the hardware it runs on. Since TiDB is MySQL compatible, sysbench is a good reference.
Testing environment
In our benchmark, we deployed TiDB on AWS Graviton2 and GCP Tau T2A. The detailed topology and software information are listed below.
Topology
The diagram below shows the typical topology of the TiDB cluster.
Architecture Diagram of TiDB cluster
A TiDB cluster has four main components: the TiDB server, TiKV server, PD server, and TiFlash server.
The TiDB server is the stateless SQL layer that is compatible with MySQL. It does not store data; it handles computing and SQL analysis, and forwards actual data read requests to TiKV nodes. That is why we chose c6g.2xlarge EC2 instances on AWS: they are compute-optimized machines. On GCP, Tau T2A is the first Compute Engine VM family to run on Arm, so t2a-standard-8 was our only option.
The TiKV server is responsible for storing data. TiKV is a distributed transactional key-value storage engine, and data is distributed across all the TiKV nodes. Since TiKV performs many data processing operations (such as table scans) that need to cache data in memory, we selected memory-optimized r6g.2xlarge instances on AWS. We used Graviton2 instead of Graviton3 because, at the time of testing, only the compute-optimized C7 series was available on Graviton3; the memory-optimized R series was not yet ready.
PD server manages the cluster’s metadata. It stores the metadata of real-time data distribution of every TiKV node and the topology structure of the entire TiDB cluster. It uses minimal computing resources; therefore we use c6g.large on AWS and t2a-standard-4 on GCP.
The TiFlash server is a columnar storage extension of TiKV that provides both a good isolation level and a strong consistency guarantee. As sysbench is a pure OLTP workload, TiFlash was not deployed in this quick test.
The tests used two clusters with the following instance types and configurations. To avoid extra network transfer costs, all components were deployed within the same availability zone:
Cluster | Service type | VM type | vCPU, Mem | #Instance | Storage type | TiKV IOPS | Storage (GB) | TiKV Throughput (MBps) |
8c GCP ARM | TiDB | t2a-standard-8 | 8c, 32g | 3 | SSD Persistent Disk | – | 50 | – |
8c GCP ARM | TiKV | t2a-standard-8 | 8c, 32g | 3 | SSD Persistent Disk | – | 500 | – |
8c GCP ARM | PD | t2a-standard-4 | 4c, 16g | 1 | SSD Persistent Disk | – | 50 | – |
8c AWS ARM | TiDB | c6g.2xlarge | 8c, 16g | 3 | GP3 | – | – | – |
8c AWS ARM | TiKV | r6g.2xlarge | 8c, 64g | 3 | GP3 | 4000 | 500 | 288 |
8c AWS ARM | PD | c6g.large | 4c, 8g | 1 | GP2 | – | 50 | – |
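To make the two topologies easier to compare at a glance, the table can be restated as a small data structure and summed. The following is a minimal sketch in Python; the `NodeGroup` type and the `totals` helper are hypothetical names introduced only for illustration, and the vCPU and memory figures are taken as listed in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One row of the topology table (hypothetical helper type)."""
    service: str
    vm_type: str
    vcpus: int
    mem_gb: int
    count: int


# Values copied from the topology table above.
CLUSTERS = {
    "8c GCP ARM": [
        NodeGroup("TiDB", "t2a-standard-8", 8, 32, 3),
        NodeGroup("TiKV", "t2a-standard-8", 8, 32, 3),
        NodeGroup("PD", "t2a-standard-4", 4, 16, 1),
    ],
    "8c AWS ARM": [
        NodeGroup("TiDB", "c6g.2xlarge", 8, 16, 3),
        NodeGroup("TiKV", "r6g.2xlarge", 8, 64, 3),
        NodeGroup("PD", "c6g.large", 4, 8, 1),
    ],
}


def totals(groups):
    """Return (total vCPUs, total memory in GB) across all node groups."""
    vcpus = sum(g.vcpus * g.count for g in groups)
    mem_gb = sum(g.mem_gb * g.count for g in groups)
    return vcpus, mem_gb


for name, groups in CLUSTERS.items():
    vcpus, mem_gb = totals(groups)
    print(f"{name}: {vcpus} vCPUs, {mem_gb} GB memory")
```

With the table's figures, both clusters come to the same vCPU count, while the AWS cluster has more total memory because of the memory-optimized TiKV nodes.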
Software version
The software versions of the TiDB cluster and the benchmarking tools are listed below:
Service type | Software version |
TiDB | v6.1.0 |
TiKV | v6.1.0 |
PD | v6.1.0 |
sysbench | 1.1.0-df89d34 |
Cost
The table below summarizes the cost breakdown per component and in total:
GCP Tau Arm (us-central Iowa)
AWS Graviton2 Arm (US west Oregon)
The total hourly costs of the two TiDB clusters are close. The cost of data transfer across multiple Availability Zones is not taken into consideration here.
Sysbench benchmark
Sysbench is one of the most popular open-source benchmark tools to test database systems. It provides statistics including workload, queries per second (QPS), transactions per second (TPS), and latency.
The read/write workload split information is listed below.
Workload
- Three workload types were run:
  - oltp_read_write
  - oltp_read_only
  - oltp_write_only
- Threads: 100
- Tables: 16
- Table size: 10 M rows per table
- Data size: ~40 GB
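The parameters above map directly onto sysbench command-line options. The sketch below builds the argument list for each workload in Python. It is an illustration rather than the exact commands used in this test: the host, user, database name, run duration, and report interval are assumptions (port 4000 is TiDB's default SQL port), and `sysbench_cmd` and `total_rows` are hypothetical helper names.

```python
import shlex

# Workload names as used by sysbench 1.x.
WORKLOADS = ["oltp_read_write", "oltp_read_only", "oltp_write_only"]


def sysbench_cmd(workload, action, host="127.0.0.1", port=4000,
                 user="root", db="sbtest", threads=100, tables=16,
                 table_size=10_000_000, time_s=300):
    """Build a sysbench argument list.

    threads, tables, and table_size follow the workload list above;
    host, user, db, and time_s are placeholder assumptions.
    action is one of "prepare", "run", or "cleanup".
    """
    args = [
        "sysbench", workload,
        "--db-driver=mysql",
        f"--mysql-host={host}",
        f"--mysql-port={port}",
        f"--mysql-user={user}",
        f"--mysql-db={db}",
        f"--threads={threads}",
        f"--tables={tables}",
        f"--table-size={table_size}",
    ]
    if action == "run":
        # Duration and reporting interval are assumed values.
        args += [f"--time={time_s}", "--report-interval=10"]
    args.append(action)
    return args


def total_rows(tables=16, table_size=10_000_000):
    """Total number of rows loaded across all tables."""
    return tables * table_size


for workload in WORKLOADS:
    print(shlex.join(sysbench_cmd(workload, "run")))

# Rough average row footprint implied by ~40 GB of data.
print(f"{total_rows():,} rows, ~{40e9 / total_rows():.0f} bytes per row")
```

The same helper can produce the data-loading step by passing `"prepare"` as the action before any of the `"run"` invocations.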
Benchmark results
The following table summarizes the performance (TPS), cost, and price-performance for GCP Tau and AWS Graviton. The price-performance ratio is written as Throughput/Cost, which is TPS/Cost in this case.
AWS Graviton showed 21.7% higher TPS in the oltp_read_only workload and 6.04% higher TPS in the oltp_read_write workload, while the GCP Tau VM showed 15.09% higher TPS in the oltp_write_only workload.
The total system cost reflects the estimated five-year hardware cost.
The TPS value is the average measured with 100 threads.
Price-performance compares GCP Tau and AWS Graviton2 Arm processors. A lower number is better. It indicates a lower cost for more performance.
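The arithmetic behind a price-performance figure can be sketched as follows. This is a hedged illustration, not the calculation used to produce the published numbers: the hourly costs and TPS values are made-up placeholders, the five-year cost is assumed to be the hourly cost multiplied by the hours in five years, and all function names are hypothetical. The sketch computes both forms of the ratio, throughput per dollar (higher is better) and dollars per unit of throughput (lower is better), since they rank systems identically but in opposite directions.

```python
HOURS_PER_YEAR = 24 * 365


def five_year_cost(hourly_cost_usd):
    """Assumed estimate: hourly cost multiplied by the hours in five years."""
    return hourly_cost_usd * HOURS_PER_YEAR * 5


def cost_per_tps(total_cost_usd, tps):
    """Dollars per unit of throughput; lower is better."""
    return total_cost_usd / tps


def tps_per_dollar(total_cost_usd, tps):
    """Throughput per dollar; higher is better."""
    return tps / total_cost_usd


def relative_advantage_pct(lower, higher):
    """Percentage by which `lower` undercuts `higher` (one possible definition)."""
    return (higher - lower) / higher * 100


# Placeholder inputs, NOT the measured results from this benchmark.
systems = {
    "system A": {"hourly_usd": 4.0, "tps": 5000},
    "system B": {"hourly_usd": 4.0, "tps": 4000},
}

ratios = {}
for name, s in systems.items():
    total = five_year_cost(s["hourly_usd"])
    ratios[name] = cost_per_tps(total, s["tps"])
    print(f"{name}: ${total:,.0f} over five years, "
          f"${ratios[name]:.2f} per TPS, "
          f"{tps_per_dollar(total, s['tps']):.4f} TPS per dollar")

adv = relative_advantage_pct(ratios["system A"], ratios["system B"])
print(f"system A costs {adv:.1f}% less per TPS than system B")
```

When the two systems cost about the same per hour, as the clusters in this test do, the price-performance gap tracks the raw TPS gap closely.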
Price-performance Comparison
Sysbench Price-performance Ratio
In the read-only workload, AWS Graviton showed better performance. After factoring in the compute resource cost, AWS Graviton outperforms the GCP Tau VM by 16.37% in the oltp_read_only workload. In the write-only workload, the GCP Tau VM shows better performance, outperforming AWS Graviton by 16.91% in oltp_write_only. In this test, the combination of GCP Tau T2A VM and SSD Persistent Disk provided better write I/O performance than AWS Graviton2 with GP3 in the write-only workload, while the AWS memory-optimized instances (r6g.2xlarge) provided better read performance in the read-only workload. The mixed workload ended in a draw.
Conclusion
Benchmarking results from sysbench show that AWS Graviton outperforms the GCP Tau VM for TiDB by 16.37% in the oltp_read_only workload. However, the GCP Tau VM outperforms AWS Graviton by 16.91% in the oltp_write_only workload. Both provide similar performance in the oltp_read_write workload.
One limitation of our testing is that it did not use the latest AWS Graviton3; the price-performance ratio may improve further on Graviton3. In addition, GCP Tau T2A is still in preview. We will post further results when updated versions are available.
We will also run a benchmark on GCP x86 and compare the price-performance ratio between Google Cloud Arm64 and GCP x86. Stay tuned!
Keep reading:
How I Found a Go Issue on ARM that Crashed the Database Server
TiDB on Arm-based Kubernetes Cluster Achieves Up to 25% Better Price-Performance Ratio than x86