TVK S3 FUSE Plugin performance
This page describes how to measure the overhead of TVK's S3 plugin.

Measuring TrilioVault S3 FUSE Plugin overhead

All else being equal, we need a way to measure the overhead of TVK's FUSE implementation. Because TVK is a software-only solution, we cannot state the FUSE implementation overhead in absolute numbers. The overhead depends on various factors, so we need to establish a baseline first.
In our experiment, we provisioned two identical servers, both running CentOS, named compute1 and compute2. A dedicated 1 GB network connects the two servers to minimize interference from other traffic. compute1 is the client and compute2 is the server. The server in our context runs an S3 endpoint and an NFS endpoint. MinIO, perhaps the easiest S3 implementation to use, can be launched with a single command, so we use MinIO as our object store.
We provision two directories on the same disk on compute2: /mnt/nfs_share is exported as the NFS share, and /mnt/miniodata is used by the MinIO service to store all objects. compute1 mounts /mnt/nfs_share and also runs the S3 FUSE plugin, which provides the mount point /mnt/miniomnt. Our objective is to measure the S3 FUSE overhead relative to NFS and to the AWS S3 API.
As part of the experiment, we create a 100 GB file. 100 GB is large enough to amortize the fixed costs associated with each protocol. Moreover, most backup images tend to be large, so 100 GB is a representative sample size for this experiment.
First, we copy the 100 GB file to the NFS share. The throughput we achieved is about 2 GB/min.
(mypython) [[email protected] mnt]$ time cp 100GB /mnt/nfs_shares/

real 50m37.017s
user 0m0.462s
sys 2m54.617s
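The throughput figure follows directly from the `real` value reported by `time`. The sketch below shows that arithmetic; the helper names `parse_real_time` and `throughput_gb_per_min` are illustrative and not part of TVK or any tool used in the experiment.

```python
import re


def parse_real_time(real: str) -> float:
    """Convert a `time` duration such as '50m37.017s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {real!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def throughput_gb_per_min(size_gb: float, real: str) -> float:
    """Average throughput in GB per minute for a transfer of size_gb."""
    return size_gb / (parse_real_time(real) / 60)


# 'real' value reported for the NFS copy of the 100 GB file
print(f"{throughput_gb_per_min(100, '50m37.017s'):.2f} GB/min")
```

For the NFS run this comes out just under 2 GB/min, which matches the rounded figure quoted in the text.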
When measuring AWS S3 performance, we should avoid multipart upload. Multipart upload performs well for large files, but the TVK FUSE plugin chunks files itself and manages those chunks as individual objects, so a multipart upload would not reflect how the plugin talks to the object store. We therefore split the 100 GB file into 32 MB chunks using the Linux split command and place the individual segments in the segments directory. We then upload the segments directory using the aws s3 cp command as shown below. The throughput we achieved here is 1.42 GB/min, roughly 30% lower than the NFS result.
(mypython) [[email protected] mnt]$ time aws s3 cp --recursive segments/ s3://trilio/segments

real 70m24.839s
user 14m15.828s
sys 9m8.994s
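The chunking step that precedes the upload can be sketched as follows. The experiment used the Linux split command; this Python version is only an illustration of the same idea. The function name `split_file` and the `segment_NNNNNN` naming scheme are assumptions made for this example (split itself names its output xaa, xab, and so on by default).

```python
from pathlib import Path

CHUNK_SIZE = 32 * 1024 * 1024  # 32 MB, the segment size used in the text


def split_file(source: Path, out_dir: Path, chunk_size: int = CHUNK_SIZE) -> list[Path]:
    """Write consecutive chunk_size pieces of source into out_dir and return their paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    segments = []
    with source.open("rb") as src:
        index = 0
        while True:
            data = src.read(chunk_size)
            if not data:
                break  # end of file reached
            segment = out_dir / f"segment_{index:06d}"  # hypothetical naming scheme
            segment.write_bytes(data)
            segments.append(segment)
            index += 1
    return segments
```

Calling `split_file(Path("100GB"), Path("segments"))` would produce a directory of 32 MB objects, with a shorter final segment if the file size is not a multiple of the chunk size. Uploading that directory exercises the object store with many fixed-size objects, which is closer to what the FUSE plugin does than a single multipart upload.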
Next, we copy the 100 GB file to the S3 FUSE mount. The wall-clock time is within about 1% of the aws s3 cp run, so the throughput is very similar. We can therefore conclude that the S3 FUSE implementation adds little to no overhead compared to using the AWS S3 API directly.
(mypython) [[email protected] mnt]$ time cp 100GB /mnt/miniomnt/

real 71m16.633s
user 0m2.131s
sys 2m16.475s
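To put a number on "very similar", the wall-clock times of the three copy runs can be compared directly. This is a small worked calculation using only the `real` values shown in the transcripts; the function name `extra_time` is illustrative. It measures extra wall-clock time rather than the drop in throughput, so the NFS comparison comes out higher than a throughput-based percentage would.

```python
# Wall-clock ('real') times from the three copy runs, in seconds
nfs_cp = 50 * 60 + 37.017      # cp to the NFS share
aws_s3_cp = 70 * 60 + 24.839   # aws s3 cp of the 32 MB segments
fuse_cp = 71 * 60 + 16.633     # cp to the S3 FUSE mount


def extra_time(baseline_s: float, measured_s: float) -> float:
    """Fraction of additional wall-clock time relative to the baseline."""
    return (measured_s - baseline_s) / baseline_s


print(f"aws s3 cp vs NFS:     {extra_time(nfs_cp, aws_s3_cp):+.1%}")
print(f"FUSE cp vs aws s3 cp: {extra_time(aws_s3_cp, fuse_cp):+.1%}")
```

The second line is the one that matters for the FUSE plugin: it isolates the cost of going through the FUSE mount from the cost of the S3 protocol itself.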
TrilioVault backup images are in QCOW2 format, and creating a QCOW2 image differs slightly from copying a file to the S3 FUSE mount. It is therefore prudent to measure qemu-img convert performance on the S3 FUSE mount as well. The throughput for generating the QCOW2 image is almost the same as for copying a file to the S3 FUSE mount (about 2% more wall-clock time).
(mypython) [[email protected] mnt]$ time qemu-img convert -p -O qcow2 100GB /home/kolla/miniomnt/100GB.qcow2
(100.00/100%)

real 72m36.022s
user 0m15.282s
sys 3m14.472s
In conclusion, S3 API performance is not an absolute figure: it varies from one object store implementation to another. It depends on various factors, including the number and type of disks, raw processing power, replication factor, consistency model, and so on. The TrilioVault S3 FUSE implementation adds little to no overhead compared to direct AWS S3 API calls. As a result, TrilioVault performs backups at close to the speed the underlying S3 endpoint can sustain.