It is recommended to think about the following elements prior to the installation of Trilio for Openstack.
Trilio uses Cinder snapshots for calculating full and incremental backups. For full backups, Trilio creates Cinder snapshots for all the volumes in the backup job. It then leaves these Cinder snapshots behind for calculating the incremental backup image during next backup. During an incremental backup operation it creates new Cinder snapshots, calculates the changed blocks between the new snapshots and the old snapshots that were left behind during full/previous backups. It then deletes the old snapshots but leaves the newly created snapshots behind. So, it is important that each tenant who is availing Trilio backup functionality has sufficient Cinder snapshot quotas to accommodate these additional snapshots. The guideline is to add 2 snapshots for every volume that is added to backups to volume snapshot quotas for that tenant. You may also increase the volume quotas for the tenant by the same amount because Trilio briefly creates a volume from snapshot to read data from the snapshot for backup purposes. During a restore process, Trilio creates additional instances and Cinder volumes. To accommodate restore operations, a tenant should have sufficient quota for Nova instances and Cinder volumes. Otherwise restore operations will result in failures.
AWS S3 object consistency model includes:
Read-after-write
Read-after-update
Read-after-delete
Each of them describes how an object will reach its consistent state after an object is created/updated or deleted. None of them provides strong consistency and there is a lag time for an object to reach the consistent state. Though Trilio employed mechanisms to work around the limitations of eventual consistency of AWS S3, when an object reach its consistency state is not deterministic. There is no official statement from AWS on how long it takes for an object to reach consistent state. However read-after-write has a shorter time to reach consistency compared to other IO patterns. Our solution is designed to maximize read-after-write IO pattern. The time in which an object reaches eventual consistency also depends on the AWS region. For example, aws-standard region does not have strong consistency model compared to us-east or us-west. We suggest to use these regions when creating s3 buckets for Trilio. Though read-after-update IO pattern is hard to avoid completely, we employed ample delays in accessing objects to accommodate larger durations for objects to get into consistent state. However in rare occasions, backups may still fail and need to restarted.
Trilio can be deployed as a single node or a three node cluster. It is highly recommended that Trilio is deployed as three node cluster for fault tolerance and load balancing. Starting with 3.0 release, Trilio requires additional IP for cluster and is required for both single node and three node deployments. Cluster ip a.k.a virtual ip is used for managing cluster and is used to register Trilio service endpoint in the keystone sevice catalog.