AWS RDS snapshots using T4K hooks

How to back up and restore AWS RDS using hooks

1. Backup/Restore of Application with external DB (AWS RDS)

Objective - Support backup of external databases (AWS RDS, Google Cloud SQL to begin with)

Summary - This use case supports the backup and restore of an external cloud database that is configured for applications running on a k8s cluster.

Prerequisites -

  1. Applications on the k8s cluster are configured to use a managed cloud database service

  2. All the details required for cloud database connectivity are available, such as the database instance name and credentials

Assumptions -

  1. Applications on the k8s cluster are configured with a specific database instance

  2. The user provides all the information required for the database backup (on-demand)

  3. All the requirements for the database backup are provided, such as user access for backup, zones/regions allowed for backups, allotted space, database instance name/identifier, location, etc.

  4. No additional mapping is required after the database is restored, i.e. the application should be able to connect to the database using the same credentials, instance name, etc.

Workflow -

  1. User provides inputs for:

     a. List of applications configured to use the external database

     b. The database instance name/identifier

     c. User credentials for backup

     d. Zone/region allowed for backup

     e. Location for backup

  2. A “jump” pod is needed that has the cloud provider and kubectl tools and libraries required to connect to remote cloud databases for backup/restore. This “jump” pod is created in the same namespace where the application to be backed up is running.

  3. As per the policy created for the application backup, the application backup is triggered.

  4. Along with the application backup, an on-demand backup/snapshot of the external database is created using API calls. This is achieved with backup hooks.

  5. The snapshot identifier of the cloud database is constructed from the BackupPlan and Backup names. This identifier is unique, and the user always has access to these details. It also maintains a link between the T4K backup and the database snapshot.

  6. For restore, user provides inputs for:

     a. BackupPlan and Backup name for restore

     b. The database instance name for restore

     c. User credentials for restore

     d. Zone/region allowed for restore

     e. VPC Security Group

  7. At the time of restore of the application, the backup/snapshot will be restored using the name/identifier. This will be achieved by Restore hooks.

  8. Post restore, the application connectivity with the external database will be established based on the database instance name/identifier and the user provided credentials.
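The identifier construction in step 5 can be sketched as a small shell helper. This is a minimal sketch, not the hook's actual script: the function name `snapshot_id` is hypothetical, the `tvk-backupName-backupPlanName` form is the one used by the backup hook in the example further down, and the validation reflects the RDS rules for snapshot identifiers (letters, digits, and hyphens only, no trailing hyphen, no two consecutive hyphens, at most 255 characters).

```shell
#!/bin/sh
# Hypothetical helper: build the RDS snapshot identifier from the T4K
# Backup and BackupPlan names, in the form tvk-<backupName>-<backupPlanName>.
snapshot_id() {
  backup_name=$1
  backupplan_name=$2
  id="tvk-${backup_name}-${backupplan_name}"

  # Reject identifiers that RDS would not accept.
  case $id in
    *[!A-Za-z0-9-]*|*--*|*-)
      echo "invalid snapshot identifier: $id" >&2
      return 1
      ;;
  esac
  if [ "${#id}" -gt 255 ]; then
    echo "snapshot identifier too long: $id" >&2
    return 1
  fi

  printf '%s\n' "$id"
}

# Example: snapshot_id demo-backup wp-hook-bkpplan
# prints tvk-demo-backup-wp-hook-bkpplan
```

Because Kubernetes resource names may contain dots, which RDS identifiers do not allow, a helper like this would reject such names rather than silently produce an identifier that the snapshot call refuses.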

Example with steps followed -

In AWS RDS, create a database instance using MySQL Engine. A free tier instance can be selected. Create a new VPC Security Group. Update inbound and outbound rules to allow access.

Once the instance is created, note down the Username, Password and Endpoint, which are needed to connect to the DB instance. Also, create a database for the WordPress application in the database instance.

Deploy the WordPress application configured with the AWS RDS database (replace the host URL, admin password, and DB name).

Check in the AWS RDS database that the WordPress tables are populated.

Backup hooks are used to create a snapshot of the AWS RDS database. To create the RDS database snapshot, a pod with the AWS CLI is needed in the namespace. Using the AWS CLI, we can connect to RDS to create a snapshot.

Create the “jump” pod in the same namespace where WordPress is installed. Use the script below. It uses an image which has kubectl and the AWS CLI installed. We need to pass the AWS credentials and kubeconfig to this image. AWS credentials and region are configured using “aws configure”.

Create a backup hook as per the YAML below using the UI or CLI. The “post” command of the hook contains a script which identifies the running backup and its corresponding BackupPlan, then creates a snapshot of the DB instance (--db-instance-identifier to be provided by the user) with the snapshot identifier “tvk-backupName-backupPlanName”. Please replace <DB Instance Identifier> below with the actual value.
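The snapshot step that such a “post” command performs can be sketched as follows. This is a minimal sketch under stated assumptions, not the hook's actual script: the function name `create_rds_snapshot` is hypothetical, and the backup and BackupPlan names are passed in as arguments rather than discovered from the cluster. It relies only on the `aws rds create-db-snapshot` and `aws rds wait db-snapshot-available` commands and assumes the jump pod already has credentials and a region configured.

```shell
#!/bin/sh
# Hypothetical helper for the backup hook: snapshot an RDS instance and
# name the snapshot after the T4K Backup and BackupPlan.
# Arguments: <DB instance identifier> <backup name> <BackupPlan name>
create_rds_snapshot() {
  db_instance_id=$1
  backup_name=$2
  backupplan_name=$3
  snap_id="tvk-${backup_name}-${backupplan_name}"

  # Start the manual snapshot of the DB instance.
  aws rds create-db-snapshot \
    --db-instance-identifier "$db_instance_id" \
    --db-snapshot-identifier "$snap_id" || return 1

  # Block until the snapshot is usable, so the hook only reports
  # success once the snapshot actually exists.
  aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$snap_id"
}

# Example (values are placeholders):
# create_rds_snapshot "<DB Instance Identifier>" demo-backup wp-hook-bkpplan
```

Waiting for the snapshot inside the hook keeps the T4K backup and the RDS snapshot in step, at the cost of a longer hook run.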

Create a BackupPlan using the UI or CLI. Please specify the jump pod created earlier in the pod selector for hook execution.

Create a backup of the namespace where the WordPress application is installed, using the CLI or UI. This will also create a snapshot of the RDS database instance.

As part of the backup, a snapshot of the RDS database is taken. This can be checked using the command below with a query.
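One way to run such a check is sketched here. The function name `snapshot_status` is hypothetical, and the `--query` expression is only one possible choice; the sketch uses `aws rds describe-db-snapshots` with a JMESPath query to print the status of a single snapshot.

```shell
#!/bin/sh
# Hypothetical helper: print the status of one RDS snapshot,
# for example "creating" or "available".
# Argument: <snapshot identifier>
snapshot_status() {
  aws rds describe-db-snapshots \
    --db-snapshot-identifier "$1" \
    --query 'DBSnapshots[0].Status' \
    --output text
}

# Example (identifier follows the tvk-backupName-backupPlanName form):
# snapshot_status tvk-demo-backup-wp-hook-bkpplan
```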

For restore, create a restore hook as per the YAML file given below. The hook needs the following user inputs and settings:

  1. Database Instance name for restore (DB_INSTANCE_ID=<DB Instance Identifier>)

  2. VPC Security group to be used (VPC_SEC_GROUP=<VPC-SEC-GROUP>)

  3. Name of the backup (BACKUP_NAME="demo-backup")

  4. Name of the Backup Plan (BACKUPPLAN_NAME="wp-hook-bkpplan")

  5. Based on the backup and Backup Plan names, the snapshot ID is formed by concatenating these two strings, in the same “tvk-backupName-backupPlanName” form used by the backup hook.

  6. Also note that additional maxRetryCount and timeoutSeconds values are set for the “post” action, as it may take more time.

Replace "DB Instance Identifier" and "VPC Sec Group" below with the actual values.
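The restore step that such a hook performs can be sketched as below. This is a minimal sketch, not the hook's actual script: the function name `restore_rds_from_snapshot` is hypothetical, and it assumes the snapshot was created with the “tvk-backupName-backupPlanName” identifier. It uses `aws rds restore-db-instance-from-db-snapshot` followed by `aws rds wait db-instance-available`.

```shell
#!/bin/sh
# Hypothetical helper for the restore hook: recreate an RDS instance
# from the snapshot linked to a T4K backup.
# Arguments: <DB instance identifier> <VPC security group id>
#            <backup name> <BackupPlan name>
restore_rds_from_snapshot() {
  db_instance_id=$1
  vpc_sec_group=$2
  backup_name=$3
  backupplan_name=$4
  snap_id="tvk-${backup_name}-${backupplan_name}"

  # Create a new DB instance from the snapshot.
  aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier "$db_instance_id" \
    --db-snapshot-identifier "$snap_id" \
    --vpc-security-group-ids "$vpc_sec_group" || return 1

  # Provisioning an instance can take several minutes, which is why
  # the hook is given extra retries and a longer timeout.
  aws rds wait db-instance-available \
    --db-instance-identifier "$db_instance_id"
}

# Example (values are placeholders):
# restore_rds_from_snapshot "<DB Instance Identifier>" "<VPC-SEC-GROUP>" \
#   demo-backup wp-hook-bkpplan
```

Note that RDS rejects the restore if an instance with the same identifier already exists, so restoring under the original name presupposes that the original instance is gone.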

If we are restoring the RDS database with the same name, no changes are needed in the restored application that uses the RDS database, as the RDS database endpoint and credentials do not change. However, if a different RDS database name is used for the restore, the restored application that uses the RDS database needs to be updated accordingly.

Create a restore using CLI or UI.

As part of the restore, the RDS database instance is created from the snapshot. The WordPress application is up and running in the restore namespace.

This completes the backup and restore of a stateful application which uses an external database such as AWS RDS, using Trilio for Kubernetes.

2. AWS RDS snapshot cleanup upon backup deletion

Objective - Provide a tool for deleting snapshots of external databases (AWS RDS, Google Cloud SQL to begin with)

Summary - This use case provides a tool to delete the snapshot of an external cloud database when the linked T4K backup is deleted.

Prerequisites -

  1. Applications on the k8s cluster are configured to use a managed cloud database service

  2. All the details required for cloud database connectivity are available, such as the database instance name and credentials

  3. A k8s cluster with T4K installed and a target configured. Based on the backups on the target, the snapshots will be retained/deleted.

Assumptions -

  1. Applications on the k8s cluster are configured with a specific database instance

  2. The user provides all the information required for database snapshot deletion (on-demand)

  3. All the requirements for database snapshot deletion are provided, such as user access for backup, zones/regions allowed for backups, allotted space, database instance name/identifier, location, etc.

  4. The T4K backup target configured on the k8s cluster is used as the reference when comparing the available backups against the snapshots of the cloud database instances.

Workflow -

  1. User provides inputs for:

     a. List of applications configured to use the external database

     b. The database instance name/identifier

     c. User credentials for backup, snapshot deletion

     d. Zone/region allowed for backup, snapshot deletion

     e. Location for backup

  2. The user inputs/credentials are stored in a configmap or secret

  3. T4K is installed and a backup target is configured.

  4. A cronjob is created using an image which has access to AWS RDS and the k8s cluster. It has the kubectl and AWS CLI tools and libraries installed.

  5. The cronjob will periodically check the backups available on the backup target and compare them with the snapshots of the external cloud database. If a backup has been deleted from the backup target, the cronjob will delete the corresponding snapshot from the external cloud database.

  6. The cronjob is launched "on-demand" by the user.
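The comparison in step 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the script shipped in the configmap: the function names `orphaned_snapshots` and `cleanup_rds_snapshots` are hypothetical, and the list of snapshot identifiers that should still exist (one “tvk-backupName-backupPlanName” per line, derived from the backups on the target) is assumed to be produced elsewhere and passed in. Only snapshots with the `tvk-` prefix are considered, so unrelated manual snapshots are left alone.

```shell
#!/bin/sh
# Hypothetical helper: print the snapshot identifiers that no longer
# have a matching backup.
# $1: existing snapshot identifiers, one per line
# $2: identifiers expected from current T4K backups, one per line
orphaned_snapshots() {
  printf '%s\n' "$1" | while IFS= read -r id; do
    case $id in
      tvk-*) ;;        # created by the backup hook, candidate for cleanup
      *) continue ;;   # not ours, never touch
    esac
    # Keep the snapshot if its identifier is still expected.
    printf '%s\n' "$2" | grep -Fxq -- "$id" || printf '%s\n' "$id"
  done
}

# Hypothetical helper: delete orphaned tvk-* snapshots of one instance.
# $1: DB instance identifier
# $2: identifiers expected from current T4K backups, one per line
cleanup_rds_snapshots() {
  existing=$(aws rds describe-db-snapshots \
    --db-instance-identifier "$1" \
    --snapshot-type manual \
    --query 'DBSnapshots[].DBSnapshotIdentifier' \
    --output text | tr '\t' '\n')

  orphaned_snapshots "$existing" "$2" | while IFS= read -r id; do
    aws rds delete-db-snapshot --db-snapshot-identifier "$id" </dev/null
  done
}
```

With this logic an empty expected list means every `tvk-` snapshot of the instance is treated as orphaned, so a real script would likely want to abort when the backup list cannot be read rather than proceed with an empty one.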

Example with steps followed -

Create a configmap with a script to check backups in a user-specified namespace. Refer to the YAML below. Please replace <DB Instance Identifier> with the actual value.

Create a cronjob to run the script on a schedule. It uses an image which has kubectl and the AWS CLI installed. We need to pass the AWS credentials and kubeconfig to this image. AWS credentials and region are configured using “aws configure”. The schedule can be changed according to how often the snapshots need to be cleaned up. Refer to the YAML below.

Steps and output of the commands
