Optimize T4K Backups with StormForge

This article explains how to install and configure StormForge and how to use it to find the optimized resource configuration for running Trilio for Kubernetes (T4K) backups.

Introduction

Today, many enterprises running Kubernetes applications have no established way to monitor and optimize the resources used by those applications.

StormForge is a platform that automates resource tuning to deliver the best Kubernetes application performance at the lowest possible cost. Users run an experiment made up of a user-specified number of trials. Each trial runs against an application to find the optimized resource configuration required to perform a given operation. StormForge uses machine learning algorithms to pick the combination of CPU, RAM, and other resources for each trial.

Install and configure StormForge with T4K

Follow the instructions below to install and configure StormForge with Trilio for Kubernetes and run an experiment to find the optimized resource configuration of T4K.

Install and Configure StormForge redsky-controller-manager on the K8s Cluster

1. Download and install the redskyctl CLI

curl -L https://app.stormforge.io/downloads/redskyctl-linux-amd64.tar.gz | tar xz
sudo mv redskyctl /usr/local/bin/
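
Before moving on, it can help to confirm that the CLIs used in the remaining steps are on the PATH. The following is a minimal sketch; require_cmd is a hypothetical helper name introduced here for illustration and is not part of redskyctl or kubectl.

```shell
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print a message to stderr and return non-zero.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required command: $1" >&2
  return 1
}

# Example usage (not executed here):
#   require_cmd redskyctl && require_cmd kubectl
```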

2. Authorize the Linux server from which redskyctl will be invoked to run the experiment

redskyctl login --force

Open the URL generated by the above command in a browser to authorize the Linux server, for example:

https://auth.carbonrelay.io/authorize?audience=https%3A%2F%2Fapi.carbonrelay.io%2Fv1%2F&client_id=pE3kMKdrMTdW4DOxQHesyAuFGNOWaEke&code_challenge=nUaqSbjv0grweFAMrv_Zk9KBg_8o-4AQAnwxpijJlGk&code_challenge_method=S256&redirect_uri=http%3A%2F%2F127.0.0.1%3A8085%2F&response_type=code&scope=register%3Aclients+offline_access&state=nHGW89zTYO39loEALGTsig

3. Verify that you are connected to the Kubernetes cluster where you want to run the experiment

kubectl get nodes
NAME                                          STATUS   ROLES               AGE    VERSION
ip-mytesthost-w1.us-east-2.compute.internal   Ready    worker              4d5h   v1.19.9
ip-mytesthost-w2.us-east-2.compute.internal   Ready    worker              4d5h   v1.19.9
ip-mytesthost-m1.us-east-2.compute.internal   Ready    controlplane,etcd   4d5h   v1.19.9
ip-mytesthost-m2.us-east-2.compute.internal   Ready    controlplane,etcd   4d5h   v1.19.9
ip-mytesthost-w3.us-east-2.compute.internal   Ready    worker              4d5h   v1.19.9
ip-mytesthost-m3.us-east-2.compute.internal   Ready    controlplane,etcd   4d5h   v1.19.9
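
The node check above can also be scripted. This is a sketch under the assumption that the input is the output of `kubectl get nodes --no-headers`, with the node name in column 1 and STATUS in column 2; check_nodes_ready is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin.
# Prints every node whose STATUS is not exactly "Ready" and returns non-zero
# if any such node exists or if no nodes were listed at all.
# Note: a cordoned node ("Ready,SchedulingDisabled") is reported as not ready.
check_nodes_ready() {
  awk 'NF { seen = 1; if ($2 != "Ready") { print $1; bad = 1 } }
       END { exit (bad || !seen) ? 1 : 0 }'
}

# Example usage (not executed here):
#   kubectl get nodes --no-headers | check_nodes_ready
```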

4. Initialize the redsky controller manager pod on the Kubernetes cluster

redskyctl init
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/experiments.redskyops.dev configured
customresourcedefinition.apiextensions.k8s.io/trials.redskyops.dev configured
clusterrole.rbac.authorization.k8s.io/redsky-manager-role configured
clusterrolebinding.rbac.authorization.k8s.io/redsky-manager-rolebinding unchanged
namespace/redsky-system unchanged
deployment.apps/redsky-controller-manager unchanged
clusterrole.rbac.authorization.k8s.io/redsky-patching-role unchanged
clusterrolebinding.rbac.authorization.k8s.io/redsky-patching-rolebinding unchanged
secret/redsky-manager configured

5. Verify that the redsky-controller-manager is running on the K8s cluster

kubectl get pod -n redsky-system
NAME                                         READY   STATUS    RESTARTS   AGE
redsky-controller-manager-6cbb796b79-5wnz8   1/1     Running   1          4d4h
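
The same pod check can be automated. The sketch below assumes the input is `kubectl get pod --no-headers` output, with READY in column 2 (for example "1/1") and STATUS in column 3; check_pods_running is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get pod --no-headers` output on stdin.
# Prints every pod that is not Running or whose ready count differs from its
# container count, and returns non-zero if any such pod exists or no pods
# were listed.
check_pods_running() {
  awk 'NF {
         seen = 1
         split($2, r, "/")            # READY column, e.g. "2/2"
         if ($3 != "Running" || r[1] != r[2]) { print $1; bad = 1 }
       }
       END { exit (bad || !seen) ? 1 : 0 }'
}

# Example usage (not executed here):
#   kubectl get pod -n redsky-system --no-headers | check_pods_running
```

As an alternative, `kubectl rollout status deployment/redsky-controller-manager -n redsky-system` waits until the deployment has finished rolling out.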

6. Authorize the Kubernetes cluster where the application is present and where the experiment will run

redskyctl authorize-cluster
secret/redsky-manager configured
deployment.apps/redsky-controller-manager patched

Install Demo Application and Configure T4K Resources

1. An application must be running in the namespace; the StormForge experiment will back up this application

kubectl get pods | grep k8s-demo-app
k8s-demo-app-frontend-7c4bdbf9b-bbz2z                            1/1     Running     1          4d4h
k8s-demo-app-frontend-7c4bdbf9b-c84m7                            1/1     Running     1          4d4h
k8s-demo-app-frontend-7c4bdbf9b-z8jhx                            1/1     Running     1          4d4h
k8s-demo-app-mysql-754f46dbd7-v85mh                              1/1     Running     2          4d4h

2. Trilio for Kubernetes must be installed on the K8s cluster

kubectl get pods | grep k8s-triliovault
k8s-triliovault-admission-webhook-7c5b454ff4-85t92               1/1     Running     1          4d3h
k8s-triliovault-control-plane-54594f5796-vpz4r                   2/2     Running     2          4d3h
k8s-triliovault-exporter-86d94c9967-bx56s                        1/1     Running     1          4d3h
k8s-triliovault-ingress-gateway-6bdfc75c9b-l5qvb                 1/1     Running     1          4d3h
k8s-triliovault-web-76d84fdcc5-lj52v                             1/1     Running     1          4d3h
k8s-triliovault-web-backend-7fb8784588-8v865                     1/1     Running     1          4d3h
triliovault-operator-k8s-triliovault-operator-66b6d9d96f-k8fjs   1/1     Running     1          4d4h

3. Verify that a T4K license is applied so that backups can be initiated

kubectl get license
NAME             STATUS   MESSAGE                                   CURRENT NODE COUNT   GRACE PERIOD END TIME   EDITION   CAPACITY   EXPIRATION TIME
test-license-1   Active   Cluster License Activated successfully.   9                                            Basic     10         2022-04-30T00:0

4. Make sure that the Target used to store the backup is configured and in the Available state

kubectl get target
NAME             TYPE          THRESHOLD CAPACITY   VENDOR   STATUS      BROWSING ENABLED
demo-s3-target   ObjectStore   100Gi                AWS      Available   

5. Create a Backup plan for the application or namespace

kubectl apply -f app/mysql-sample-backupplan.yaml
kubectl apply -f app/mysql-sample-backupplan-1.yaml

6. Verify that the backup plans are created correctly

kubectl get backupplan
NAME                       TARGET           RETENTION POLICY   INCREMENTAL SCHEDULE   FULL BACKUP SCHEDULE   STATUS
mysql-label-backupplan     demo-s3-target                                                                    Available
mysql-label-backupplan-1   demo-s3-target                                                                    Available
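
The license, target, and backup plan checks in this section all come down to finding an expected status word in the row for a named resource. A minimal sketch, assuming the default table output of `kubectl get` is piped in; has_status is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get <kind>` table output on stdin and
# succeeds only if the row whose first column equals NAME contains a column
# that is exactly the expected status word.
# Usage: has_status NAME STATUS
has_status() {
  awk -v name="$1" -v want="$2" '
    $1 == name { for (i = 2; i <= NF; i++) if ($i == want) ok = 1 }
    END { exit ok ? 0 : 1 }'
}

# Example usage (not executed here):
#   kubectl get license    | has_status test-license-1 Active
#   kubectl get target     | has_status demo-s3-target Available
#   kubectl get backupplan | has_status mysql-label-backupplan Available
```

Matching on a whole column rather than a substring means "Unavailable" does not pass as "Available". The check is still text-based, so it depends on the table layout shown above.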

Configure and Monitor StormForge Experiment

1. Create ConfigMaps for the bash scripts that monitor the backups, the backup definitions, and the tvk-manager configuration

kubectl create configmap bashscript --from-file=configmaps/bash-script.sh
kubectl create configmap bashscript1 --from-file=configmaps/bash-script-1.sh

kubectl create configmap backup --from-file=configmaps/mysql-sample-backup.yaml
kubectl create configmap backup1 --from-file=configmaps/mysql-sample-backup-1.yaml

kubectl create configmap tvkmanager --from-file=configmaps/tvk-manager.yaml
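
Because `kubectl create configmap` fails when the ConfigMap already exists, re-running this step is easier with a create-or-update form. The sketch below uses the common idiom of rendering the ConfigMap with `--dry-run=client -o yaml` and piping it to `kubectl apply`; apply_configmap is a hypothetical helper name, and the idiom assumes a kubectl version that supports `--dry-run=client`.

```shell
# Hypothetical helper: create or update one ConfigMap from a file.
# Usage: apply_configmap NAME FILE
apply_configmap() {
  cm_name=$1
  cm_file=$2
  if [ ! -f "$cm_file" ]; then
    echo "missing file: $cm_file" >&2
    return 1
  fi
  # Render the ConfigMap locally, then apply it so re-runs update in place.
  kubectl create configmap "$cm_name" --from-file="$cm_file" \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Example usage (not executed here):
#   apply_configmap bashscript configmaps/bash-script.sh
#   apply_configmap backup     configmaps/mysql-sample-backup.yaml
#   apply_configmap tvkmanager configmaps/tvk-manager.yaml
```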

2. With all required entities in place, start the experiment

kubectl apply -f experiment/tvk-scale-2-backup-delete-parallel-prod.yaml

3. Monitor the experiment while it runs the backups

kubectl get experiment
NAME                                        STATUS
tvk-scale-400-backup-delete-parallel-prod   Running

4. Check the new trial started by the experiment

kubectl get trial
NAME                                            STATUS       ASSIGNMENTS                                                                                            VALUES
tvk-scale-400-backup-delete-parallel-prod-000   Setting up   deploymentMemory=512, deploymentCpu=250, metaCpu=500, metaMemory=512, dataMemory=1536, dataCpu=1200    
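
The ASSIGNMENTS column packs all of the trial's parameters into one comma-separated string. To compare trials more easily, the string can be split into one parameter per line. This is a sketch only; split_assignments is a hypothetical helper name, and the units of the values are not stated in the output, so none are assumed here.

```shell
# Hypothetical helper: turn "a=1, b=2" into one "name value" pair per line.
# Usage: split_assignments "deploymentMemory=512, deploymentCpu=250"
split_assignments() {
  printf '%s\n' "$1" \
    | tr ',' '\n' \
    | sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d' \
    | tr '=' ' '
}

# Example usage (not executed here), with the string copied from the output above:
#   split_assignments "deploymentMemory=512, deploymentCpu=250, metaCpu=500"
```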

5. Verify the jobs

kubectl get jobs
NAME                                                 COMPLETIONS   DURATION   AGE
tvk-scale-2-backup-delete-parallel-prod-001          0/1           4d3h       4d3h
tvk-scale-2-backup-delete-parallel-prod-001-create   1/1           51s        4d3h
tvk-scale-2-backup-delete-parallel-prod-001-delete   1/1           6s         4d3h
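
To spot jobs that have not finished, the COMPLETIONS column can be filtered. The sketch assumes `kubectl get jobs --no-headers` output on stdin, with COMPLETIONS in column 2 in the form "done/total"; incomplete_jobs is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get jobs --no-headers` output on stdin
# and prints the name of every job with fewer completions than required.
incomplete_jobs() {
  awk 'NF { split($2, c, "/"); if (c[1] + 0 < c[2] + 0) print $1 }'
}

# Example usage (not executed here):
#   kubectl get jobs --no-headers | incomplete_jobs
```

To block until a single job finishes, `kubectl wait --for=condition=complete job/<name> --timeout=<duration>` is a built-in alternative.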

Monitor the Experiment from StormForge UI

1. Once the experiment is in the Running state, users can view its progress in the StormForge UI using the login they configured.

2. Once the experiment run is complete, the StormForge UI shows the recommended configuration with resource details.

Delete / Remove StormForge Experiment

1. After the experiment run is complete, users can remove the experiment with the following commands

  • Delete the experiment from the Kubernetes cluster along with all other resources

kubectl delete -f experiment/tvk-scale-2-backup-delete-parallel-prod.yaml

  • Delete the experiment from the redsky tenant

redskyctl delete exp tvk-scale-2-backup-delete-parallel-prod

Note: Deleting the experiment from the redsky tenant also deletes it from the StormForge UI.

2. If users face any issues while running the experiment, they can check the logs of the redsky-controller-manager

kubectl logs redsky-controller-manager-6cbb796b79-5wnz8 -n redsky-system
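
The pod name above ends in a generated suffix that differs on every cluster, so it usually has to be looked up first. A minimal sketch, assuming `kubectl get pod -n redsky-system --no-headers` output on stdin; controller_pod is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get pod --no-headers` output on stdin
# and prints the name of the first redsky-controller-manager pod.
controller_pod() {
  awk '$1 ~ /^redsky-controller-manager-/ { print $1; exit }'
}

# Example usage (not executed here):
#   pod=$(kubectl get pod -n redsky-system --no-headers | controller_pod)
#   kubectl logs "$pod" -n redsky-system
```

Alternatively, `kubectl logs deployment/redsky-controller-manager -n redsky-system` addresses the deployment directly and avoids the lookup.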

Conclusion

With StormForge, users can create experiments and run them on a Kubernetes cluster against running applications to perform operations such as backup and restore. The StormForge UI presents the analysis as a graph that ranks the resource combinations used in each trial from best to worst. These optimized resource combinations can then be applied to production clusters to achieve the desired RPO/RTO.