Optimize T4K Backups with StormForge

This article helps users install and configure StormForge, which can be used to find the optimized resource configuration for running T4K backups.

Introduction

Many enterprises running applications on Kubernetes today lack a way to monitor and optimize the resources those applications consume.

StormForge is a platform that helps users automate resource tuning and deliver the best Kubernetes application performance at the lowest possible cost. StormForge lets users run an experiment with a user-specified number of trials. Each trial runs against an application to find the optimized resource configuration required to perform certain operations, and machine learning algorithms pick the combination of CPU, RAM, and other resources to test in each trial.

Install and configure StormForge with T4K

Follow the instructions below to install and configure StormForge with Trilio for Kubernetes, then run an experiment to find the optimized resource configuration for T4K.

Install and Configure the StormForge redsky-controller-manager on the K8s Cluster

  1. Download and install redskyctl CLI

curl -L https://app.stormforge.io/downloads/redskyctl-linux-amd64.tar.gz | tar x
sudo mv redskyctl /usr/local/bin/

2. Authorize the Linux server from which redskyctl will be invoked to run the experiment

redskyctl login --force

Enter the URL generated by the above command into a browser to authorize the Linux server.

https://auth.carbonrelay.io/authorize?audience=https%3A%2F%2Fapi.carbonrelay.io%2Fv1%2F&client_id=pE3kMKdrMTdW4DOxQHesyAuFGNOWaEke&code_challenge=nUaqSbjv0grweFAMrv_Zk9KBg_8o-4AQAnwxpijJlGk&code_challenge_method=S256&redirect_uri=http%3A%2F%2F127.0.0.1%3A8085%2F&response_type=code&scope=register%3Aclients+offline_access&state=nHGW89zTYO39loEALGTsig

3. Verify that you are connected to the cluster where you want to run the experiment

kubectl get nodes
NAME                                          STATUS   ROLES               AGE    VERSION
ip-mytesthost-w1.us-east-2.compute.internal   Ready    worker              4d5h   v1.19.9
ip-mytesthost-w2.us-east-2.compute.internal   Ready    worker              4d5h   v1.19.9
ip-mytesthost-m1.us-east-2.compute.internal   Ready    controlplane,etcd   4d5h   v1.19.9
ip-mytesthost-m2.us-east-2.compute.internal   Ready    controlplane,etcd   4d5h   v1.19.9
ip-mytesthost-w3.us-east-2.compute.internal   Ready    worker              4d5h   v1.19.9
ip-mytesthost-m3.us-east-2.compute.internal   Ready    controlplane,etcd   4d5h   v1.19.9

4. Initiate the redsky-controller-manager pod on the Kubernetes cluster

redskyctl init
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/experiments.redskyops.dev configured
customresourcedefinition.apiextensions.k8s.io/trials.redskyops.dev configured
clusterrole.rbac.authorization.k8s.io/redsky-manager-role configured
clusterrolebinding.rbac.authorization.k8s.io/redsky-manager-rolebinding unchanged
namespace/redsky-system unchanged
deployment.apps/redsky-controller-manager unchanged
clusterrole.rbac.authorization.k8s.io/redsky-patching-role unchanged
clusterrolebinding.rbac.authorization.k8s.io/redsky-patching-rolebinding unchanged
secret/redsky-manager configured

5. Verify that the redsky-controller-manager pod is running on the K8s cluster

kubectl get pod -n redsky-system
NAME                                         READY   STATUS    RESTARTS   AGE
redsky-controller-manager-6cbb796b79-5wnz8   1/1     Running   1          4d4h

6. Authorize the Kubernetes cluster where the experiment will run and the application is present

redskyctl authorize-cluster
secret/redsky-manager configured
deployment.apps/redsky-controller-manager patched

Install Demo Application and Configure T4K Resources

1. There must be an application running in the namespace that the StormForge experiment will back up

kubectl get pods | grep k8s-demo-app
k8s-demo-app-frontend-7c4bdbf9b-bbz2z                            1/1     Running     1          4d4h
k8s-demo-app-frontend-7c4bdbf9b-c84m7                            1/1     Running     1          4d4h
k8s-demo-app-frontend-7c4bdbf9b-z8jhx                            1/1     Running     1          4d4h
k8s-demo-app-mysql-754f46dbd7-v85mh                              1/1     Running     2          4d4h

2. It is expected that the Trilio for Kubernetes product is installed on the K8s cluster

kubectl get pods | grep k8s-triliovault
k8s-triliovault-admission-webhook-7c5b454ff4-85t92               1/1     Running     1          4d3h
k8s-triliovault-control-plane-54594f5796-vpz4r                   2/2     Running     2          4d3h
k8s-triliovault-exporter-86d94c9967-bx56s                        1/1     Running     1          4d3h
k8s-triliovault-ingress-gateway-6bdfc75c9b-l5qvb                 1/1     Running     1          4d3h
k8s-triliovault-web-76d84fdcc5-lj52v                             1/1     Running     1          4d3h
k8s-triliovault-web-backend-7fb8784588-8v865                     1/1     Running     1          4d3h
triliovault-operator-k8s-triliovault-operator-66b6d9d96f-k8fjs   1/1     Running     1          4d4h

3. Verify that a T4K license is applied so the backup can be initiated

kubectl get license
NAME             STATUS   MESSAGE                                   CURRENT NODE COUNT   GRACE PERIOD END TIME   EDITION   CAPACITY   EXPIRATION TIME
test-license-1   Active   Cluster License Activated successfully.   9                                            Basic     10         2022-04-30T00:0

4. Make sure that the Target used to store the backups is configured and in the Available state (a sketch of such a target manifest follows the output below)

kubectl get target
NAME             TYPE          THRESHOLD CAPACITY   VENDOR   STATUS      BROWSING ENABLED
demo-s3-target   ObjectStore   100Gi                AWS      Available   
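
For reference, a target like demo-s3-target is typically created from a manifest along the following lines. This is only a sketch: the bucket name, region, and credentials are placeholders, and the exact field names should be verified against the T4K target documentation.

apiVersion: triliovault.trilio.io/v1
kind: Target
metadata:
  name: demo-s3-target
spec:
  type: ObjectStore
  vendor: AWS
  objectStoreCredentials:
    bucketName: my-backup-bucket   # placeholder bucket name
    region: us-east-2              # placeholder region
    accessKey: AWS_ACCESS_KEY      # placeholder credentials
    secretKey: AWS_SECRET_KEY
  thresholdCapacity: 100Gi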

5. Create a backup plan for the application or namespace (a sketch of the backup plan manifest follows the commands below)

kubectl apply -f app/mysql-sample-backupplan.yaml
kubectl apply -f app/mysql-sample-backupplan-1.yaml
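
The contents of those files are not reproduced in this article. As a rough sketch, assuming the demo application is selected by a label (the label and namespace below are assumptions, and the exact schema should be confirmed against the T4K YAML examples), mysql-sample-backupplan.yaml could look similar to:

apiVersion: triliovault.trilio.io/v1
kind: BackupPlan
metadata:
  name: mysql-label-backupplan
  namespace: default
spec:
  backupConfig:
    target:
      name: demo-s3-target        # target created in the previous step
      namespace: default
  backupPlanComponents:
    custom:
      - matchLabels:
          app: k8s-demo-app       # assumed label on the demo application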

6. Verify that the backup plans were created correctly

kubectl get backupplan
NAME                       TARGET           RETENTION POLICY   INCREMENTAL SCHEDULE   FULL BACKUP SCHEDULE   STATUS
mysql-label-backupplan     demo-s3-target                                                                    Available
mysql-label-backupplan-1   demo-s3-target                                                                    Available

Configure and Monitor StormForge Experiment

1. Users need to create ConfigMaps for the backup manifests, the bash scripts that monitor the backups, and the tvk-manager manifest (a sketch of a backup manifest follows the commands below).

kubectl create configmap bashscript --from-file=configmaps/bash-script.sh
kubectl create configmap bashscript1 --from-file=configmaps/bash-script-1.sh

kubectl create configmap backup --from-file=configmaps/mysql-sample-backup.yaml
kubectl create configmap backup1 --from-file=configmaps/mysql-sample-backup-1.yaml

kubectl create configmap tvkmanager --from-file=configmaps/tvk-manager.yaml
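
The backup manifests packaged into these ConfigMaps are what the trial job applies at run time. A minimal sketch of mysql-sample-backup.yaml, assuming a full backup against the backup plan created earlier (verify the schema against the T4K YAML examples):

apiVersion: triliovault.trilio.io/v1
kind: Backup
metadata:
  name: mysql-sample-backup
  namespace: default
spec:
  type: Full                      # Full or Incremental
  backupPlan:
    name: mysql-label-backupplan  # backup plan created earlier
    namespace: default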

2. With all required entities in place, the user can start the experiment (a sketch of the experiment manifest's shape follows the command below)

kubectl apply -f experiment/tvk-scale-2-backup-delete-parallel-prod.yaml
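
The experiment manifest itself is not reproduced here. The following heavily abridged sketch only illustrates the general shape of a redskyops.dev Experiment; the parameter names mirror the trial assignments shown below, but the API version, ranges, and metric query are assumptions that should be checked against the CRDs installed by redskyctl init and StormForge's own documentation.

apiVersion: redskyops.dev/v1beta1
kind: Experiment
metadata:
  name: tvk-scale-2-backup-delete-parallel-prod
spec:
  parameters:                     # search space explored across trials
  - name: metaCpu
    min: 100
    max: 1000
  - name: metaMemory
    min: 256
    max: 1024
  metrics:                        # what each trial is scored on
  - name: duration
    minimize: true
    query: "{{duration .StartTime .CompletionTime}}"
  # patches: map each trial's parameter assignments onto the T4K deployments
  # trialTemplate: the job that runs the backup/delete bash scripts for a trial
  # (both omitted here; use the full manifest from the experiment directory)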

3. The user can monitor the running experiment as it performs each backup.

kubectl get experiment
NAME                                        STATUS
tvk-scale-400-backup-delete-parallel-prod   Running

4. Check the new trial started by the experiment

kubectl get trial
NAME                                            STATUS       ASSIGNMENTS                                                                                            VALUES
tvk-scale-400-backup-delete-parallel-prod-000   Setting up   deploymentMemory=512, deploymentCpu=250, metaCpu=500, metaMemory=512, dataMemory=1536, dataCpu=1200    

5. Verify the jobs

kubectl get jobs
NAME                                                 COMPLETIONS   DURATION   AGE
tvk-scale-2-backup-delete-parallel-prod-001          0/1           4d3h       4d3h
tvk-scale-2-backup-delete-parallel-prod-001-create   1/1           51s        4d3h
tvk-scale-2-backup-delete-parallel-prod-001-delete   1/1           6s         4d3h

Monitor the Experiment from the StormForge UI

1. After the experiment is in the Running state, the user can view its progress in the StormForge UI using the login they have configured. The UI plots each trial run on a graph.

2. Once the experiment run is complete, the StormForge UI shows the recommended resource configuration with full resource details.

Delete / Remove StormForge Experiment

1. After the experiment run is complete, the user can remove the StormForge experiment and its resources with a few simple commands

  • Delete the experiment from the Kubernetes cluster along with all other resources

kubectl delete -f experiment/tvk-scale-2-backup-delete-parallel-prod.yaml
  • Delete the experiment from the redsky tenant

redskyctl delete exp tvk-scale-2-backup-delete-parallel-prod

Note: If you delete the experiment from the redsky tenant, it is deleted from the StormForge UI as well.

2. If the user faces any issues while running the experiment, they can check the logs of the redsky-controller-manager

kubectl logs redsky-controller-manager-6cbb796b79-5wnz8 -n redsky-system

Conclusion

Using StormForge, users can create the desired experiments, run them on a Kubernetes cluster with applications running, and perform operations such as backup and restore. The StormForge UI provides the analysis through a well-curated graph, ranking the combinations of resources used for the operation in each trial from best to worst. These optimized resource combinations can then be applied in production clusters to achieve the desired RPO/RTO.
