Supported Application Types

Trilio for Kubernetes is Trilio Data's product for the Kubernetes market. It provides a cloud-native approach to data protection for cloud-native applications.

Legacy Applications

IT is in constant transition. From mainframes to open systems to virtualization, business applications keep changing to improve efficiency, performance, and time to market. Data protection is one of the important functions that gives IT the ability to protect applications against unforeseen outages. Today, numerous market-proven solutions provide backup and recovery for legacy applications that run on bare-metal servers or virtual machines. However, these solutions cannot address cloud-native applications, which are mostly deployed as microservices. Backing up a few servers or virtual machines is therefore not an effective way to meet the sprawling needs of cloud-native applications.

Cloud Native Applications

The Cloud Native Computing Foundation (CNCF) defines cloud-native as “scalable applications” running in “modern dynamic environments” that use technologies such as containers, microservices, and declarative APIs. A cloud-native application may contain tens to hundreds of containers and their associated resources, which work in unison to deliver a defined business function. These containers may be spread across multiple compute nodes, with no affinity to the underlying infrastructure. Individual containers within a cloud-native application carry little application context. The collection of related containers and their resources must therefore be treated as a single "application".

Because cloud-native applications may contain different containers running from different images with different metadata settings, building and maintaining them manually is tedious. Kubernetes defines a few building blocks, such as ReplicaSets, Deployments, and StatefulSets, for building complex applications, and it supports additional tools such as Helm charts and Operators for managing the life cycle of cloud-native applications.

The next few sections discuss how Trilio backup and recovery operations are defined for each kind of Kubernetes application and how they enhance overall application lifecycle management.

Helm

Helm has three main concepts:

A Chart is a Helm package. It contains all of the resource definitions necessary to run an application, tool, or service inside of a Kubernetes cluster.

A Repository is the place where charts can be collected and shared.

A Release is an instance of a chart running in a Kubernetes cluster. One chart can often be installed many times into the same cluster. And each time it is installed, a new release is created. Consider a MySQL chart. If you want two databases running in your cluster, you can install that chart twice. Each one will have its own release, which will in turn have its own release name.

$ helm install happy-panda stable/mariadb
Fetched stable/mariadb-0.3.0 to /Users/mattbutcher/Code/Go/src/helm.sh/helm/mariadb-0.3.0.tgz
happy-panda
Last Deployed: Wed Sep 28 12:32:28 2016
Namespace: default
Status: DEPLOYED

Resources:
==> extensions/Deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
happy-panda-mariadb   1         0         0            0           1s

==> v1/Secret
NAME                     TYPE      DATA      AGE
happy-panda-mariadb   Opaque    2         1s

==> v1/Service
NAME                     CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
happy-panda-mariadb   10.0.0.70    <none>        3306/TCP   1s


Notes:
MariaDB can be accessed via port 3306 on the following DNS name from within your cluster:
happy-panda-mariadb.default.svc.cluster.local

In the above example, the helm install command creates a new release called happy-panda from the Helm chart stable/mariadb. As part of creating the release, Helm created several resources, including a Deployment, a Secret, and a Service.

When a new version of a chart is released, or when you want to change the configuration of your release, you can use the helm upgrade command.

$ helm upgrade -f panda.yaml happy-panda stable/mariadb
Fetched stable/mariadb-0.3.0.tgz to /Users/mattbutcher/Code/Go/src/helm.sh/helm/mariadb-0.3.0.tgz
happy-panda has been upgraded. Happy Helming!
Last Deployed: Wed Sep 28 12:47:54 2016
Namespace: default
Status: DEPLOYED
...

Helm supports additional commands, such as rolling back an upgrade or uninstalling a release.
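As a rough sketch of those commands, the snippet below lists a release's revision history, rolls it back, and uninstalls it. It assumes Helm 3 command syntax and reuses the happy-panda release from the example above. The revision number 1 and the run() dry-run helper are assumptions added for illustration, so nothing destructive runs unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. RELEASE matches the example above; the revision number and the
# run() dry-run helper are assumptions made for illustration.
RELEASE="happy-panda"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run helm history "$RELEASE"      # list the revisions recorded for the release
run helm rollback "$RELEASE" 1   # roll the release back to revision 1
run helm uninstall "$RELEASE"    # remove the release and the resources it created
```

Run with DRY_RUN=0 against a real cluster to execute the commands instead of printing them.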

In Helm's world, an application instance is a release. Just as each Helm operation is defined for a given release, Trilio defines its backup and restore operations at the level of a Helm release. You can define a Trilio backup job for a release and restore a release from Trilio backups. In the above example, you can restore the happy-panda release from a Trilio backup and then call helm upgrade to upgrade the release to a new revision.

Operator

The Kubernetes documentation describes the Operator pattern as follows:

The Operator pattern aims to capture the key aim of a human operator who is managing a service or set of services. Human operators who look after specific applications and services have deep knowledge of how the system ought to behave, how to deploy it, and how to react if there are problems.

People who run workloads on Kubernetes often like to use automation to take care of repeatable tasks. The Operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides.

The Kubernetes community has a plethora of Operators for managing various applications. These are primarily stateful applications that cannot be fully managed using Kubernetes building blocks such as Deployments, ReplicaSets, or StatefulSets. An Operator has built-in knowledge of the operational aspects of its underlying application and provides complete life cycle management for it. For example, an Operator for a Galera cluster knows how to adjust the number of replicas, how to recover a cluster or a member, and how to tune various operational parameters.

Let's walk through the steps to deploy an Operator-based application, using an example memcached Operator to aid the discussion.

Before running the Operator, the CRD must be registered with the Kubernetes apiserver:

$ kubectl create -f deploy/crds/cache.example.com_memcacheds_crd.yaml
$ cat deploy/operator.yaml
serviceAccountName: memcached-operator
containers:
- name: memcached-operator
  # Replace this with the built image name
  image: quay.io/<user>/memcached-operator:v0.0.1
  command:
  - memcached-operator
  imagePullPolicy: Always
  
$ kubectl create -f deploy/service_account.yaml
$ kubectl create -f deploy/role.yaml
$ kubectl create -f deploy/role_binding.yaml
$ kubectl create -f deploy/operator.yaml

Verify that the memcached-operator deployment is up and running:

$ kubectl get deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
memcached-operator       1         1         1            1           1m

Create the example Memcached CR that was generated at deploy/crds/cache.example.com_v1alpha1_memcached_cr.yaml:

$ cat deploy/crds/cache.example.com_v1alpha1_memcached_cr.yaml
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 3

$ kubectl apply -f deploy/crds/cache.example.com_v1alpha1_memcached_cr.yaml

Ensure that the memcached-operator creates the deployment for the CR:

$ kubectl get deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
memcached-operator       1         1         1            1           2m
example-memcached        3         3         3            3           1m

Check the pods and CR status to confirm the status is updated with the memcached pod names:

$ kubectl get pods
NAME                                  READY     STATUS    RESTARTS   AGE
example-memcached-6fd7c98d8-7dqdr     1/1       Running   0          1m
example-memcached-6fd7c98d8-g5k7v     1/1       Running   0          1m
example-memcached-6fd7c98d8-m7vn7     1/1       Running   0          1m
memcached-operator-7cc7cfdf86-vvjqk   1/1       Running   0          2m
$ kubectl get memcached/example-memcached -o yaml
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  clusterName: ""
  creationTimestamp: 2018-03-31T22:51:08Z
  generation: 0
  name: example-memcached
  namespace: default
  resourceVersion: "245453"
  selfLink: /apis/cache.example.com/v1alpha1/namespaces/default/memcacheds/example-memcached
  uid: 0026cc97-3536-11e8-bd83-0800274106a1
spec:
  size: 3
status:
  nodes:
  - example-memcached-6fd7c98d8-7dqdr
  - example-memcached-6fd7c98d8-g5k7v
  - example-memcached-6fd7c98d8-m7vn7

In the above example, example-memcached is the application instance and memcached-operator is the Operator. The Operator consists of a CRD and a container image that includes the controller code that manages resources of that CRD's kind.

Let's look at Operator-based applications from a backup and restore viewpoint. In the above example, the application contains three pods at that point in time. Backing up these three pods as the application backup and then restoring them does not sufficiently address the needs of example-memcached. The restored pods are not associated with the Operator, so the Operator controller cannot be used to manage the restored application.

A Trilio backup includes the Operator CRD, the Operator image, the application CR, and all the resources created by the Operator. When a backup is restored, Trilio restores the Operator (if it is not already present), creates the application using the Operator, and restores the data from the backup images. Hence, applications restored by Trilio can still be managed using the Operator. For example, a user can grow the size of memcached to 4 by using Operator functionality.
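As an illustration of that last point, resizing the restored application comes down to changing spec.size on the CR and letting the Operator reconcile. The sketch below assumes the example-memcached CR from the walkthrough above and a namespace where the Operator is running. The run() dry-run helper is an assumption added so the commands are only printed by default.

```shell
#!/bin/sh
# Sketch only. The CR kind and name come from the example above; the run()
# dry-run helper is an assumption made for illustration.
CR_NAME="example-memcached"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Ask the Operator for four replicas by changing spec.size on the CR.
run kubectl patch memcached "$CR_NAME" --type merge -p '{"spec":{"size":4}}'

# The Operator should then reconcile the Deployment to the new size.
run kubectl get deployment "$CR_NAME"
```

The patch targets the CR rather than the Deployment because the Operator owns the Deployment and would otherwise revert a direct change to its replica count.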

Label- and Selector-based Applications

Helm charts and Operators cover a wide range of application deployments in Kubernetes. However, there may be applications that can be identified only by labels. Trilio supports backup and recovery of these applications as well.

Labels are key/value pairs that are attached to objects, such as Pods and PVs. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects. Labels can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined. Each key must be unique for a given object.

"metadata": {
  "labels": {
    "app" : "gcp-compute-persistent-disk-csi-driver",
    "componentname" : "comp1"
  }
}

In the above example, app and componentname are the label keys.

You can create a backup plan based on labelSelectors. For a complete description of labelSelectors, please refer to the Kubernetes documentation.

  backupPlanComponents:
    custom:
      matchLabels:
        app: gcp-compute-persistent-disk-csi-driver

Namespaces

Namespaces aim to provide multi-tenancy within a Kubernetes cluster. Because namespaces are a native Kubernetes concept, users often use them as a boundary for an application or a logical group of applications. Although namespaces are not directly tied to how applications are discovered or deployed, this indirect relationship makes it important to call them out as a way to identify or segregate applications.

Virtual Machines on OpenShift Virtualization (KubeVirt)

OpenShift Virtualization is based on KubeVirt, an open source project that makes it possible to run VMs in a Kubernetes-managed container platform. KubeVirt delivers container-native virtualization by using Kernel-based Virtual Machine (KVM) within a Kubernetes container.

Containerization has become popular for its scalability, speed, and consistency across environments, but organizations may still choose to run applications in virtual machines (VMs) alongside containers. To name just a few reasons:

  1. Legacy Applications: Many organizations have substantial investments in legacy applications designed and optimized to run on VMs. Rewriting or refactoring these applications to run in containers can be time-consuming, costly, and risky.

  2. Regulatory Compliance: Some industry regulations may require the use of VMs due to their enhanced isolation and security features.

  3. Compatibility: Some software may not be compatible with containers, or the effort to adapt it may outweigh the benefits.

Trilio aims to be your comprehensive backup and recovery solution, designed to safeguard OpenShift Virtualization workloads running right alongside your containerized workloads.

Features of Trilio for OpenShift Virtualization include:

  • The capability to back up, recover, and migrate VMs using the same native tools and workflows you use to back up other containerized workloads.

  • The flexibility to select backups on a per-VM, label, or Namespace basis.

  • The ability to restore individual VMs from Namespaces.

  • The provision to execute both full and incremental backups, conserving storage space and network bandwidth on the target storage.

Other Application Deployments

Application deployments are not limited to the types described above. There are other third-party tools, such as Terraform and AWS CloudFormation. Trilio will continue to add support for these deployment models in the future.