
OCP ETCD Plugin

The plugin helps the user to perform ETCD backup and restore of OCP clusters.


Introduction to ETCD

ETCD is the persistent data store for Kubernetes. It is a distributed key-value store, designed to be simple, fast, and secure, that records the state of all resources in a Kubernetes cluster. It also acts as the backend for service discovery. ETCD runs on several servers in the cluster at the same time, which lets it track changes in the cluster and store the state and configuration data that the Kubernetes control plane accesses.

OCP Cluster Backup & Disaster Recovery (DR)

ETCD data must be backed up before shutting down a cluster. ETCD is the key-value store for OpenShift Container Platform, which persists the state of all resource objects. Consequently, ETCD backup plays a crucial role in disaster recovery. There are several situations where OpenShift Container Platform does not work as expected, such as:

  • You have a cluster that is not functional following a restart because of unexpected conditions, such as node failure or network connectivity issues.

  • You have deleted something critical in the cluster by mistake.

  • You have lost the majority of your control plane hosts, leading to ETCD quorum loss.

In disaster situations such as those above, you can recover by restoring the cluster to its previous state from the saved ETCD snapshots. Keep the following considerations in mind for OCP cluster backups and DR:

  • Disaster recovery (restore) requires at least one healthy control plane host (also known as the master host). To perform a restore, run this plugin on a bastion node. A bastion host is a host created on the same network as the cluster that can ping the cluster nodes. For more information about bastion hosts, see the OpenShift documentation on accessing hosts.

  • You only need to create the bastion node and make it accessible over SSH. The plugin itself sets up SSH connectivity from the bastion to the cluster nodes.

Source of information: Red Hat's OpenShift Container Platform backup and restore documentation.

ETCD Backup and Restore Using ocp-etcd-backup-restore Plugin

The plugin helps the user perform ETCD backup and restore of OCP clusters. If you have lost crucial cluster information, you can restore it from a snapshot saved with this plugin. If you have lost nodes, you must recreate all the non-recovery control plane machines and then run the plugin with the -p option to redeploy ETCD. Keep the following considerations in mind about the plugin:

  • The plugin supports S3 as the backup target.

  • Restore works only on the same cluster from which the backup was taken.

  • Do not switch off any node in the cluster while a restore is in progress, and do not abort the restore task midway; otherwise you may lose cluster accessibility.

Plugin Prerequisites

Prerequisites
  1. Linux or macOS are supported. Windows is not supported at this time.

Installation, Upgrade, Removal of Plugins :

Action                                     Command
Add the T4K custom plugin index of krew    kubectl krew index add tvk-interop-plugin https://github.com/trilioData/tvk-interop-plugins.git
Perform the installation                   kubectl krew install tvk-interop-plugin/ocp-etcd-backup-restore
Upgrade the plugin                         kubectl krew upgrade ocp-etcd-backup-restore
Uninstall the plugin                       kubectl krew uninstall ocp-etcd-backup-restore

If the krew plugin manager is not an option, you may still install the ocp-etcd-backup-restore plugin without krew using the following steps:

  1. Choose a version of the ocp-etcd-backup-restore plugin to install from the list of available releases, and check that the release assets include the plugin package [ocp-etcd-backup-restore-${OS}.tar.gz].

  2. Export the version as an environment variable: TVK_OCP_ETCD_BACKUP_RESTORE_VERSION=[INSERT VERSION HERE]. If the version is not exported, the latest tagged version is used.

  3. Run the following command in a Bash or ZSH shell to download and install the ocp-etcd-backup-restore plugin without krew:

    (
      set -ex; cd "$(mktemp -d)" &&
      OS="$(uname)" &&
      # Fall back to the latest tagged release when no version is exported.
      if [[ -z ${TVK_OCP_ETCD_BACKUP_RESTORE_VERSION} ]]; then TVK_OCP_ETCD_BACKUP_RESTORE_VERSION=$(curl -s https://api.github.com/repos/trilioData/tvk-interop-plugins/releases/latest | sed -n 's/.*"tag_name": *"\([^"]*\)".*/\1/p'); fi &&
      echo "Installing version=${TVK_OCP_ETCD_BACKUP_RESTORE_VERSION}" &&
      package_name="ocp-etcd-backup-restore-${OS}.tar.gz" &&
      curl -fsSLO "https://github.com/trilioData/tvk-interop-plugins/releases/download/${TVK_OCP_ETCD_BACKUP_RESTORE_VERSION}/${package_name}" &&
      tar zxvf "${package_name}" && sudo mv ocp-etcd-backup-restore /usr/local/bin/kubectl-ocp_etcd_backup_restore
    )
  4. Verify installation using the command: kubectl ocp-etcd-backup-restore --help
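
The download URL assembled in step 3 can be sketched as a small helper, which makes it easy to confirm that the expected release asset exists before installing. This is only an illustration: plugin_asset_url is a hypothetical function, not part of the plugin, and vX.Y.Z is a placeholder rather than a real release tag.

```shell
#!/bin/sh
# Sketch: build the release asset URL for a given plugin version and OS.
# plugin_asset_url is a hypothetical helper, not part of the plugin.
plugin_asset_url() {
  version="$1"          # release tag, e.g. the value of TVK_OCP_ETCD_BACKUP_RESTORE_VERSION
  os="${2:-$(uname)}"   # "Linux" or "Darwin"; defaults to the current OS
  if [ -z "$version" ]; then
    echo "error: a release tag is required" >&2
    return 1
  fi
  echo "https://github.com/trilioData/tvk-interop-plugins/releases/download/${version}/ocp-etcd-backup-restore-${os}.tar.gz"
}

# Example with a placeholder tag (not a real release):
plugin_asset_url "vX.Y.Z" "Linux"
```

Passing the printed URL to curl -fsSLI is one way to check that the asset is present without downloading it.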

Usage

ETCD Backup and restore on OCP. Available flags: -backup -restore.
   [-h] [-backup] [-restore] [--target-name TARGET_NAME]
   [--target-namespace TARGET_NAMESPACE] --api-server-url API_SERVER_URL
   --ocp-cluster-user OCP_CLUSTER_USER --ocp-cluster-pass OCP_CLUSTER_PASS
   [-p] [--log-location LOG_LOC]

Flags:

Flag
Argument Details

-backup

Tells the plugin to perform a backup.

-restore

Tells the plugin to perform a restore.

--target-name

The name of the single target (datastore) on which the ETCD backup is stored. The target must be an S3 target, created in the same namespace in which T4K resides, and it must be available. This argument is mandatory if the -backup flag is provided.

--target-namespace

The name of the namespace where the target resides, which is the namespace where T4K is installed. This argument is mandatory if the -backup flag is provided.

--api-server-url

The API server URL used to log in to the cluster. It follows this format: https://api.<cluster_name>.<domain>:6443. To check that the URL is correct, verify that this command works: oc login <api-server-url> -u <username> -p <password>. This argument is mandatory.

--ocp-cluster-user

The username used to log in to the OCP cluster. This argument is mandatory.

--ocp-cluster-pass

The password for the --ocp-cluster-user account used to log in to the OCP cluster. This argument is mandatory.

-p

Tells the plugin to perform post-restore tasks.

--log-location

The path and file name of the log file. Default: /tmp/etcd-ocp-backup.log
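
The documented format for --api-server-url can be checked before the plugin is invoked. The following is a minimal sketch, assuming a loose pattern match is enough; is_valid_api_url is a hypothetical helper, not part of the plugin, and it does not replace the oc login check described above.

```shell
#!/bin/sh
# Sketch: check an API server URL against the documented format
# https://api.<cluster_name>.<domain>:6443 before passing it to the plugin.
# is_valid_api_url is a hypothetical helper, not part of the plugin.
# The match is deliberately loose: it checks the scheme, the "api." prefix,
# at least one further dot, and the :6443 port, nothing more.
is_valid_api_url() {
  case "$1" in
    https://api.*.*:6443) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a made-up cluster name and domain:
if is_valid_api_url "https://api.mycluster.example.com:6443"; then
  echo "URL format looks valid"
fi
```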

Examples:

A user may specify more than one option with each command execution. For example, to create a backup with a configured target name and associated namespace, and to set the cluster API URL with the associated username and password, execute the following single command:

kubectl ocp-etcd-backup-restore -backup --target-name <target_name> --target-namespace <target_ns> --api-server-url "https://api.<clustername>.<domain>:6443" --ocp-cluster-user <user> --ocp-cluster-pass "<password>"

Then, to restore from the same cluster API URL with the associated username and password, execute the following single command:

kubectl ocp-etcd-backup-restore -restore --api-server-url "https://api.<clustername>.<domain>:6443" --ocp-cluster-user <user> --ocp-cluster-pass "<password>"

Restoring to a previous cluster state is a destructive and destabilizing action to take on a running cluster. This procedure should only be used as a last resort.
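
For repeated backups, the backup command shown earlier in this section can be wrapped in a small function. This is a sketch under stated assumptions: etcd_backup, OCP_USER, OCP_PASS, and DRY_RUN are hypothetical names introduced here for illustration, and only the kubectl ocp-etcd-backup-restore flags come from this page. With DRY_RUN=1 the function prints the command with the password masked instead of running it.

```shell
#!/bin/sh
# Sketch: wrapper around the documented backup command.
# etcd_backup, OCP_USER, OCP_PASS and DRY_RUN are hypothetical names.
etcd_backup() {
  if [ "$#" -ne 3 ]; then
    echo "usage: etcd_backup <target_name> <target_namespace> <api_server_url>" >&2
    return 2
  fi
  # Credentials are read from the environment in this sketch.
  if [ -z "${OCP_USER:-}" ] || [ -z "${OCP_PASS:-}" ]; then
    echo "error: set OCP_USER and OCP_PASS first" >&2
    return 2
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command that would run, with the password masked.
    echo "kubectl ocp-etcd-backup-restore -backup --target-name $1 --target-namespace $2 --api-server-url $3 --ocp-cluster-user $OCP_USER --ocp-cluster-pass ****"
    return 0
  fi
  kubectl ocp-etcd-backup-restore -backup \
    --target-name "$1" --target-namespace "$2" \
    --api-server-url "$3" \
    --ocp-cluster-user "$OCP_USER" --ocp-cluster-pass "$OCP_PASS"
}
```

A dry run such as DRY_RUN=1 etcd_backup <target_name> <target_ns> "https://api.<clustername>.<domain>:6443" shows the exact flags before anything touches the cluster.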

Important Additional Information

  1. If you restore a backup that was taken with a different T4K version than the one currently installed, the operation fails and cluster accessibility is lost. The workaround is to delete the current T4K installation and then retry the restore.

  2. As per official Red Hat documentation, "Restoring to a previous cluster state is a destructive and destabilizing action to take on a running cluster. This should only be used as a last resort." If you are able to retrieve data using the Kubernetes API server, then ETCD is available and you should not restore using an ETCD backup.

  3. Supported GLIBC version: ≥ 2.27. The plugin is tested on OCP versions 4.8 and 4.9.
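
Two of the points above lend themselves to a quick pre-restore check: whether the API server can still return data (in which case ETCD is available and a restore should not be run), and whether the host meets the GLIBC requirement. The following is a minimal sketch; all function names and the API_PROBE variable are hypothetical, and the default probe (kubectl get namespaces) is an assumption about what "retrieving data" means in practice. Note that a missing kubectl binary also makes the probe fail, so a negative result needs a manual look.

```shell
#!/bin/sh
# Sketch: pre-restore checks. All names below are hypothetical helpers.

# Succeeds when version "$1" (major.minor) is at least version "$2".
glibc_at_least() {
  have_major="${1%%.*}"; have_rest="${1#*.}"; have_minor="${have_rest%%.*}"
  need_major="${2%%.*}"; need_rest="${2#*.}"; need_minor="${need_rest%%.*}"
  [ "$have_major" -gt "$need_major" ] && return 0
  [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]
}

# Checks the local glibc against the documented minimum of 2.27.
# Returns 2 when no glibc version can be detected (for example on macOS).
host_glibc_ok() {
  current="$(getconf GNU_LIBC_VERSION 2>/dev/null)" || return 2
  glibc_at_least "${current##* }" 2.27   # getconf prints e.g. "glibc 2.31"
}

# Succeeds when the probe command can retrieve data from the API server.
# API_PROBE may be set to override the assumed default probe.
api_is_reachable() {
  ${API_PROBE:-kubectl get namespaces --request-timeout=10s} >/dev/null 2>&1
}

# Succeeds only when the API server does not respond.
restore_is_warranted() {
  if api_is_reachable; then
    echo "API server responds: ETCD is available, do not restore."
    return 1
  fi
  echo "API server does not respond: a restore may be warranted."
  return 0
}
```

Calling restore_is_warranted && host_glibc_ok on the bastion node before a restore gives a quick go/no-go signal, but it is only a first filter and not a substitute for the Red Hat guidance quoted above.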

Additional prerequisites:

  • krew: the kubectl plugin manager.
  • kubectl: the Kubernetes command-line tool.
  • Trilio for Kubernetes and a backup target.
  • oc: the OpenShift command-line tool.

Links:

  • Accessing hosts (bastion host setup): https://docs.openshift.com/container-platform/4.7/networking/accessing-hosts.html
  • Source for the backup and DR overview: https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/backup_and_restore/index
  • Available plugin releases: https://github.com/trilioData/tvk-interop-plugins/releases
  • T4K krew plugin index, added with kubectl krew index add tvk-interop-plugin: https://github.com/trilioData/tvk-interop-plugins.git
  • Further reading: Red Hat's "Backing up and restoring your OpenShift Container Platform Cluster".