Known Issues and Workarounds
This page describes known issues you may encounter and potential workarounds for them.

Management Console not accessible using NodePort

If you have completed all the configuration described in the Accessing the Management Console instructions and are still unable to access the UI, check the firewall rules for the Kubernetes cluster nodes.
Here is an example of checking firewall rules on a Google GKE cluster:
1. Search for VPC Network on the GCP web console.
2. Select the Firewall option from the left pane.
   (Image: Select Firewall from VPC Network)
3. Search the rules for your Kubernetes cluster name.
4. In the list of firewall rules, verify:
   - The Filters column shows the source IP ranges that can access the TVK Management Console hostname and NodePort.
   - The Protocols/Ports column shows the ports that are accessible on the cluster nodes.
5. Verify that the NodePort assigned to the k8s-triliovault-ingress-gateway service is included in the firewall rule's Protocols and Ports (see the lookup command below).
   (Image: Filters showing IP range and Protocols/Ports showing exposed port)
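To find the NodePort assigned to the ingress gateway service, it can be queried directly; <install-namespace> is a placeholder for the namespace where TVK is installed. The gcloud command is a sketch of the equivalent CLI check on GKE, assuming the firewall rule names contain the cluster name:

kubectl get svc k8s-triliovault-ingress-gateway -n <install-namespace> -o jsonpath='{.spec.ports[*].nodePort}'

# List GKE firewall rules whose names contain the cluster name
gcloud compute firewall-rules list --filter="name~<cluster-name>"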

Target Creation Failure

kubectl apply -f <backup_target.yaml> is not successful and the Target resource goes into the Unavailable state.
Possible reasons for a Target being marked Unavailable include insufficient permissions on the target storage or lack of network connectivity. Target resource creation launches a validation pod to check whether TrilioVault can connect and write to the Target. TrilioVault cleans up the pod after validation completes, which also removes the validation pod's logs. To capture them, repeat the same operation and actively check the logs on the validation pod while it runs:

Kubernetes

kubectl get pods -A | grep validator
kubectl logs <validator-pod> -n <namespace>

OpenShift

oc get pods -A | grep validator
oc logs <validator-pod> -n <namespace>
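Because the validation pod is short-lived, it can help to watch for it from a second terminal while re-applying the target. A minimal sketch using kubectl's watch flag (substitute oc on OpenShift):

kubectl get pods -A -w | grep validator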

Backup Failures

Follow these steps to analyze backup failures. First, list the backup jobs:

kubectl get jobs   # Kubernetes
oc get jobs        # OpenShift
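For a job that shows failures, describing it surfaces the relevant events and pod names. A quick check; the job name follows the <backupname>-<phase> pattern described below:

kubectl describe job <backupname>-snapshot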
To troubleshoot backup issues, it is important to understand the different phases of the triliovault jobs:
1. <backupname>-snapshot: TrilioVault leverages CSI snapshot functionality to take PV snapshots. If the backup fails in this step, try taking a manual snapshot of the PV (see the sketch after this list) to confirm that all required drivers are present and that this operation works without TrilioVault.
2. <backupname>-upload: In this phase, TrilioVault uploads metadata and data to the Target. If the backup fails in this phase, check the controller and datamover logs shown for Kubernetes and OpenShift below.
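A manual CSI snapshot can be attempted with a VolumeSnapshot manifest like the following. This is a minimal sketch, not TrilioVault's own manifest; <app-namespace>, <snapshot-class>, and <pvc-name> are placeholders for your environment, and the snapshot.storage.k8s.io/v1 API assumes a recent CSI external-snapshotter (older clusters may need v1beta1):

cat <<EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: manual-test-snapshot
  namespace: <app-namespace>
spec:
  volumeSnapshotClassName: <snapshot-class>
  source:
    persistentVolumeClaimName: <pvc-name>
EOF

# READYTOUSE should become true if the CSI driver is working
kubectl get volumesnapshot manual-test-snapshot -n <app-namespace>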

Kubernetes

kubectl get pods -A | grep k8s-triliovault-control-plane
kubectl logs <controller-pod>
kubectl logs <failed or errored datamover pod>

OpenShift

oc get pods -A | grep k8s-triliovault-control-plane
oc logs <controller-pod>
oc logs <failed or errored datamover pod>
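If the name of the failed datamover pod is not known, listing pods by name can help locate it (this assumes, as the commands above do, that datamover pod names contain "datamover"):

kubectl get pods -A | grep datamover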

Restore Failures

The most common causes of restore failures are resource conflicts. For example, restoring a backup into the same namespace as the original application leaves the original and restored resources with conflicting names. Check the logs to analyze the root cause of the issue:

Kubernetes

kubectl get pods -A | grep k8s-triliovault-control-plane
kubectl logs <controller-pod>
kubectl logs <failed or errored metamover pod>

OpenShift

oc get pods -A | grep k8s-triliovault-control-plane
oc logs <controller-pod>
oc logs <failed or errored metamover pod>
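To avoid conflicts in the first place, it can help to check the target namespace before restoring for resources whose names collide with the backup contents. A quick sanity check; <target-namespace> is a placeholder:

kubectl get all,pvc,configmap,secret -n <target-namespace>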

Stale Webhook Configurations

Problem Description: A TrilioVault operation fails with an error such as Internal error occurred: failed calling webhook or service "k8s-triliovault-webhook-service" not found:
Error from server (InternalError): error when creating "targetmigration.yaml": Internal error occurred: failed calling webhook "mapplication.kb.io": Post https://k8s-triliovault-webhook-service.default.svc:443/mutate?timeout=30s: service "k8s-triliovault-webhook-service" not found
Solution: This can happen when TVK was installed in multiple namespaces; even cleaning up those installations does not remove the webhook instances. Run the following commands to list all the webhook instances:
kubectl get validatingwebhookconfigurations -A | grep trilio
kubectl get mutatingwebhookconfigurations -A | grep trilio
Delete any duplicate entries.
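For example, a stale configuration can be removed as follows; <stale-config-name> is a placeholder for an entry returned by the commands above:

kubectl delete validatingwebhookconfiguration <stale-config-name>
kubectl delete mutatingwebhookconfiguration <stale-config-name>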

Restore Failed

Problem Description: You are unable to restore an application into a different namespace (OpenShift), and the restored pod fails with:
Error:
Permission denied: could not bind to address 0.0.0.0:80
Solution: The user may not have sufficient permissions to run the restored application in the target namespace; on OpenShift, binding to a privileged port such as 80 typically requires the anyuid SCC. Grant it to the default service account:

oc adm policy add-scc-to-user anyuid -z default -n <namespace>

OOMKilled Issues

Sometimes users may notice TrilioVault pods going into the OOMKilled state. This can happen in environments with a large number of objects, where additional processing power is needed. In such scenarios, first confirm which pod is being OOM-killed, then bump up the memory and CPU limits by running the commands below.
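To confirm which container was OOM-killed, inspect the pod's last state. A quick check; <install-namespace> and <pod-name> are placeholders:

kubectl get pods -n <install-namespace>
kubectl describe pod <pod-name> -n <install-namespace> | grep -B2 -A4 OOMKilled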

Kubernetes

Please follow the API documentation to increase resource limits for TrilioVault pods deployed via the upstream Helm-based Operator.

OCP

Within OCP, the memory and CPU values for TrilioVault are managed and controlled by the Operator ClusterServiceVersion (CSV). Users can either edit the Operator CSV YAML directly from within the OCP console or patch the deployments using the snippets provided below.
The following snippet shows how to increase the memory limit to 2048Mi for the web-backend pod:
export VERSION=2.6.0
export NAMESPACE=openshift-operators

oc patch csv k8s-triliovault-stable.$VERSION --type='json' -p='[{"op": "replace", "path": "/spec/install/spec/deployments/4/spec/template/spec/containers/0/resources/limits/memory", "value":"2048Mi"}]'

kubectl delete pod -l app=k8s-triliovault-web-backend -n $NAMESPACE
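To verify that the patch took effect, the same JSON path can be read back. A quick check mirroring the deployment index used in the patch above:

oc get csv k8s-triliovault-stable.$VERSION -o jsonpath='{.spec.install.spec.deployments[4].spec.template.spec.containers[0].resources.limits.memory}'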
The following snippet shows how to increase the CPU limit to 800m for the web-backend pod. Note the lowercase m (millicores); an uppercase M would be interpreted as millions of cores:
export VERSION=2.6.0
export NAMESPACE=openshift-operators

oc patch csv k8s-triliovault-stable.$VERSION --type='json' -p='[{"op": "replace", "path": "/spec/install/spec/deployments/4/spec/template/spec/containers/0/resources/limits/cpu", "value":"800m"}]'

kubectl delete pod -l app=k8s-triliovault-web-backend -n $NAMESPACE

Uninstall TrilioVault

To uninstall TrilioVault and start fresh, delete all TrilioVault CRDs. Note that deleting a CRD also deletes all of its custom resources, including any existing Backup and Restore objects.
On Kubernetes
kubectl delete crds $(kubectl get crds | grep trilio | awk '{print $1}')
On OpenShift
oc delete crds $(oc get crds | grep trilio | awk '{print $1}')
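As noted in the webhook section above, stale webhook configurations can survive an uninstall. For a completely fresh start, they can be removed with the same grep pattern. A sketch; substitute oc on OpenShift:

kubectl delete validatingwebhookconfigurations $(kubectl get validatingwebhookconfigurations | grep trilio | awk '{print $1}')
kubectl delete mutatingwebhookconfigurations $(kubectl get mutatingwebhookconfigurations | grep trilio | awk '{print $1}')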