After Trilio for OpenStack has been installed and configured successfully, the following steps can be used to verify that the Trilio installation is healthy.
On the Controller node
Make sure the containers below are in a running state. In a multi-controller setup, triliovault-wlm-cron runs on only one of the controllers.
triliovault_datamover_api
triliovault_wlm_api
triliovault_wlm_scheduler
triliovault_wlm_workloads
triliovault-wlm-cron
If the containers are in a restarting state, or are not listed by the following command, the deployment did not complete correctly. Please recheck that you followed the complete documentation.
[root@overcloudtrain5-controller-0 /]# podman ps | grep trilio-
76511a257278 undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-horizon-plugin:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 12 days ago Up 12 days ago horizon
5c5acec33392 cluster.common.tag/trilio-wlm:pcmklatest /bin/bash /usr/lo... 7 days ago Up 7 days ago triliovault-wlm-cron-podman-0
8dc61a674a7f undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-datamover-api:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 7 days ago Up 7 days ago triliovault_datamover_api
a945fbf80554 undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-wlm:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 7 days ago Up 7 days ago triliovault_wlm_scheduler
402c9fdb3647 undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-wlm:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 7 days ago Up 6 days ago triliovault_wlm_workloads
f9452e4b3d14 undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-wlm:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 7 days ago Up 6 days ago triliovault_wlm_api
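As a quick check, the same listing can be filtered for unhealthy containers. This is a minimal sketch: it inspects only the status column reported by podman and prints any Trilio container whose status does not start with "Up", so an empty result means all Trilio containers are running.

```shell
# Print any Trilio container that is not in the "Up ..." state
# (e.g. "Restarting" or "Exited"); no output means all are healthy.
podman ps -a --format '{{.Names}} {{.Status}}' \
  | grep -i trilio \
  | awk '$2 != "Up" {print}'
```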
After a successful deployment, the triliovault-wlm-cron service is added to the pcs cluster as a cluster resource. You can verify this with the pcs status command.
[root@overcloudtrain5-controller-0 /]# pcs status
Cluster name: tripleo_cluster
Cluster Summary:
* Stack: corosync
* Current DC: overcloudtrain5-controller-0 (version 2.0.5-9.el8_4.3-ba59be7122) - partition with quorum
* Last updated: Mon Jul 24 11:19:05 2023
* Last change: Mon Jul 17 10:38:45 2023 by root via cibadmin on overcloudtrain5-controller-0
* 4 nodes configured
* 14 resource instances configured
Node List:
* Online: [ overcloudtrain5-controller-0 ]
* GuestOnline: [ galera-bundle-0@overcloudtrain5-controller-0 rabbitmq-bundle-0@overcloudtrain5-controller-0 redis-bundle-0@overcloudtrain5-controller-0 ]
Full List of Resources:
* ip-172.30.6.27 (ocf::heartbeat:IPaddr2): Started overcloudtrain5-controller-0
* ip-172.30.6.16 (ocf::heartbeat:IPaddr2): Started overcloudtrain5-controller-0
* Container bundle: haproxy-bundle [cluster.common.tag/openstack-haproxy:pcmklatest]:
* haproxy-bundle-podman-0 (ocf::heartbeat:podman): Started overcloudtrain5-controller-0
* Container bundle: galera-bundle [cluster.common.tag/openstack-mariadb:pcmklatest]:
* galera-bundle-0 (ocf::heartbeat:galera): Master overcloudtrain5-controller-0
* Container bundle: rabbitmq-bundle [cluster.common.tag/openstack-rabbitmq:pcmklatest]:
* rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloudtrain5-controller-0
* Container bundle: redis-bundle [cluster.common.tag/openstack-redis:pcmklatest]:
* redis-bundle-0 (ocf::heartbeat:redis): Master overcloudtrain5-controller-0
* Container bundle: openstack-cinder-volume [cluster.common.tag/openstack-cinder-volume:pcmklatest]:
* openstack-cinder-volume-podman-0 (ocf::heartbeat:podman): Started overcloudtrain5-controller-0
* Container bundle: triliovault-wlm-cron [cluster.common.tag/trilio-wlm:pcmklatest]:
* triliovault-wlm-cron-podman-0 (ocf::heartbeat:podman): Started overcloudtrain5-controller-0
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
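When you only need to confirm the Trilio resource rather than review the whole cluster report, the pcs output can be filtered (a convenience sketch, not a replacement for checking the full status):

```shell
# Show just the triliovault-wlm-cron bundle and the line below it,
# which reports whether its podman resource is Started.
pcs status | grep -A1 'triliovault-wlm-cron'
```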
Make sure the Trilio Datamover container is in a running state and that no other Trilio container is deployed on the compute nodes.
[root@overcloudtrain5-novacompute-0 heat-admin]# podman ps | grep -i datamover
c750a8d0471f undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-datamover:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 7 days ago Up 7 days ago triliovault_datamover
Check that the provided backup target is mounted correctly on the Compute host.
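One way to confirm this is to search the mount table for the backup target. This is a sketch only: the 'triliovault' filter string is an assumption, so substitute the NFS share or mount path you actually configured during deployment.

```shell
# Look for the backup target in the mount table; the 'triliovault'
# pattern is an assumption - match on your configured share or path.
grep -i triliovault /proc/mounts || echo "backup target not mounted"
```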
Make sure the Horizon container is in a running state. Please note that the stock Horizon container is replaced with Trilio's Horizon container, which contains the latest OpenStack Horizon plus the Trilio Horizon plugin.
[root@overcloudtrain5-controller-0 heat-admin]# podman ps | grep horizon
76511a257278 undercloudqa162.ctlplane.trilio.local:8787/trilio/trilio-horizon-plugin:<CONTAINER-TAG-VERSION>-rhosp16.2 kolla_start 12 days ago Up 12 days ago horizon
If the Trilio Horizon container is in a restarting state on RHOSP 16.1.8/RHOSP 16.2.4, use one of the workarounds below.
## Either of the below workarounds should be performed on all controller nodes where the issue occurs with the horizon pod.
Option 1: Restart the memcached service on the controller using systemctl (command: systemctl restart tripleo_memcached.service)
Option 2: Restart the memcached pod (command: podman restart memcached)
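The first workaround can be combined with a quick health check in one short sequence. This is a sketch: the service name tripleo_memcached.service comes from the workaround above, and the 30-second wait is an arbitrary settling period, not a documented value.

```shell
# Workaround option 1 as a sequence: restart memcached, wait briefly,
# then confirm the horizon container is back in the "Up" state.
sudo systemctl restart tripleo_memcached.service
sleep 30
podman ps --format '{{.Names}} {{.Status}}' | grep -q '^horizon Up' \
  && echo "horizon is Up" \
  || echo "horizon still not healthy"
```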