Upgrading on TripleO Train [CentOS7]

1. Generic Pre-requisites

  1. Ensure the following before starting the upgrade process:

    1. Either 4.1 GA or a hotfix patch against 4.1 must already be deployed before performing the upgrades described in this document.

    2. No snapshot or restore may be running.

    3. Global job scheduler should be disabled.

    4. wlm-cron should be disabled (on primary T4O node).

      1. pcs resource disable wlm-cron

      2. Check : systemctl status wlm-cron OR pcs resource show wlm-cron

      3. Additional step : To ensure that cron is actually stopped, search for any lingering processes against wlm-cron and kill them. [Cmd : ps -ef | grep -i workloadmgr-cron]
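
The lingering-process check in the last step can be sketched as a small filter. This is only an illustration: 'cron_pids' is a hypothetical helper name, not something shipped with Trilio, and it assumes the usual 'ps -ef' layout in which the PID is the second column.

```shell
# Hypothetical helper (not part of Trilio): reads `ps -ef` output on stdin
# and prints the PID column for workloadmgr-cron processes, skipping the
# grep process that the search itself creates.
cron_pids() {
    grep -i 'workloadmgr-cron' | grep -v 'grep' | awk '{ print $2 }'
}

# Possible use on the primary T4O node, after `pcs resource disable wlm-cron`:
#   ps -ef | cron_pids     # review the listed PIDs first
#   kill <pid>             # then stop each process that is still running
```

An empty result means no wlm-cron process is left behind.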

2. [On Undercloud node] Clone triliovault repo and upload trilio puppet module

Run all the commands as the 'stack' user.

2.1 Clone the latest configuration scripts

cd /home/stack
mv triliovault-cfg-scripts triliovault-cfg-scripts-old
#Additionally keep the NFS share path noted
#/var/lib/nova/triliovault-mounts/MTcyLjMwLjEuMzovcmhvc3BuZnM=

##Clone latest Trilio cfg scripts repository
git clone --branch TVO/4.2.8 https://github.com/trilioData/triliovault-cfg-scripts.git
cd triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/

2.2 Backup target is “Ceph based S3” with SSL

If the backup target is Ceph S3 with SSL and the SSL certificates are self-signed or issued by a private CA, a CA chain certificate must be provided to validate the SSL requests. Rename the CA chain cert file to 's3-cert.pem' and copy it into the directory 'triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/puppet/trilio/files'.

cp s3-cert.pem /home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/puppet/trilio/files/

2.3 Upload triliovault puppet module

cd /home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/scripts/
chmod +x *.sh
./upload_puppet_module.sh

## Output of above command looks like following.
Creating tarball...
Tarball created.
Creating heat environment file: /home/stack/.tripleo/environments/puppet-modules-url.yaml
Uploading file to swift: /tmp/puppet-modules-8Qjya2X/puppet-modules.tar.gz
+-----------------------+---------------------+----------------------------------+
| object                | container           | etag                             |
+-----------------------+---------------------+----------------------------------+
| puppet-modules.tar.gz | overcloud-artifacts | 368951f6a4d39cfe53b5781797b133ad |
+-----------------------+---------------------+----------------------------------+

## Above command creates following file.
ls -ll /home/stack/.tripleo/environments/puppet-modules-url.yaml

3. Prepare Trilio container images

In this step, we are going to pull triliovault container images to the user’s registry.

Trilio container images are published on Docker Hub; the registry URL is 'docker.io'. The triliovault container pull URLs are listed below.

In the sections below, replace <HOTFIX-TAG-VERSION> with 4.2.8.

3.1 Trilio container URLs

Trilio container URLs for TripleO Train CentOS7:

Trilio Datamover container:       docker.io/trilio/tripleo-train-centos7-trilio-datamover:<HOTFIX-TAG-VERSION>-tripleo
Trilio Datamover Api Container:   docker.io/trilio/tripleo-train-centos7-trilio-datamover-api:<HOTFIX-TAG-VERSION>-tripleo
Trilio horizon plugin:            docker.io/trilio/tripleo-train-centos7-trilio-horizon-plugin:<HOTFIX-TAG-VERSION>-tripleo

There are two registry methods available in the TripleO Openstack Platform.

  1. Remote Registry

  2. Local Registry

Identify which method you are using. Both methods to pull and configure triliovault's container images for overcloud deployment are explained below.

3.2 Remote Registry

If you are using the 'Remote Registry' method follow this section. You don't need to pull anything. You just need to populate the following container URLs in trilio env yaml.

  • Populate 'environments/trilio_env.yaml' file with triliovault container urls. Changes look like the following.

# For TripleO Train Centos7
$ grep 'Image' trilio_env.yaml
   DockerTrilioDatamoverImage: docker.io/trilio/tripleo-train-centos7-trilio-datamover:<HOTFIX-TAG-VERSION>-tripleo
   DockerTrilioDmApiImage: docker.io/trilio/tripleo-train-centos7-trilio-datamover-api:<HOTFIX-TAG-VERSION>-tripleo
   DockerHorizonImage: docker.io/trilio/tripleo-train-centos7-trilio-horizon-plugin:<HOTFIX-TAG-VERSION>-tripleo
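
Filling in the placeholder can be sketched with a one-line filter. 'fill_tag' is a hypothetical helper name used only for illustration; it assumes the file contains the literal text <HOTFIX-TAG-VERSION> and that the tag contains no '/' characters.

```shell
# Hypothetical helper: replaces every literal <HOTFIX-TAG-VERSION> read from
# stdin with the tag given as the first argument and writes the result to
# stdout. The tag must not contain '/', which is used as the sed delimiter.
fill_tag() {
    sed "s/<HOTFIX-TAG-VERSION>/$1/g"
}

# Possible use (review the result before replacing the original file):
#   fill_tag 4.2.8 < trilio_env.yaml > trilio_env.yaml.new
```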

3.3 Registry on Undercloud

If you are using 'local registry' on undercloud, follow this section.

  • Run the following script. The script pulls the triliovault containers and updates the triliovault environment file with the URLs.

cd /home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/scripts/

sudo ./prepare_trilio_images.sh <undercloud_registry_hostname_or_ip> <OS_platform> <4.2-TRIPLEO-CONTAINER> <container_tool_available_on_undercloud>

## To get undercloud registry hostname/ip, we have two approaches. Use either one.
1. openstack tripleo container image list

2. find your 'containers-prepare-parameter.yaml' (from overcloud deploy command) and search for 'push_destination'
cat /home/stack/containers-prepare-parameter.yaml | grep push_destination
 - push_destination: "undercloud.ctlplane.ooo.prod1:8787"

Here, 'undercloud.ctlplane.ooo.prod1' is undercloud registry hostname. Use it in our command like following example.

# Command Example:
sudo ./prepare_trilio_images.sh undercloud.ctlplane.ooo.prod1 centos7 <HOTFIX-TAG-VERSION>-tripleo podman
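
The second lookup approach can be sketched as a filter that strips the quotes and the port from the 'push_destination' value. 'registry_host' is a hypothetical helper name, and the sketch assumes the value is a host:port string like the one shown above.

```shell
# Hypothetical helper: reads containers-prepare-parameter.yaml on stdin and
# prints the host part of the first push_destination value, with the quotes
# and the port removed. Assumes a host:port value as in the example above;
# a bare `true` value would be printed unchanged.
registry_host() {
    sed -n 's/.*push_destination:[[:space:]]*"\{0,1\}\([^":]*\).*/\1/p' | head -n 1
}

# Possible use:
#   registry_host < /home/stack/containers-prepare-parameter.yaml
```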

## Verify changes
# For TripleO Train Centos7
$ grep 'Image' ../environments/trilio_env.yaml
   DockerTrilioDatamoverImage: prod1-undercloud.demo:8787/trilio/tripleo-train-centos7-trilio-datamover:<HOTFIX-TAG-VERSION>-tripleo
   DockerTrilioDmApiImage: prod1-undercloud.demo:8787/trilio/tripleo-train-centos7-trilio-datamover-api:<HOTFIX-TAG-VERSION>-tripleo
   DockerHorizonImage: prod1-undercloud.demo:8787/trilio/tripleo-train-centos7-trilio-horizon-plugin:<HOTFIX-TAG-VERSION>-tripleo

## Above script pushes trilio container images to undercloud registry and sets correct trilio images URLs in ‘environments/trilio_env.yaml’. Verify the changes using the following command.

## For Centos7 Train

(undercloud) [stack@undercloud redhat-director-scripts/tripleo-train]$ openstack tripleo container image list | grep trilio
| docker://undercloud.ctlplane.localdomain:8787/trilio/tripleo-train-centos7-trilio-datamover:<HOTFIX-TAG-VERSION>-tripleo                       |                        |
| docker://undercloud.ctlplane.localdomain:8787/trilio/tripleo-train-centos7-trilio-datamover-api:<HOTFIX-TAG-VERSION>-tripleo                  |                   |
| docker://undercloud.ctlplane.localdomain:8787/trilio/tripleo-train-centos7-trilio-horizon-plugin:<HOTFIX-TAG-VERSION>-tripleo                  |

4. Configure multi-IP NFS

This section is only required when the multi-IP feature for NFS is required.

This feature allows setting the IP used to access the NFS volume per datamover instead of globally.

On Undercloud node, change directory

cd triliovault-cfg-scripts/common/

Edit file 'triliovault_nfs_map_input.yml' in the current directory and provide compute host and NFS share/ip map.

Get the compute hostnames from the following command. Check the 'Name' column and use the exact hostnames in the 'triliovault_nfs_map_input.yml' file.

Run this command on undercloud by sourcing 'stackrc'.

(undercloud) [stack@ucqa161 ~]$ openstack server list
+--------------------------------------+-------------------------------+--------+----------------------+----------------+---------+
| ID                                   | Name                          | Status | Networks             | Image          | Flavor  |
+--------------------------------------+-------------------------------+--------+----------------------+----------------+---------+
| 8c3d04ae-fcdd-431c-afa6-9a50f3cb2c0d | overcloudtrain1-controller-2  | ACTIVE | ctlplane=172.30.5.18 | overcloud-full | control |
| 103dfd3e-d073-4123-9223-b8cf8c7398fe | overcloudtrain1-controller-0  | ACTIVE | ctlplane=172.30.5.11 | overcloud-full | control |
| a3541849-2e9b-4aa0-9fa9-91e7d24f0149 | overcloudtrain1-controller-1  | ACTIVE | ctlplane=172.30.5.25 | overcloud-full | control |
| 74a9f530-0c7b-49c4-9a1f-87e7eeda91c0 | overcloudtrain1-novacompute-0 | ACTIVE | ctlplane=172.30.5.30 | overcloud-full | compute |
| c1664ac3-7d9c-4a36-b375-0e4ee19e93e4 | overcloudtrain1-novacompute-1 | ACTIVE | ctlplane=172.30.5.15 | overcloud-full | compute |
+--------------------------------------+-------------------------------+--------+----------------------+----------------+---------+
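
Pulling the compute hostnames out of this table can be sketched with awk. 'compute_names' is a hypothetical helper name, and selecting rows by the word 'compute' in the Flavor column is an assumption that matches the sample above but may not match every deployment.

```shell
# Hypothetical helper: reads the `openstack server list` table on stdin and
# prints the Name column of every row whose Flavor column contains "compute".
# With '|' as the separator, Name is field 3 and Flavor is field 7.
compute_names() {
    awk -F'|' '$7 ~ /compute/ { gsub(/ /, "", $3); print $3 }'
}

# Possible use on the undercloud, after sourcing stackrc:
#   openstack server list | compute_names
```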

Edit the input map file and fill in all the details. Refer to this page for details about the structure.

vi triliovault_nfs_map_input.yml

Update pyyaml on the undercloud node only

If pip isn't available please install pip on the undercloud.

## On Python3 env 
sudo pip3 install PyYAML==5.1 

## On Python2 env 
sudo pip install PyYAML==5.1

Expand the map file to create a one-to-one mapping of the compute nodes and the nfs shares.

## On Python3 env 
python3 ./generate_nfs_map.py 
 
## On Python2 env 
python ./generate_nfs_map.py

The result will be in file - 'triliovault_nfs_map_output.yml'

Validate output map file

Open the file 'triliovault_nfs_map_output.yml', available in the current directory, and validate that all compute nodes are covered with all the necessary NFS shares.

vi triliovault_nfs_map_output.yml
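
The coverage part of this validation can be sketched as a loop over the compute hostnames. 'missing_hosts' is a hypothetical helper, and it only checks that each hostname appears somewhere in the output map; it makes no assumption about the map's structure and does not verify the NFS shares themselves.

```shell
# Hypothetical helper: $1 = file with one compute hostname per line,
# $2 = generated map file. Prints every hostname that never appears in the
# map as a whole word (so compute-1 is not satisfied by compute-10).
missing_hosts() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        grep -qwF -- "$host" "$2" || printf '%s\n' "$host"
    done < "$1"
}

# Possible use (compute_hosts.txt is an assumed file name):
#   missing_hosts compute_hosts.txt triliovault_nfs_map_output.yml
```

No output means every listed compute node is mentioned in the map.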

Append this output map file to 'triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/environments/trilio_nfs_map.yaml'

grep ':.*:' triliovault_nfs_map_output.yml >> ../redhat-director-scripts/tripleo-train/environments/trilio_nfs_map.yaml

Validate the changes in the file 'triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/environments/trilio_nfs_map.yaml'

Include the environment file (trilio_nfs_map.yaml) in overcloud deploy command with '-e' option as shown below.

-e /home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/environments/trilio_nfs_map.yaml

Ensure that MultiIPNfsEnabled is set to true in trilio_env.yaml file and that nfs is used as backup target.
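
A quick check for this setting can be sketched as follows. 'multi_ip_nfs_enabled' is a hypothetical helper name, and the sketch assumes the parameter appears on its own line as 'MultiIPNfsEnabled: true'.

```shell
# Hypothetical helper: reads trilio_env.yaml on stdin and succeeds only if a
# line sets MultiIPNfsEnabled to true (matched case-insensitively).
multi_ip_nfs_enabled() {
    grep -Eiq '^[[:space:]]*MultiIPNfsEnabled:[[:space:]]*true'
}

# Possible use:
#   multi_ip_nfs_enabled < trilio_env.yaml || echo 'MultiIPNfsEnabled is not true'
```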

5. Fill in triliovault environment details

  • Refer to the old 'trilio_env.yaml' file (/home/stack/triliovault-cfg-scripts-old/redhat-director-scripts/tripleo-train/environments/trilio_env.yaml) and carry its settings over to the new trilio_env.yaml file.

  • Fill in the triliovault details in the file '/home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/environments/trilio_env.yaml'. The environment file is self-explanatory: fill in the backup target details and verify the image URLs and other settings.

For Cohesity NFS, use the following NFS options: nolock,soft,timeo=600,intr,lookupcache=none,nfsvers=3,retrans=10

6. roles_data.yaml file changes

Trilio 4.2 introduces a new, separate triliovault service for the Trilio Horizon plugin. Add the following service to the roles_data.yaml file; it must be co-located with the OpenStack Horizon service.

OS::TripleO::Services::TrilioHorizon
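
The co-location requirement can be sketched as a check over roles_data.yaml. 'roles_missing_trilio_horizon' is a hypothetical helper, and it assumes the common layout in which each role starts with a top-level '- name: <Role>' line and lists its services one per line.

```shell
# Hypothetical helper: reads roles_data.yaml on stdin and prints the name of
# every role that lists the Horizon service but not the TrilioHorizon service.
# Assumes each role begins with an unindented "- name: <Role>" line.
roles_missing_trilio_horizon() {
    awk '
        function emit() { if (role != "" && h && !t) print role }
        /^- name:/                        { emit(); role = $3; h = 0; t = 0; next }
        /Services::Horizon *$/            { h = 1 }
        /Services::TrilioHorizon *$/      { t = 1 }
        END                               { emit() }
    '
}

# Possible use:
#   roles_missing_trilio_horizon < /home/stack/templates/roles_data.yaml
```

No output means every role that runs Horizon also lists the Trilio Horizon service.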

7. Install Trilio on Overcloud

Use the following heat environment files and roles data file in the overcloud deploy command:

  1. trilio_env.yaml: This environment file contains the Trilio backup target details and the Trilio container image locations.

  2. roles_data.yaml: This file contains the overcloud roles data with the Trilio roles added. This file need not be changed; you can use the old roles_data.yaml file.

  3. Use the correct trilio endpoint map file for your keystone endpoint configuration:

    • Instead of 'tls-endpoints-public-dns.yaml', use 'environments/trilio_env_tls_endpoints_public_dns.yaml'.

    • Instead of 'tls-endpoints-public-ip.yaml', use 'environments/trilio_env_tls_endpoints_public_ip.yaml'.

    • Instead of 'tls-everywhere-endpoints-dns.yaml', use 'environments/trilio_env_tls_everywhere_dns.yaml'.
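
The endpoint map substitution in item 3 can be written down as a small lookup. 'trilio_endpoint_env' is a hypothetical helper name; the file names are the ones listed above.

```shell
# Hypothetical helper: given the name of a stock TLS endpoint environment
# file, prints the Trilio endpoint map file to use in its place.
trilio_endpoint_env() {
    case "$1" in
        tls-endpoints-public-dns.yaml)     echo 'environments/trilio_env_tls_endpoints_public_dns.yaml' ;;
        tls-endpoints-public-ip.yaml)      echo 'environments/trilio_env_tls_endpoints_public_ip.yaml' ;;
        tls-everywhere-endpoints-dns.yaml) echo 'environments/trilio_env_tls_everywhere_dns.yaml' ;;
        *) echo "no Trilio endpoint map known for: $1" >&2; return 1 ;;
    esac
}

# Possible use:
#   trilio_endpoint_env tls-endpoints-public-ip.yaml
```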

The deploy command with the triliovault environment file looks like the following.

openstack overcloud deploy --templates \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/overcloud_images.yaml \
  -e /home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/environments/trilio_env.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-tls.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/inject-trust-anchor.yaml \
  -e /home/stack/triliovault-cfg-scripts/redhat-director-scripts/tripleo-train/environments/trilio_env_tls_endpoints_public_dns.yaml \
  --ntp-server 192.168.1.34 \
  --libvirt-type qemu \
  --log-file overcloud_deploy.log \
  -r /home/stack/templates/roles_data.yaml

8. Steps to verify correct deployment

8.1 On overcloud controller node(s)

Make sure the Trilio dmapi and horizon containers are in a running state and that no other Trilio container is deployed on the controller nodes. If the containers are in a restarting state or do not appear in the container listing (for example 'podman ps | grep trilio', as used for the compute nodes in 8.2), the deployment was not done correctly and you need to revisit the above steps.
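
The controller check can be sketched as a filter over the container listing. 'container_is_up' is a hypothetical helper; it assumes the 'podman ps' column layout shown elsewhere in this guide (status text containing 'Up', container name in the last column) and the container names 'trilio_dmapi' and 'horizon' used in the later sections.

```shell
# Hypothetical helper: $1 = container name, stdin = `podman ps` output.
# Succeeds only if a line ends with that name and reports an "Up" status.
container_is_up() {
    awk -v name="$1" '$NF == name && / Up / { found = 1 } END { exit !found }'
}

# Possible use on a controller node:
#   podman ps | container_is_up trilio_dmapi || echo 'trilio_dmapi is not running'
#   podman ps | container_is_up horizon      || echo 'horizon is not running'
```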

8.2 On overcloud compute node(s)

Make sure the Trilio datamover container (shown below) is in a running state and no other Trilio container is deployed on compute nodes. If the containers are in restarting state or not listed by the following command then your deployment is not done correctly. You need to revisit the above steps.

[root@overcloud-novacompute-0 heat-admin]# podman ps | grep trilio
b1840444cc59  prod1-compute1.demo:8787/trilio/tripleo-train-centos7-trilio-datamover:<HOTFIX-TAG-VERSION>-tripleo         kolla_start           5 days ago  Up 5 days ago         trilio_datamover

8.3 On OpenStack node where OpenStack Horizon Service is running

Make sure the horizon container is in a running state. Note that the 'Horizon' container is replaced with the Trilio Horizon container, which contains the latest OpenStack Horizon plus Trilio's Horizon plugin.

[root@overcloud-controller-0 heat-admin]# podman ps | grep horizon
094971d0f5a9  prod1-controller1.demo:8787/trilio/tripleo-train-centos7-trilio-horizon-plugin:<HOTFIX-TAG-VERSION>-tripleo      kolla_start           5 days ago  Up 5 days ago         horizon

9. Troubleshooting if any failures

Trilio components will be deployed using puppet scripts.

If the overcloud deployment fails, the commands below list the errors. The TripleO troubleshooting document also provides valuable insights: https://docs.openstack.org/tripleo-docs/latest/install/troubleshooting/troubleshooting-overcloud.html

openstack stack failures list overcloud
heat stack-list --show-nested -f "status=FAILED"
heat resource-list --nested-depth 5 overcloud | grep FAILED

=> If the trilio datamover api containers do not start or are stuck in a restarting state, use the following logs to debug.


docker logs trilio_dmapi

tailf /var/log/containers/trilio-datamover-api/dmapi.log

 

=> If the trilio datamover containers do not start or are stuck in a restarting state, use the following logs to debug.


docker logs trilio_datamover

tailf /var/log/containers/trilio-datamover/tvault-contego.log

10. Known Issues/Limitations

10.1 Overcloud deploy fails with the following error. Valid for Train CentOS7 only.

This is not a Trilio issue. It is a TripleO issue that is not fixed in Train CentOS7; it is fixed in later versions of TripleO.

Workaround:

Apply the fix directly on the setup; it is not merged in Train CentOS7.

PR: https://github.com/saltedsignal/puppet-certmonger/pull/35/files

The fix must be applied on the controller and compute nodes, in the file /usr/share/openstack-puppet/modules/certmonger/lib/puppet/provider/certmonger_certificate/certmonger_certificate.rb

11. Enable mount-bind for NFS

Note: The steps below are required only if the target backend is NFS.

Please refer to this page for detailed steps to setup mount bind.
