Upgrading on Ansible OpenStack

Upgrading from Trilio 4.1 to a higher version

Trilio 4.1 can be upgraded without reinstallation to a higher version of T4O if available.

Pre-requisites

Please ensure the following points are met before starting the upgrade process:

  • No Snapshot or Restore is running

  • The Global-Job-Scheduler is disabled

  • wlm-cron is disabled on the Trilio Appliance

  • Access to the Gemfury repository to fetch new packages

    Note: For a single IP-based NFS share as the backup target, refer to this rolling upgrade on the Ansible OpenStack document. If multiple IP-based NFS shares are in use, follow the Ansible OpenStack installation document instead.

Deactivating the wlm-cron service

The following set of commands disables the wlm-cron service and verifies that it has been completely shut down.

[root@TVM2 ~]# pcs resource disable wlm-cron
[root@TVM2 ~]# systemctl status wlm-cron
● wlm-cron.service - workload's scheduler cron service
   Loaded: loaded (/etc/systemd/system/wlm-cron.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Jun 11 08:27:06 TVM2 workloadmgr-cron[11115]: 11-06-2021 08:27:06 - INFO - 1...t
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: 140686268624368 Child 11389 ki...5
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: 11-06-2021 08:27:07 - INFO - 1...5
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: Shutting down thread pool
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: 11-06-2021 08:27:07 - INFO - S...l
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: Stopping the threads
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: 11-06-2021 08:27:07 - INFO - S...s
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: All threads are stopped succes...y
Jun 11 08:27:07 TVM2 workloadmgr-cron[11115]: 11-06-2021 08:27:07 - INFO - A...y
Jun 11 08:27:09 TVM2 systemd[1]: Stopped workload's scheduler cron service.
Hint: Some lines were ellipsized, use -l to show in full.
[root@TVM2 ~]# pcs resource show wlm-cron
 Resource: wlm-cron (class=systemd type=wlm-cron)
  Meta Attrs: target-role=Stopped
  Operations: monitor interval=30s on-fail=restart timeout=300s (wlm-cron-monitor-interval-30s)
              start interval=0s on-fail=restart timeout=300s (wlm-cron-start-interval-0s)
              stop interval=0s timeout=300s (wlm-cron-stop-interval-0s)
[root@TVM2 ~]# ps -ef | grep -i workloadmgr-cron
root     15379 14383  0 08:27 pts/0    00:00:00 grep --color=auto -i workloadmgr-cron
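The verification above can be scripted for use in automation. A minimal sketch; the only assumption is that `pcs resource show wlm-cron` prints a line containing `target-role=Stopped` once the resource is disabled, as in the output above:

```shell
# Returns 0 when the pcs output passed as an argument shows the
# wlm-cron resource as stopped (sketch; inspects text only).
wlm_cron_stopped() {
    echo "$1" | grep -q 'target-role=Stopped'
}

# Usage on a live appliance (not run here):
#   wlm_cron_stopped "$(pcs resource show wlm-cron)" || echo "wlm-cron still active"
```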

Update the repositories

Deb-based (Ubuntu)

Add the Gemfury repository on each of the DMAPI containers, Horizon containers & Compute nodes.

Create a file /etc/apt/sources.list.d/fury.list and add the below line to it.

deb [trusted=yes] https://apt.fury.io/triliodata-4-2/ /

The following commands can be used to verify the connection to the Gemfury repository and to check for available packages.

apt-get update
apt list --upgradable
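Creating the source file can be scripted across nodes. A sketch; the target directory is taken as a parameter so the helper can be exercised outside /etc, and on a real node it would be /etc/apt/sources.list.d:

```shell
# Writes the Gemfury apt source entry into the given directory
# (on a real node: write_fury_list /etc/apt/sources.list.d).
write_fury_list() {
    cat > "$1/fury.list" <<'EOF'
deb [trusted=yes] https://apt.fury.io/triliodata-4-2/ /
EOF
}
```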

RPM-based (CentOS)

Add the Trilio repository on each of the DMAPI containers, Horizon containers & Compute nodes.

Modify the file /etc/yum.repos.d/trilio.repo and add the below lines to it.

[triliovault-4-2]
name=triliovault-4-2
baseurl=http://trilio:XpmkpMFviqSe@repos.trilio.io:8283/triliodata-4-2/yum/
gpgcheck=0
enabled=1

The following commands can be used to verify the connection to the Trilio repository and to check for available packages.

yum repolist
yum check-update

Upgrade tvault-datamover-api package

The following steps represent the best practice procedure to upgrade the DMAPI service.

  1. Login to DMAPI container

  2. Take a backup of the DMAPI configuration in /etc/dmapi/

  3. Use apt list --upgradable to identify the package used for the DMAPI service

  4. Update the DMAPI package

  5. Restore the backed-up config files into /etc/dmapi/

  6. Restart the DMAPI service

  7. Check the status of the DMAPI service

These steps are done with the following commands. The example assumes that the more common python3 packages are used.

Deb-based (Ubuntu)

lxc-ls #grep container name for dmapi service
lxc-attach -n <dmapi container name>
tar -czvf dmapi_config.tar.gz /etc/dmapi
apt list --upgradable
apt install python3-dmapi --upgrade
tar -xzvf dmapi_config.tar.gz -C /
systemctl restart tvault-datamover-api
systemctl status tvault-datamover-api

RPM-based (CentOS)

lxc-ls #grep container name for dmapi service
lxc-attach -n <dmapi container name>
tar -czvf dmapi_config.tar.gz /etc/dmapi
yum list installed | grep dmapi  ##use dnf if yum not available
yum check-update python3-dmapi   ##use dnf if yum not available
yum upgrade python3-dmapi        ##use dnf if yum not available
tar -xzvf dmapi_config.tar.gz -C /
systemctl restart tvault-datamover-api
systemctl status tvault-datamover-api
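Step 5 can be double-checked: GNU tar's compare mode (-d) reports any file on disk that differs from the backup archive, so a clean run confirms the configuration was restored intact. A sketch:

```shell
# Compare the restored files against the backup archive created earlier.
# Exit status 0 means the files on disk match the backup.
verify_restore() {
    # $1 = backup archive, $2 = root to compare against (default /)
    tar -dzf "$1" -C "${2:-/}"
}

# Usage on the DMAPI container (not run here):
#   verify_restore dmapi_config.tar.gz || echo "restored config differs from backup"
```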

Upgrade Horizon plugin

The following steps represent the best practice procedure to update the Horizon plugin.

  1. Login to Horizon Container

  2. Use apt list --upgradable to identify the Trilio packages for the workloadmgrclient, contegoclient and tvault-horizon-plugin

  3. Install the tvault-horizon-plugin package in the required python version

  4. Install the workloadmgrclient package

  5. Install the contegoclient

  6. Restart the Horizon webserver

  7. Check the installed version of the workloadmgrclient

These steps are done with the following commands. The example assumes that the more common python3 packages are used.

Deb-based (Ubuntu)

lxc-attach -n controller_horizon_container-ead7cc60
apt list --upgradable
apt install python3-tvault-horizon-plugin --upgrade
apt install python3-workloadmgrclient --upgrade
apt install python3-contegoclient --upgrade
systemctl restart apache2
workloadmgr --version

RPM-based (CentOS)

lxc-attach -n controller_horizon_container-ead7cc60
yum list installed | grep trilio ##use dnf if yum not available
yum upgrade python3-contegoclient-el8 python3-tvault-horizon-plugin-el8 python3-workloadmgrclient-el8  ##use dnf if yum not available
systemctl restart httpd
workloadmgr --version
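Step 2, identifying the Trilio packages in a long package listing, can be narrowed with a simple filter. A sketch operating on the listing text; it matches the three component names used above regardless of the python-version or -el8 prefix/suffix:

```shell
# Filter a package listing (apt list / yum list output on stdin)
# down to the three Trilio Horizon components.
trilio_horizon_pkgs() {
    grep -E 'tvault-horizon-plugin|workloadmgrclient|contegoclient'
}

# Usage (not run here):
#   apt list --upgradable | trilio_horizon_pkgs    # Ubuntu
#   yum list installed | trilio_horizon_pkgs       # CentOS
```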

Upgrade the tvault-contego service

The following steps represent the best practice procedure to update the tvault-contego service on the compute nodes.

  1. Login into the compute node

  2. Take a backup of the config files at /etc/tvault-contego/ and /etc/tvault-object-store/ (if S3)

  3. Unmount storage mount path

  4. Upgrade the tvault-contego package in the required python version

  5. (S3 only) upgrade the s3-fuse-plugin package

  6. Restore the config files

  7. (S3 only) Restart the tvault-object-store service

  8. Restart the tvault-contego service

  9. Check the status of the service(s)

These steps are done with the following commands. The example assumes that the more common python3 packages are used.

NFS as Storage Backend

  • Take a backup of the config files

tar -czvf contego_config.tar.gz /etc/tvault-contego/

  • Check the mount path of the NFS storage using the command df -h and unmount the path using the umount command, e.g.

    [root@compute ~]# df -h
    df: /var/trilio/triliovault-mounts: Transport endpoint is not connected
    Filesystem                     Size  Used Avail Use% Mounted on
    devtmpfs                        28G     0   28G   0% /dev
    tmpfs                           28G     0   28G   0% /dev/shm
    tmpfs                           28G  928K   28G   1% /run
    tmpfs                           28G     0   28G   0% /sys/fs/cgroup
    /dev/mapper/cl_centos8-root     70G   13G   58G  19% /
    /dev/vda1                     1014M  231M  784M  23% /boot
    tmpfs                          5.5G     0  5.5G   0% /run/user/0
    172.25.0.10:/mnt/tvault/42436  2.5T  1.8T  674G  74% /var/triliovault-mounts/MTcyLjI1LjAuMTA6L21udC90dmF1bHQvNDI0MzY=
    [root@compute ~]# umount /var/triliovault-mounts/MTcyLjI1LjAuMTA6L21udC90dmF1bHQvNDI0MzY=
    
  • Upgrade the Trilio packages:

Deb-based (Ubuntu):

apt install python3-tvault-contego --upgrade
apt install python3-s3-fuse-plugin --upgrade

RPM-based (CentOS):

yum upgrade python3-tvault-contego   ##use dnf if yum not available
yum upgrade python3-s3fuse-plugin   ##use dnf if yum not available

  • Restore the config files, restart the service and verify the mount point

tar -xzvf  contego_config.tar.gz -C /
systemctl restart tvault-contego
systemctl status tvault-contego
#To check if backend storage got mounted successfully
df -h
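The final df -h check can be automated by scanning the output for the /var/triliovault-mounts path seen above. A sketch:

```shell
# Succeeds when the df output passed on stdin contains a Trilio mount.
backup_target_mounted() {
    grep -q '/var/triliovault-mounts'
}

# Usage (not run here):
#   df -h | backup_target_mounted || echo "backup target not mounted"
```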

S3 as Storage Backend

  • Take a backup of the config files

tar -czvf contego_config.tar.gz /etc/tvault-contego/ /etc/tvault-object-store/
  • Check the mount path of the S3 storage using the command df -h and unmount the path using the umount command, e.g.

root@compute:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             28G     0   28G   0% /dev
tmpfs           5.5G  1.4M  5.5G   1% /run
/dev/vda3       124G   16G  102G  13% /
tmpfs            28G   20K   28G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            28G     0   28G   0% /sys/fs/cgroup
/dev/vda1       456M  297M  126M  71% /boot
tmpfs           6.3G     0  6.3G   0% /var/triliovault/tmpfs
Trilio        -     -  0.0K    - /var/triliovault-mounts
tmpfs           5.5G     0  5.5G   0% /run/user/0
[root@compute ~]# umount /var/triliovault-mounts

  • Upgrade the Trilio packages

Deb-based (Ubuntu):

apt install python3-tvault-contego --upgrade
apt install python3-s3-fuse-plugin --upgrade

RPM-based (CentOS):

yum upgrade python3-tvault-contego   ##use dnf if yum not available
yum upgrade python3-s3fuse-plugin   ##use dnf if yum not available

  • Restore the config files, restart the service and verify the mount point

tar -xzvf  contego_config.tar.gz -C /
systemctl restart tvault-contego
systemctl status tvault-contego
#To check if backend storage got mounted successfully
df -h

Advanced settings/configuration

Customize HAProxy cfg parameters for the Trilio datamover API service

The following HAProxy cfg parameters are recommended for optimal performance of the DMAPI service. The file is located on the controller at /etc/haproxy/haproxy.cfg.

retries 5
timeout http-request 10m
timeout queue 10m
timeout connect 10m
timeout client 10m
timeout server 10m
timeout check 10m
balance roundrobin
maxconn 50000

If values were already updated during any of the previous releases, further steps can be skipped.

Parameters timeout client, timeout server, and balance for DMAPI service

Remove the below content, if present, from the file /etc/openstack_deploy/user_variables.yml on the ansible host.

haproxy_extra_services:
  - service:
      haproxy_service_name: datamover_service
      haproxy_backend_nodes: "{{ groups['dmapi_all'] | default([]) }}"
      haproxy_ssl: "{{ haproxy_ssl }}"
      haproxy_port: 8784
      haproxy_balance_type: http
      haproxy_backend_options:
        - "httpchk GET / HTTP/1.0\\r\\nUser-agent:\\ osa-haproxy-healthcheck"

Add the below lines at the end of the file /etc/openstack_deploy/user_variables.yml on the ansible host.

# Datamover haproxy setting
haproxy_extra_services:
  - service:
      haproxy_service_name: datamover_service
      haproxy_backend_nodes: "{{ groups['dmapi_all'] | default([]) }}"
      haproxy_ssl: "{{ haproxy_ssl }}"
      haproxy_port: 8784
      haproxy_balance_type: http
      haproxy_balance_alg: roundrobin
      haproxy_timeout_client: 10m
      haproxy_timeout_server: 10m
      haproxy_backend_options:
        - "httpchk GET / HTTP/1.0\\r\\nUser-agent:\\ osa-haproxy-healthcheck"

Update the HAProxy configuration using the below commands on the ansible host.

cd /opt/openstack-ansible/playbooks
openstack-ansible haproxy-install.yml
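After rerunning the playbook, the rendered configuration can be checked against the recommended values listed above. A sketch that reports any recommended parameter missing from the file passed on stdin:

```shell
# Report which of the recommended DMAPI parameters are missing from the
# HAProxy configuration passed on stdin.
check_dmapi_params() {
    cfg=$(cat)
    for p in 'retries 5' 'timeout client 10m' 'timeout server 10m' \
             'balance roundrobin' 'maxconn 50000'; do
        echo "$cfg" | grep -q "$p" || echo "missing: $p"
    done
}

# Usage on the controller (not run here):
#   check_dmapi_params < /etc/haproxy/haproxy.cfg
```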

Enable mount-bind for NFS

T4O 4.2 has changed the calculation of the mount point. It is necessary to set up a mount-bind to make T4O 4.1 or older backups available to T4O 4.2.

Please follow this documentation for detailed steps to set up mount bind.
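For reference, the mount-point directory name is the Base64 encoding of the NFS share string, as visible in the df output earlier (/var/triliovault-mounts/MTcyLjI1... decodes to 172.25.0.10:/mnt/tvault/42436). A sketch deriving the path for a given share; the base directory /var/triliovault-mounts is taken from that output:

```shell
# Derive the T4O 4.2 mount directory for an NFS share: the directory
# name is the Base64 encoding of the share string (GNU coreutils base64).
mount_dir_for_share() {
    echo "/var/triliovault-mounts/$(printf '%s' "$1" | base64 -w0)"
}

# Example with the share from the df output above:
#   mount_dir_for_share 172.25.0.10:/mnt/tvault/42436
```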
