Trilio 4.1 HF1 Release Notes
Release Versions
Packages
| Package | Type | Version |
| --- | --- | --- |
| s3fuse | python package | 4.1.94.3 |
| tvault-configurator | python package | 4.1.94.3 |
| workloadmgr | python package | 4.1.94.3 |
| workloadmgrclient | python package | 4.1.94 |
| dmapi | deb package | 4.1.94.3 |
| python3-dmapi | deb package | 4.1.94.3 |
| tvault-contego | deb package | 4.1.94.3 |
| python3-tvault-contego | deb package | 4.1.94.3 |
| tvault-horizon-plugin | deb package | 4.1.94.3 |
| python3-tvault-horizon-plugin | deb package | 4.1.94.3 |
| s3-fuse-plugin | deb package | 4.1.94.3 |
| python3-s3-fuse-plugin | deb package | 4.1.94.3 |
| workloadmgr | deb package | 4.1.94.3 |
| workloadmgrclient | deb package | 4.1.94 |
| dmapi | rpm package | 4.1.94.3-4.1 |
| python3-dmapi | rpm package | 4.1.94.3-4.1 |
| tvault-contego | rpm package | 4.1.94.3-4.1 |
| python3-tvault-contego | rpm package | 4.1.94.3-4.1 |
| tvault-horizon-plugin | rpm package | 4.1.94.3-4.1 |
| python3-tvault-horizon-plugin-el8 | rpm package | 4.1.94.3-4.1 |
| python-s3fuse-plugin-cent7 | rpm package | 4.1.94.3-4.1 |
| python3-s3fuse-plugin | rpm package | 4.1.94.3-4.1 |
| workloadmgrclient | rpm package | 4.1.94 |
Containers and Gitbranch
| Component | Branch / Tag |
| --- | --- |
| Gitbranch | hotfix-1-TVO/4.1 |
| RHOSP13 containers | 4.1.94-hotfix2-rhosp13 |
| RHOSP16.0 containers | 4.1.94-hotfix-2-rhosp16 |
| RHOSP16.1 containers | 4.1.94-hotfix-2-rhosp16.1 |
| Kolla Ansible Ussuri containers | 4.1.94-hotfix-2-ussuri |
Changelog
Enhancements
Support for passwordless SMTP servers
Trilio for Openstack can send notification emails when backup or restore jobs succeed or fail. Previously, the SMTP server configuration required a password for the SMTP user.
A password is no longer required when the SMTP server does not need one.
Support for Multi-Attach Volumes
Cinder supports a Volume Type that allows the same Volume to be attached to multiple instances simultaneously. Backups and restores of Volumes of this type previously failed. Trilio for Openstack now provides basic support for this Volume Type.
Cinder Boot Volumes with Multi-Attach activated are not yet supported.
This support covers only the backup and restoration of Multi-Attach Volumes. For now, Trilio handles such a Volume like any single-attached Volume. For example, a Multi-Attach Volume connected to 2 VMs is backed up and restored twice.
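The "backed up twice" behaviour can be illustrated with a minimal sketch. It assumes backups are enumerated per VM attachment without de-duplication; the function name and the data layout are hypothetical and not taken from the Trilio code base.

```python
def volumes_to_back_up(attachments):
    """Return one (vm, volume_id) pair per attachment.

    `attachments` maps a VM name to the list of volume IDs attached to it.
    No de-duplication is performed, so a volume attached to two VMs
    appears twice in the result.
    """
    return [(vm, vol) for vm, vols in attachments.items() for vol in vols]


# Hypothetical example: one shared volume attached to two VMs.
pairs = volumes_to_back_up({"vm-1": ["vol-shared"], "vm-2": ["vol-shared"]})
```

Here `pairs` holds two entries for `vol-shared`, one per VM, which mirrors the example in the note above.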
Increased the timeout for data transfer
Trilio for Openstack tracks the progress of backups and restores using a tracking file. If this file is not updated within a defined timeframe, the Trilio data transfer fails. This timeframe has been extended from 10 minutes to 20 minutes.
This value will become configurable in T4O 4.1 SP1.
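The staleness check described above can be sketched as follows. This is a minimal illustration that assumes the check is based on the file's modification time; the function name and constant are hypothetical, and the actual Trilio implementation may work differently.

```python
import os
import time

# Assumed window, taken from the note above: 20 minutes without an update.
STALE_AFTER_SECONDS = 20 * 60


def transfer_is_stalled(tracking_file, now=None, stale_after=STALE_AFTER_SECONDS):
    """Return True if the tracking file was not modified within the window."""
    if now is None:
        now = time.time()
    last_update = os.path.getmtime(tracking_file)
    return (now - last_update) > stale_after
```

Passing `now` explicitly keeps the function easy to test; in normal use it defaults to the current time.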
Misleading and expected errors are no longer shown in logs by default
Trilio for Openstack logged many system error messages that were expected and handled internally, without any impact on the functions of the solution.
These error messages were misleading during normal troubleshooting and are no longer logged by default. They can be reactivated by enabling the debug mode for logging.
Image upload timeout window has been increased and made configurable
The upload of Trilio backups is limited in time to prevent stalling workloads with a stuck upload process.
This timeout window has been increased from 10 hours to 48 hours by default and is configurable in the workloadmgr config file.
Restart wlm-workloads after setting this value.
Grace time to retrigger a Snapshot in case of deactivated Global Job Scheduler
When the Global Job Scheduler is deactivated, no backups are triggered. The Global Job Scheduler contains a grace time for missed Snapshots. All Snapshots that were supposed to be triggered within this grace time before activation of the Global Job Scheduler are retriggered.
This grace period is now configurable in the workloadmgr config file.
After setting this value, restart the wlm-cron service.
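The grace-time rule can be sketched as a simple window check. This is an illustration only: the function name is hypothetical, and treating both window boundaries as inclusive is an assumption, not documented behaviour.

```python
from datetime import datetime, timedelta


def snapshots_to_retrigger(scheduled_times, activated_at, grace):
    """Return the missed scheduled times that fall inside the grace window.

    The window covers [activated_at - grace, activated_at]; inclusive
    boundaries are an assumption made for this sketch.
    """
    window_start = activated_at - grace
    return [t for t in scheduled_times if window_start <= t <= activated_at]


# Hypothetical example: scheduler reactivated at 12:00 with a 30-minute grace time.
activated = datetime(2021, 1, 1, 12, 0)
missed = snapshots_to_retrigger(
    [datetime(2021, 1, 1, 11, 0), datetime(2021, 1, 1, 11, 45), datetime(2021, 1, 1, 12, 30)],
    activated,
    timedelta(minutes=30),
)
```

In this example only the 11:45 Snapshot lies inside the window; the 11:00 Snapshot is too old and the 12:30 Snapshot is still in the future.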
Increased the number of dmapi workers and made it configurable
In heavily used environments, the dmapi workers were identified as a potential bottleneck. The default number of workers used by a dmapi service has been increased to 16 and is configurable in the dmapi config file.
Restart the dmapi service after setting the configuration manually.
The upgrade process of RHOSP and Kolla Ansible will automatically set this value.
Added new settings in haproxy configuration for dmapi
Response times in heavily used environments might be slow, causing the dmapi service to time out in the haproxy connection. The default values of haproxy are not always suitable in that case.
The haproxy configuration for the dmapi service has been extended to the following values.
Restart the haproxy service after setting the configuration manually.
The upgrade process of RHOSP and Kolla Ansible will automatically set these values.
Added retries for Neutron API calls
In some environments, the Openstack Neutron service is so heavily used that API calls time out. Previously, Trilio backups and restores failed when any Neutron API call timed out.
Trilio will now retry Neutron API calls three times before failing a backup or restore.
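A retry wrapper of this kind can be sketched as below. The sketch assumes three attempts in total and that a timeout surfaces as `TimeoutError`; both the function name and these choices are illustrative assumptions, not the Trilio implementation.

```python
def call_with_retries(api_call, attempts=3, retry_on=(TimeoutError,)):
    """Call api_call() up to `attempts` times, retrying only on timeouts.

    The exception of the final attempt is re-raised so the caller can
    fail the backup or restore.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return api_call()
        except retry_on:
            if attempt == attempts:
                raise
```

Any other exception type is not caught and propagates immediately, so only timeouts are retried.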
Added retries and rescan for mounting temporary Volumes using Cinder Storages with multipathing activated
It was observed in multipathing environments that sometimes backups failed due to errors with the temporary Cinder Volumes during the following actions:
Create Cinder Volume out of Cinder Snapshot
Mount Cinder Volume to Compute Node
Unmount Cinder Volume from Compute Node
Delete Cinder Volume
During these operations in multipath environments, errors are now handled by rescanning the connected devices and retrying the internal commands.
The number of retries is configurable in the tvault-contego config file.
Restart the tvault-contego service after manually setting the value.
The upgrade process of RHOSP and Kolla Ansible will automatically set these values.
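The rescan-and-retry handling can be sketched as a retry loop with a recovery step between attempts. The function and parameter names are hypothetical, and treating failures as `OSError` is an assumption made for this illustration.

```python
def run_with_rescan(operation, rescan, retries=3):
    """Run operation(); on failure, rescan devices and try again.

    `operation` stands for one of the volume actions listed above (for
    example mounting a volume) and `rescan` for the device rescan.
    Both are caller-supplied callables in this sketch.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except OSError:
            if attempt == retries:
                raise  # give up after the last attempt
            rescan()   # refresh the connected devices before retrying
```

The rescan runs only between attempts, so a call that succeeds on the first try triggers no rescan.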
Enhanced logging of Trilio for Openstack GUI
The Trilio for Openstack GUI uses the admin account to secure access to the features and functionalities located on the Trilio appliance. The following events are now logged by the Trilio Appliance:
Login attempts
Logout events
Password changes for the admin user
Added text banner to Trilio for Openstack GUI login page
The Trilio for Openstack login page can now be extended with a text banner. This text banner is configurable on the Trilio appliance by editing the banner yaml file located under:
/etc/tvault-config/banner.yaml
The content of the file looks as follows:
Restart tvault-config after changing the banner to activate it.
Fixed Bugs and issues
Local job scheduler stays disabled after workload creation with enabled job scheduler
An issue got fixed in which, on rare occasions, the local job scheduler of a single workload was disabled despite the workload having been created with an enabled job scheduler.
Global Job Scheduler is shown as active even when wlm-cron service is down
An issue got fixed in which the Global Job Scheduler returned enabled or disabled even when the wlm-cron service was deactivated. The status returned in this scenario is now an error message showing the wlm-cron service status.
Wrong documentation link in the Trilio Appliance
The documentation link available inside the Trilio Appliance was still pointing to the old outdated documentation webpage. The link has been updated to point towards the correct documentation.
Restore VMs into different Availability Zone failed
An issue got fixed, which prevented the restoration of VMs into different Availability Zones in the case of the original Availability Zone no longer being available.
Stopping workloadmgr service and all ongoing worker tasks
An issue got fixed, which led to stale service jobs being left behind upon restart of the workloadmgr services.
Mounting of LVM configured disks into the FRM failed
An issue got fixed, which prevented the correct mounting and access of Volumes partitioned and configured by LVM.
Upload of Backups failed intermittently using S3
An issue got fixed, which led to a race condition between upload threads in the S3 fuse plugin and caused backups to fail during the upload phase.
Multipathing not enabled by default in the Data-Mover container used in RHOSP and Kolla Ansible Openstack
An issue got fixed, which led to multipathing not being enabled in the Data-Mover container used by RHOSP and Kolla Ansible.
Upgrading to 4.1 HF1 will automatically activate multipathing where feasible.
An inaccessible S3 mount point got created when the S3 endpoint is not available during deployment or configuration
An issue got fixed, which led to the creation of a Trilio mount point even when the provided S3 backup target was not reachable during deployment or configuration.
The deployment will still succeed, but the tvault-object-store service will be in a failed state.
Trustee role not correctly inherited from user groups to users
An issue got fixed, which prevented the detection of the Trilio trustee role for a user, who had this role inherited from a user group.
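The inheritance rule can be sketched as the union of a user's direct roles and the roles of every group the user belongs to. The data layout and function names below are hypothetical and do not reflect the Keystone or Trilio APIs.

```python
def effective_roles(user, direct_roles, group_roles, memberships):
    """Return the user's direct roles plus all roles inherited from groups.

    direct_roles: user -> iterable of role names
    group_roles:  group -> iterable of role names
    memberships:  user -> iterable of group names
    """
    roles = set(direct_roles.get(user, ()))
    for group in memberships.get(user, ()):
        roles |= set(group_roles.get(group, ()))
    return roles


def has_trustee_role(user, trustee_role, direct_roles, group_roles, memberships):
    """Return True if the trustee role is assigned directly or via a group."""
    return trustee_role in effective_roles(user, direct_roles, group_roles, memberships)
```

A check that looks only at `direct_roles` would miss a user whose trustee role comes from a group, which is the situation the fix above addresses.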
The configurator always trying to use Keystone internal endpoint
An issue got fixed, during which the chosen endpoint type did not get honored for Keystone and the configurator always reached out to the Keystone internal endpoint.
The following value can be set in the api-paste ini file located under:
/etc/workloadmgr/api-paste.ini
Afterward, the wlm-workloads service needs to be restarted.
It is recommended to reconfigure the appliance to activate the fix.
Disk integrity check failing with a false positive
An issue got identified, which leads to the disk integrity check failing although there is no data loss.
For now, Snapshots with a failed disk integrity check no longer fail; instead, a warning about the failed disk integrity check is written to the log files.
A complete fix of the root cause is planned for 4.1 SP1.
Workload creation and Backup creation failed due to Latin characters like á
An issue got identified in which Latin characters like á caused Workload creation or a backup to fail.
This hotfix implements support of Latin characters for the following:
Calendar shown and used during workload creation for the job scheduler
Name and description of security groups
Full support for Latin characters comes in 4.1 SP1.
Backups for multipath environments using the FC protocol failed
An issue got fixed, which prevented successful backups in environments using multipathing with the FC storage protocol.