2024 Series Release Notes
ks2024.4-rc1
New Features
Enables an OOM score adjustment for all deployed Docker containers and the services inside them. The default of -500 can be changed via the docker_oom_score_adj global variable.
Adds an AppArmor profile for the Prometheus node exporter container when deployed to Ubuntu with AppArmor enabled and when enable_prometheus_node_exporter is set to “yes”.
Known Issues
Podman/systemd are not affected by this commit. WIP.
Bug Fixes
Fixes the connection to the D-Bus socket from the Prometheus node exporter container when the systemd collector is enabled.
ks2024.3
New Features
Adds support for deploying the Prometheus Consul exporter as part of the prometheus monitoring exporters stack.
Deploys and configures a prometheus-multipath-exporter image as part of the Prometheus monitoring stack, based on multipath_exporter (https://github.com/newrushbolt/multipath-exporter/). To deploy it, use the new parameter enable_prometheus-multipath-exporter (default: “no”). The new group prometheus-multipath-exporter should be added to the inventory.
Adds a template and job for the blackbox exporter to monitor HTTP connectivity to the nova-api endpoint from all instances of the blackbox exporter. Add every inventory host group from which you want to check connectivity as a child of the “prometheus-blackbox-exporter” group. For example:
    [prometheus-blackbox-exporter:children]
    monitoring
    compute
Bug Fixes
Fixes an issue during fact gathering when using the --limit argument where a host that fails to gather facts could cause another host to fail during delegated fact gathering.
Fixes 2067036. Adds octavia_interface_wait_timeout to control the octavia-interface.service timeout, so that the unit can wait until the openvswitch agent sync has finished and octavia-lb-net is reachable from the host. Also sets the restart policy for this unit to on-failure. LP#2067036
ks2024.3-rc1
Bug Fixes
Fixes parsing of JSON output of inner modules called by kolla-toolbox when data was returned on standard error. LP#2080544
16.7.0
New Features
Modifies public API firewalld rules to be applied immediately to a running firewalld service. This requires firewalld to be running, but avoids reloading firewalld, which is disruptive due to the way in which firewalld builds its firewall chains.
Added a command to upgrade to a target version of RabbitMQ. This is required before a SLURP upgrade. See the docs for more details: https://docs.openstack.org/kolla-ansible/latest/reference/message-queues/rabbitmq.html#slurp
Bug Fixes
Fixes handling of openvswitch on manila-share nodes. LP#1993285
Fixes behaviour of Change Password screen in Horizon until bug #2073639 is resolved. LP#2073159
Fixes the Python requests library issue when using custom CA by adding the REQUESTS_CA environment variable to the kolla-toolbox container. See LP#1967132
Fixes configuration of CloudKitty when internal TLS is enabled. LP#1998831
Fixes the dimensions comparison when values such as 1g are set in the container dimensions configuration. Previously the Docker container was restarted even when nothing had changed, because the configured value 1g was compared with 1073741824, the value reported by docker inspect.
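The mismatch described above can be illustrated with a minimal sketch that normalises both values to bytes before comparing them. This is not the kolla-ansible implementation; the helper names to_bytes and dimensions_differ are hypothetical, and the suffix handling assumes binary multipliers (1g = 1024**3 bytes), which matches the 1g / 1073741824 pair quoted in the note.

```python
import re

# Binary multipliers for size suffixes; an assumption for this sketch.
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def to_bytes(value):
    """Convert a size such as '1g' or 1073741824 to an integer byte count."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"\s*(\d+)\s*([bkmgt])?b?\s*", str(value).lower())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit or "b"]


def dimensions_differ(configured, inspected):
    """Return True only when the two sizes differ after normalisation."""
    return to_bytes(configured) != to_bytes(inspected)
```

With this normalisation, a configured 1g and an inspected 1073741824 compare as equal, so no restart would be triggered by the comparison alone.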
Fixes the detection of the Nova Compute Ironic service when a custom host option is set in the service config file. See LP#2056571
Removes the default /tmp/ mountpoint from the horizon container. This change is made to harden the container and prevent potential security issues. For more information, see the Bug Report: LP#2068126.
Fixes an issue where OVN northbound or southbound database deployment could fail when a new leader is elected. LP#2059124
ks2024.2.5
Upgrade Notes
Changes the variable grafana_security_ldap_group_dn from a single role to a multiple-role configuration. This change is not reversible for the LDAP configuration.
Adds LDAP configuration for OpenSearch in templates and tasks. Example LDAP configuration for OpenSearch in globals.yml:
    enable_opensearch_ldap: "yes"  # enable LDAP
    opensearch_internal_auth: "yes"  # enable internal users
    opensearch_admin_tls: "CN=backend.vm.lab.itkey.com"  # TLS FQDN in the certificate
    opensearch_ldap_hosts:  # list of LDAP or AD servers
      - "win19.test.keystack.com:389"
    opensearch_ldap_bind_dn: "CN=adadmin,CN=Users,DC=test,DC=keystack,DC=com"  # DN of the user for the LDAP connection
    opensearch_ldap_password: "P@ssw0rd"
    opensearch_ldap_userbase: "CN=Users,DC=test,DC=keystack,DC=com"
    opensearch_ldap_rolebase: "CN=Builtin,DC=test,DC=keystack,DC=com"
    opensearch_ldap_verify_hostnames: "true"
    opensearch_ldap_enable_ssl_client_auth: "false"
    opensearch_ldap_enable_start_tls: "false"
    opensearch_ldap_enable_ssl: "false"
    fluentd_opensearch_password: "P@ssw0rd"
    fluentd_opensearch_user: "adadmin"
Bug Fixes
Fixes deployment of OpenSearch with TLS enabled on the internal VIP.
Updates the default Grafana OpenSearch datasource configuration to use values for OpenSearch that work out of the box. Replaces the Elasticsearch values that were previously being used. The new configuration can be applied by deleting your datasource and reconfiguring Grafana through kolla ansible. In order to prevent dashboards from breaking when the datasource is deleted, one should use datasource variables in Grafana. See bug 2039500.
16.6.0
Upgrade Notes
MariaDB backup now uses the same image as the running MariaDB server. The following variables relating to MariaDB backups are no longer used and have been removed:
mariabackup_image
mariabackup_tag
mariabackup_image_full
Bug Fixes
Adds conditionals for IPv6 sysctl settings on systems that have IPv6 disabled in the kernel. Changing sysctl settings related to IPv6 on those systems leads to errors. LP#1906306
Fixes trove module imports. Path to the modules needed by trove-api changed in source trove package so the configuration was updated. LP#1937120
Fixes ovs-dpdk images pull. LP#2041864
Fixes configuration of nova-compute and nova-compute-ironic, that will enable exposing vendordata over configdrive. LP#2049607
Modifies the MariaDB backup procedure to use the same container image as the running MariaDB server container. This should prevent compatibility issues that may cause the backup to fail.
Fixes a bug where Nova and Cinder would not register the required keystone service role if keystone tags are skipped. LP#2049762
Fixes the ‘cinder-backup’ service when Swift is used with TLS enabled. LP#2051986
Fixes an idempotency issue in the OpenSearch upgrade tasks where subsequent runs of kolla-ansible upgrade would leave shard allocation disabled. LP#2049512
Fixed an issue where the MariaDB Cluster recovery process would fail if the sequence number was not found in the logs. The recovery process now checks the complete log file for the sequence number and recovers the cluster. See LP#1821173 for details.
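As an illustration of checking the complete log for the sequence number, the sketch below scans a whole log text and returns the last recovered position it finds. It is not the code used by the recovery process. The function name find_seqno is hypothetical, and the log line format ("WSREP: Recovered position: <uuid>:<seqno>") is an assumption based on typical Galera recovery output.

```python
import re

# Assumed Galera recovery log format: "WSREP: Recovered position: <uuid>:<seqno>".
RECOVERED = re.compile(r"WSREP: Recovered position: [0-9a-fA-F-]+:(-?\d+)")


def find_seqno(log_text):
    """Search the complete log text and return the last seqno, or None."""
    matches = RECOVERED.findall(log_text)
    return int(matches[-1]) if matches else None
```

Returning None when nothing matches lets the caller decide how to handle a node whose log carries no recovered position, instead of failing inside the search.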
A precheck has been added to catch when om_enable_rabbitmq_quorum_queues is set to True, but quorum queues have not been configured on all appropriate queues. A manual migration is required, see here for details: https://docs.openstack.org/kolla-ansible/latest/reference/message-queues/rabbitmq.html#high-availability LP#2045887
All stable RabbitMQ feature flags are now enabled during deployments, reconfigures, and upgrades. As such, the variable rabbitmq_feature_flags is no longer required. This is a partial fix to RabbitMQ SLURP support. LP#2049512
Fixes an issue where the Keystone admin endpoint would be recreated when upgrading Keystone. The endpoint is now explicitly removed during the upgrade process.
Fixes skyline’s old format of stop task. It used docker_container which would cause problems with podman deployments.
ks2024.2
New Features
Adds CADF audit events support for the adminUI service. To enable it, set enable_cadf_audit: “yes” in your region’s globals.d file.
Added support for the following scenarios of VictoriaMetrics cluster deployment:
VictoriaMetrics as a drop-in replacement for Prometheus
VictoriaMetrics as remote storage for Prometheus
To deploy VictoriaMetrics as a drop-in replacement for Prometheus, use the parameter enable_victoriametrics (default: “no”). The group victoriametrics should be added to the inventory. vmagent, vmalert, and the cluster versions of vminsert, vmselect and vmstorage will be deployed.
To deploy VictoriaMetrics as remote storage for Prometheus, use the parameter enable_prometheus_backend_victoriametrics (default: “no”). The group victoriametrics should be added to the inventory. Only the cluster versions of vminsert, vmselect and vmstorage will be deployed.
A VictoriaMetrics cluster supports multiple isolated tenants (aka namespaces). Tenants are identified by an accountID, which is placed inside request URLs. The parameter victoriametrics_account_id (default: “0”) sets the accountID for the cluster deployment.
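To show how an accountID ends up inside request URLs, here is a small sketch that builds tenant-scoped base URLs. The path layout (/insert/<accountID>/prometheus and /select/<accountID>/prometheus) is the one described in the VictoriaMetrics cluster documentation and should be checked against the deployed version. The function name, host names and ports are hypothetical and are not taken from this release.

```python
def tenant_url(base_url, component, account_id="0", suffix="prometheus"):
    """Build a tenant-scoped URL for vminsert ('insert') or vmselect ('select')."""
    if component not in ("insert", "select"):
        raise ValueError(f"unknown component: {component!r}")
    return f"{base_url.rstrip('/')}/{component}/{account_id}/{suffix}"


# Hypothetical endpoints; the account ID mirrors victoriametrics_account_id.
write_base = tenant_url("http://vminsert.example:8480", "insert", "0")
read_base = tenant_url("http://vmselect.example:8481", "select", "0")
```

Changing the accountID changes only the path segment, which is how tenants stay isolated while sharing the same cluster endpoints.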
Added support for replicating collected metrics to remote (central) Prometheus-compatible storage systems via the Prometheus remote_write protocol or the VictoriaMetrics remote_write protocol, depending on the source and target system (see https://docs.victoriametrics.com/). Use the parameter prometheus_central_url (default: “”) to set the URL of the remote storage system.
The hosts variable has been added to adminui-backend-osloconf.conf. This variable is a list of memcached hosts for the admin_ui_backend.
The list_of_service_domains variable has been added to adminui-backend-osloconf.conf. This variable is a list of service domains for the admin_ui_backend. Domains from this list are not displayed on login page
To avoid cadf events being written to rabbitmq durable queue, when other services are using non-durable queue, oslo_messaging_rabbit section options have been added to the drs service.
Added the ability to copy storage vendor configs for cinder-volume. To use custom vendor configs, put them into /etc/kolla/config/cinder/cinder-volume/vendor-configs. Configs can be templates and can use kolla-ansible variables.
Implements the hardreboot fence method for HA. The bmc_power_fence_mode variable has been added to control the fence method. Possible values:
* poweroff (default)
* hardreboot
Added the ability to use a configuration with 3 glance-api nodes when using cinder as a backend
Upgrade Notes
Adds LDAP configuration for Grafana in templates and tasks. Example variables for configuring LDAP in ldap.toml for Grafana:
    grafana_security_ldap_enabled: "yes"
    grafana_security_ldap_host: "mydomain.com"
    grafana_security_ldap_port: "389"
    grafana_security_ldap_use_ssl: "false"
    grafana_security_ldap_start_tls: "false"
    grafana_security_ldap_ssl_skip_verify: "true"
    grafana_security_ldap_bind_dn: "cn=adadmin,cn=Users,dc=mydomain,dc=com"
    grafana_security_ldap_bind_password: "mypassword"
    grafana_security_ldap_search_filter: "(sAMAccountName=%s)"
    grafana_security_ldap_search_base_dns: "dc=mydomain,dc=com"
    grafana_security_ldap_group_dn: "cn=Domain Admins,cn=Users,dc=mydomain,dc=com"
To change the role mapping and groups, inspect the “servers.group_mappings” section in ldap.toml.
Bug Fixes
Updates the default Grafana Prometheus datasource configuration to connect to the Prometheus server when enable_prometheus is set to “yes”, and to VictoriaMetrics when enable_prometheus is set to “no”. The VictoriaMetrics datasource type will be provisioned if enable_victoriametrics is set to “yes” and grafana_victoriametrics_datasource_type == 'victoriametrics'. The default value of grafana_victoriametrics_datasource_type has changed to prometheus.
Fixes an issue where the HAProxy configuration of the Prometheus exporters was not applied when enable_prometheus is set to “no” and enable_victoriametrics is set to “yes”.
ks2024.1.2-requests-fix
New Features
Added support for building containers with kolla-ansible on Sberlinux
Adds enable_cadf_audit to support CADF audit events in the nova, cinder, neutron, glance, heat and drs services. Adds the api-paste.ini and api_audit_map.conf files and the audit_middleware_notifications config block.
Deploys and configures a prometheus-ovs-exporter image as part of the Prometheus monitoring stack.
Adds AppArmor support for Libvirt on Ubuntu.
Switch adminui-backend to WSGI running under Apache.
Added configuration options to enable backend TLS encryption from HAProxy to the Adminui Backend service.
The Glance, Cinder and Manila services now support configuration of multiple Ceph cluster backends. For Nova and Gnocchi it is possible to configure different Ceph clusters: for Gnocchi at the service level, and for Nova at the host level. See the external Ceph guide for details on how to set multiple Ceph backends.
Allow overriding of Skyline configuration files by supplying your own version of nginx.conf for Skyline Console, gunicorn.py and skyline.yaml for Skyline API Server. Place the files in the skyline subfolder of your Kolla config directory, skyline.yaml will be merged with the Kolla provided version.
Upgrade Notes
The default values of the following Ceph keyring variables have changed:
ceph_cinder_keyring: from “ceph.client.cinder.keyring” to “client.{{ ceph_cinder_user }}.keyring”
ceph_cinder_backup_keyring: from “ceph.client.cinder-backup.keyring” to “client.{{ ceph_cinder_backup_user }}.keyring”
ceph_glance_keyring: from “ceph.client.glance.keyring” to “client.{{ ceph_glance_user }}.keyring”
ceph_manila_keyring: from “ceph.client.manila.keyring” to “client.{{ ceph_manila_user }}.keyring”
ceph_gnocchi_keyring: from “ceph.client.gnocchi.keyring” to “client.{{ ceph_gnocchi_user }}.keyring”
Users who have overridden the default values of the above variables have to change them according to the new pattern.
16.5.0
Upgrade Notes
If credentials are updated in passwords.yml, kolla-ansible is now able to update these credentials in the keystone database and in the on-disk config files.
The changes to passwords.yml are applied once kolla-ansible -i INVENTORY reconfigure has been run.
If you want to revert to the old behavior (credentials not automatically updating during reconfigure if they changed in passwords.yml), you can specify this by setting update_keystone_service_user_passwords: false in your globals.yml.
Notice that passwords are only changed if you change them in passwords.yml. This mechanism is not a complete solution for automatic credential rollover. No passwords are changed if you do not change them inside passwords.yml.
Bug Fixes
Fixes mariadb role deployment when using Ansible check mode. LP#2052501
Updated configuration of service user tokens for all Nova and Cinder services to stop using admin role for service_token and use service role. See LP#2004555 and LP#2049762 for more details.
Add Keystone Service role. Keystone is creating service in bootstrap since Bobcat. Service role is needed for SLURP to work from Antelope. This role is also needed in Antelope and Zed for Cinder for proper service token support. LP#2049762
Changes to service user passwords in passwords.yml will now be applied when reconfiguring services. This behaviour can be reverted by setting update_keystone_service_user_passwords: false. Fixes LP#2045990
ks2024.1
Prelude
Support for PowerDNS includes a new database named powerdns in the existing Galera cluster. The name can be overridden with the variable designate_database_pdns_name.
New Features
Adds noVNC TLS support. To enable it, add the option nova_qemu_vnc_tls: “yes” to the region’s globals.
Added the new component prometheus-alertmanager-hpsm. It allows you to send alerts from Alertmanager to Zabbix.
Adds the Autoevacuate (consul role) feature.
Added support for the PowerDNS backend. Use the option designate_backend: “pdns4” in globals if you want PowerDNS as the backend.
Add ansible role for DRS aka Openstack Load Leveller aka Dynamic Load Rebalancer.
The new command kolla-ansible rabbitmq-reset-state has been added. It force-resets the state of RabbitMQ. This is primarily designed to be used when enabling HA queues, see docs: https://docs.openstack.org/kolla-ansible/latest/reference/message-queues/rabbitmq.html#high-availability
Add an external prometheus exporter for rabbitmq to complement the internal one, as they don’t fully overlap on the metrics that they collect.
Updates apache grok pattern to match the size of response in bytes, time taken to serve the request and user agent.
Enables credentials authentication for adminui. A login and password, as well as a token, can now be used to authenticate to the adminui console.
You can now enable the usage of quorum queues in RabbitMQ for all services by setting the variable om_enable_rabbitmq_quorum_queues to true. Notice that you can’t use quorum queues and high availability at the same time. This is caught by a precheck.
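The mutual exclusion that the precheck enforces can be sketched as follows. This is only an illustration of the rule, not the actual precheck task. The variable name om_enable_rabbitmq_high_availability is an assumption (only om_enable_rabbitmq_quorum_queues appears in these notes), the function name is hypothetical, and the values are assumed to be already parsed into booleans.

```python
def precheck_rabbitmq(variables):
    """Raise ValueError if quorum queues and HA queues are both enabled."""
    quorum = bool(variables.get("om_enable_rabbitmq_quorum_queues", False))
    # Assumed name for the high availability switch; not given in these notes.
    ha = bool(variables.get("om_enable_rabbitmq_high_availability", False))
    if quorum and ha:
        raise ValueError(
            "quorum queues and high availability cannot be enabled together"
        )
```

Failing early in a precheck keeps an invalid combination from reaching the deployment step.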
Set a log retention policy for OpenSearch via Index State Management (ISM). Documentation.
Adds new restart_policy called oneshot that does not create systemd units and is used for bootstrap tasks.
Redfish support added for VMHA fencing driver
Upgrade Notes
Added log retention in OpenSearch, previously handled by Elasticsearch Curator. By default the soft and hard retention periods are 30 and 60 days respectively. If you are upgrading from Elasticsearch, and have previously configured elasticsearch_curator_soft_retention_period_days or elasticsearch_curator_hard_retention_period_days, those variables will be used instead of the defaults. You should migrate your configuration to use the new variable names before the Caracal release.
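The fallback described above can be modelled with a short sketch: the legacy Curator variables win when they are set, otherwise the defaults of 30 and 60 days apply. This only models the behaviour stated in the note; the real logic lives in the Ansible variable defaults, the function name is hypothetical, and the new variable names are not listed here, so they are left out.

```python
DEFAULT_SOFT_DAYS = 30
DEFAULT_HARD_DAYS = 60


def retention_days(variables):
    """Return (soft, hard) retention in days, honouring legacy Curator settings."""
    soft = variables.get(
        "elasticsearch_curator_soft_retention_period_days", DEFAULT_SOFT_DAYS
    )
    hard = variables.get(
        "elasticsearch_curator_hard_retention_period_days", DEFAULT_HARD_DAYS
    )
    return int(soft), int(hard)
```

For example, a configuration that only sets the legacy soft period to 14 keeps the default hard period of 60 days.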
restart_policy: no will now create systemd units, but with Restart property set to no.
The following VMHA config options have changed:
- ipmi_prefix -> bmc_suffix
- ipmi_user -> bmc_user
- ipmi_password -> bmc_password
New VMHA config option:
- bmc_verify_ssl: “False|True|<path to CA file>”, defaults to openstack_cacert if defined, otherwise False
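When migrating an existing configuration, the renames listed above amount to a key mapping. The sketch below applies that mapping to a plain dictionary of options. The dictionary representation and the function name are hypothetical; only the old and new option names come from this note.

```python
# Old option name -> new option name, as listed in the upgrade note.
RENAMED_OPTIONS = {
    "ipmi_prefix": "bmc_suffix",
    "ipmi_user": "bmc_user",
    "ipmi_password": "bmc_password",
}


def migrate_vmha_options(options):
    """Return a copy of options with the renamed VMHA keys applied."""
    return {RENAMED_OPTIONS.get(key, key): value for key, value in options.items()}
```

Options that are not part of the rename, such as bmc_verify_ssl, pass through unchanged.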
Bug Fixes
Fixes the adminui role for Release 2024.1. Deletes some variables and adds new variables. Adds a new configuration file for multi-region adminui.
Fixes the drs role. Updates environments for drs. Disables tls_backend for Release ks2024.1. Fixes restart_policy to oneshot for drs bootstrap.
Fixes MariaDB backup when enable_proxysql is enabled.
Fixes 504 timeout when scraping openstack exporter. Ensures that HAProxy server timeout is the same as the scrape timeout for the openstack exporter backend. LP#2006051
Fix issue with octavia security group rules creation when using IPv6 configuration for octavia management network. See LP#2023502 for more details.
Fixes glance-api failing to start the privsep daemon when cinder_backend_ceph is set to true. See LP#2024541 for more details.
Fixes 2024554. Adds host and mariadb_port to the wsrep sync status check, so that non-standard ports can be used for MariaDB deployments. LP#2024554
Starting with ansible-core 2.13, list concatenation format is changed which resulted in inability to override horizon policy files. See LP#2045660 for more details.
Fixes an issue where Prometheus would fail to scrape the OpenStack exporter when using internal TLS with an FQDN. LP#2008208
Fixes Docker health check for the sahara_engine container. LP#2046268
Fixes an issue where Fluentd was parsing Horizon WSGI application logs incorrectly. Horizon error logs are now written to horizon-error.log instead of horizon.log. See LP#1898174
Added log retention in OpenSearch, previously handled by Elasticsearch Curator, now using Index State Management (ISM) OpenSearch bundled plugin. LP#2047037.
Fixes an issue where Prometheus scraping of Etcd metrics would fail if Etcd TLS is enabled. LP#2036950