Merged
Changes from 3 commits
2 changes: 1 addition & 1 deletion .github/workflows/stackhpc-build-kayobe-image.yml
@@ -8,7 +8,7 @@ on:
  push:
  branches:
  # NOTE(upgrade): Reference only the current release branch here.
- - stackhpc/2024.1
+ - stackhpc/master

  workflow_call:
  inputs:
2 changes: 1 addition & 1 deletion .github/workflows/stackhpc-multinode.yml
@@ -70,6 +70,6 @@ jobs:
  ssh_key: ${{ inputs.ssh_key }}
  stackhpc_kayobe_config_version: ${{ github.ref_name }}
  # NOTE(upgrade): Reference the PREVIOUS release here.
- stackhpc_kayobe_config_previous_version: ${{ inputs.upgrade == 'major' && 'stackhpc/2023.1' || 'stackhpc/2024.1' }}
+ stackhpc_kayobe_config_previous_version: ${{ inputs.upgrade == 'major' && 'stackhpc/2024.1' || 'stackhpc/master' }}
  terraform_kayobe_multinode_version: ${{ inputs.terraform_kayobe_multinode_version }}
  secrets: inherit
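The expression in this hunk uses the GitHub Actions ``cond && a || b`` idiom, which behaves like a ternary as long as the middle operand is truthy (a non-empty string here). A minimal Python sketch of the selection it makes after this change; the function name ``previous_version`` is hypothetical and not part of the workflow:

```python
def previous_version(upgrade: str) -> str:
    """Hypothetical helper mirroring the workflow expression:

    inputs.upgrade == 'major' && 'stackhpc/2024.1' || 'stackhpc/master'
    """
    # A 'major' upgrade starts from the previous release branch; any other
    # value starts from the current branch.
    return "stackhpc/2024.1" if upgrade == "major" else "stackhpc/master"


for value in ("major", "minor"):
    print(value, "->", previous_version(value))
```

Only ``'major'`` is compared explicitly in the workflow, so every other input value falls through to the current branch.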
2 changes: 1 addition & 1 deletion .github/workflows/stackhpc-promote.yml
@@ -4,7 +4,7 @@ on:
  push:
  branches:
  # NOTE(upgrade): Reference only the current release branch here.
- - stackhpc/2024.1
+ - stackhpc/master
  jobs:
  promote:
  name: Trigger Pulp promotion workflows
2 changes: 1 addition & 1 deletion .gitreview
@@ -2,4 +2,4 @@
  host=review.opendev.org
  port=29418
  project=openstack/kayobe-config.git
- defaultbranch=stable/2024.1
+ defaultbranch=master
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -13,7 +13,7 @@ build:
  python: "3.7"
  jobs:
  post_checkout:
- - git remote set-branches origin master stackhpc/2024.1 stackhpc/2023.1 stackhpc/zed stackhpc/yoga stackhpc/xena stackhpc/wallaby
+ - git remote set-branches origin master stackhpc/master stackhpc/2024.1 stackhpc/2023.1 stackhpc/zed stackhpc/yoga stackhpc/xena stackhpc/wallaby
  - git fetch --unshallow

  # Build documentation in the doc/source/ directory with Sphinx
4 changes: 2 additions & 2 deletions doc/source/conf.py
@@ -29,8 +29,8 @@
  # -- StackHPC Kayobe configuration --------------------------------------
  # Variables to override

- current_series = "2024.1"
- previous_series = "2023.1"
+ current_series = "master"
+ previous_series = "2024.1"
  branch = f"stackhpc/{current_series}"
  ceph_series = "squid"
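As a rough illustration of why this hunk matters, the sketch below shows how the derived ``branch`` value lines up with the hard-coded URL updated in ``ci-aio.rst`` in this same change. The variable ``setup_script_url`` is hypothetical and is not defined in ``conf.py``:

```python
# Values as set by this change in doc/source/conf.py.
current_series = "master"
previous_series = "2024.1"
branch = f"stackhpc/{current_series}"

# Hypothetical: build the raw-file URL from `branch` to compare it with the
# literal URL that appears in ci-aio.rst after this change.
setup_script_url = (
    "https://raw.githubusercontent.com/stackhpc/stackhpc-kayobe-config/"
    f"{branch}/etc/kayobe/environments/ci-aio/automated-setup.sh"
)

print(branch)
print(setup_script_url)
```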

2 changes: 1 addition & 1 deletion doc/source/contributor/environments/aufn-ceph.rst
@@ -9,7 +9,7 @@ This environment creates a Universe-from-nothing_-style deployment of Kayobe con
  .. warning::

  This guide was written for the Yoga release and has not been validated for
- Caracal. Proceed with caution.
+ Master. Proceed with caution.

  Prerequisites
  =============
6 changes: 3 additions & 3 deletions doc/source/contributor/environments/ci-aio.rst
@@ -29,7 +29,7 @@ Download the setup script:

  .. parsed-literal::

- wget https://raw.githubusercontent.com/stackhpc/stackhpc-kayobe-config/stackhpc/2024.1/etc/kayobe/environments/ci-aio/automated-setup.sh
+ wget https://raw.githubusercontent.com/stackhpc/stackhpc-kayobe-config/stackhpc/master/etc/kayobe/environments/ci-aio/automated-setup.sh

  Change the permissions on the script:

@@ -52,9 +52,9 @@ following options:

  * ``BASE_PATH`` (default: ``~``) - Directory to deploy from. The directory must
  exist before running the script.
- * ``KAYOBE_BRANCH`` (default: ``stackhpc/2024.1``) - The branch of Kayobe
+ * ``KAYOBE_BRANCH`` (default: ``stackhpc/master``) - The branch of Kayobe
  source code to use.
- * ``KAYOBE_CONFIG_BRANCH`` (default: ``stackhpc/2024.1``) - The branch of
+ * ``KAYOBE_CONFIG_BRANCH`` (default: ``stackhpc/master``) - The branch of
  ``stackhpc-kayobe-config`` to use.
  * ``KAYOBE_AIO_LVM`` (default: ``true``) - Whether the image uses LVM.
  * ``KAYOBE_CONFIG_EDIT_PAUSE`` (default: ``false``) - Option to pause
2 changes: 1 addition & 1 deletion doc/source/contributor/environments/ci-builder.rst
@@ -9,7 +9,7 @@ and pushed there once built.
  .. warning::

  This guide was written for the Yoga release and has not been validated for
- Caracal. Proceed with caution.
+ Master. Proceed with caution.

  In general it is preferable to use the `container image build CI workflow
  <https://github.com/stackhpc/stackhpc-kayobe-config/actions/workflows/stackhpc-container-image-build.yml>`_
2 changes: 1 addition & 1 deletion doc/source/contributor/environments/ci-multinode.rst
@@ -5,7 +5,7 @@ Multinode Test Environment
  .. warning::

  This guide was written for the Yoga release and has not been validated for
- Caracal. Proceed with caution.
+ Master. Proceed with caution.

  The ``ci-multinode`` environment provides a Kayobe configuration for multi-node
  clouds to be used for test and development purposes. It is designed to be used
10 changes: 5 additions & 5 deletions doc/source/contributor/package-updates.rst
@@ -7,13 +7,13 @@ This section describes the Release Train process of creating new package reposit
  Preparations
  ============

- 1. Before building images, you should check for any outstanding PRs into the earliest supported release. Below are the links for the 2024.1 (Caracal) branches.
+ 1. Before building images, you should check for any outstanding PRs into the earliest supported release. Below are the links for the Master branches.

- kayobe-config: https://github.com/stackhpc/stackhpc-kayobe-config/pulls?q=is%3Apr+is%3Aopen+base%3Astackhpc%2F2024.1
+ kayobe-config: https://github.com/stackhpc/stackhpc-kayobe-config/pulls?q=is%3Apr+is%3Aopen+base%3Astackhpc%2Fmaster

- kolla: https://github.com/stackhpc/kolla/pulls?q=is%3Apr+is%3Aopen+base%3Astackhpc%2F2024.1
+ kolla: https://github.com/stackhpc/kolla/pulls?q=is%3Apr+is%3Aopen+base%3Astackhpc%2Fmaster

- kolla-ansible: https://github.com/stackhpc/kolla-ansible/pulls?q=is%3Apr+is%3Aopen+base%3Astackhpc%2F2024.1
+ kolla-ansible: https://github.com/stackhpc/kolla-ansible/pulls?q=is%3Apr+is%3Aopen+base%3Astackhpc%2Fmaster

  You should also check any referenced source trees in etc/kayobe/kolla.yml.

@@ -165,7 +165,7 @@ Upgrading OpenStack to the next release in a multinode environment
  .. warning::

  This guide was written for the Wallaby release and has not been validated
- for Caracal. Proceed with caution.
+ for master. Proceed with caution.

  As this is not a full production system, only a reduced number of steps need to be followed to upgrade to a new release. Below describes these steps, with ``stackhpc/wallaby`` as the starting branch:

2 changes: 1 addition & 1 deletion doc/source/contributor/testing-ci-automation.rst
@@ -113,7 +113,7 @@ job.
  The workflow performs the following high-level steps:

  #. Deploy a VM on an OpenStack cloud using the `aio
- <https://github.com/stackhpc/stackhpc-kayobe-config/tree/stackhpc/2024.1/terraform/aio>`_
+ <https://github.com/stackhpc/stackhpc-kayobe-config/tree/stackhpc/master/terraform/aio>`_
  Terraform configuration.
  #. Deploy OpenStack in the VM using Kayobe and the :doc:`ci-aio
  <environments/ci-aio>` environment. If this is an upgrade job, the previous
231 changes: 10 additions & 221 deletions doc/source/operations/upgrading-openstack.rst
@@ -35,183 +35,20 @@ Notable changes in the |current_release| Release
  There are many changes in the OpenStack |current_release| release described in
  the release notes for each project. Here are some notable ones.

- Heat disabled by default
- ------------------------
+ .. TODO Add notable changes

- The Heat OpenStack service is no longer enabled by default.
-
- This behavior can be overridden manually:
-
- .. code-block:: yaml
- :caption: ``kolla.yml``
-
- kolla_enable_heat: true
-
- Wherever possible, Magnum deployments should be migrated to the CAPI Helm
- driver. Instructions for enabling the driver can be found `here
- <../configuration/magnum-capi.rst>`_. Enable the driver, recreate any clusters
- using Heat, and disable the service.
-
- After the upgrade (so that alerts don't fire) you can remove Heat with the
- following:
-
- .. code-block:: console
-
- kayobe overcloud host command run --command "rm /etc/kolla/haproxy/services.d/heat-api.cfg" -l network -b
- kayobe overcloud host command run --command "rm /etc/kolla/haproxy/services.d/heat-api-cfn.cfg" -l network -b
-
- kayobe overcloud host command run --command "systemctl restart kolla-haproxy-container.service" -l network[0] -b
- kayobe overcloud host command run --command "systemctl restart kolla-haproxy-container.service" -l network[1] -b
- kayobe overcloud host command run --command "systemctl restart kolla-haproxy-container.service" -l network[2] -b
-
- kayobe overcloud host command run --command "systemctl stop kolla-heat_api-container.service kolla-heat_api_cfn-container.service kolla-heat_engine-container.service" -l controllers -b
- kayobe overcloud host command run --command "systemctl disable kolla-heat_api-container.service kolla-heat_api_cfn-container.service kolla-heat_engine-container.service" -l controllers -b
- kayobe overcloud host command run --command "rm /etc/systemd/system/kolla-heat_api-container.service" -l controllers -b
- kayobe overcloud host command run --command "rm /etc/systemd/system/kolla-heat_api_cfn-container.service" -l controllers -b
- kayobe overcloud host command run --command "rm /etc/systemd/system/kolla-heat_engine-container.service" -l controllers -b
-
- kayobe overcloud host command run --command "docker rm heat_api heat_api_cfn heat_engine" -l controllers
-
- kayobe overcloud host command run --command "rm -rf /etc/kolla/heat-api /etc/kolla/heat-api-cfn /etc/kolla/heat-engine" --limit controllers -b
-
- Then from the OpenStack CLI:
-
- .. code-block:: console
-
- openstack service delete heat
- openstack user delete heat
- openstack domain set --disable heat_user_domain
- openstack domain delete heat_user_domain
- openstack endpoint list --service heat -c ID -f value | xargs openstack endpoint delete
- openstack endpoint list --service heat-cfn -c ID -f value | xargs openstack endpoint delete
-
- You can drop the ``heat`` database too, unless you want to keep historical content.
-
- .. code-block:: console
-
- docker exec -it mariadb mysql -u root -p
- Enter the database password when prompted.
- drop database heat;
-
- Designate sink disabled by default
- ----------------------------------
-
- Designate sink is an optional Designate service which listens for event
- notifications, primarily from Nova and Neutron. It is disabled by default (when
- designate is enabled) in Caracal. It is not required for Designate to function.
-
- If you still wish to use it, you should set the flag manually:
-
- .. code-block:: yaml
- :caption: ``kolla/globals.yml``
-
- designate_enable_notifications_sink: true
-
- If you are using Designate and do not make this change, the Antelope
- ``designate-sink`` container will remain on the controllers after the upgrade.
- It must be removed manually.
-
- Grafana Volume
- --------------
- The Grafana container volume is no longer used. If you wish to automatically
- remove the old volume, set ``grafana_remove_old_volume`` to ``true`` in
- ``kolla/globals.yml``. Note that doing this will lose any plugins installed via
- the CLI directly and not through Kolla. If you have previously installed
- Grafana plugins via the Grafana UI or CLI, you must change to installing them
- at image build time. The Grafana volume, which contains existing custom
- plugins, will be automatically removed in the next release.
-
- Prometheus HAproxy Exporter
- ---------------------------
- Due to the change from using the ``prometheus-haproxy-exporter`` to using the
- native support for Prometheus which is now built into HAProxy, metric names may
- have been replaced and/or removed, and in some cases the metric names may have
- remained the same but the labels may have changed. Alerts and dashboards may
- also need to be updated to use the new metrics. Please review any configuration
- that references the old metrics as this is not a backwards compatible change.
-
- Horizon configuration
- ---------------------
- The Horizon role has been reworked to the preferred ``local_settings.d``
- configuration model. Files ``local_settings`` and ``custom_local_settings``
- have been renamed to ``_9998-kolla-settings.py`` and
- ``_9999-custom-settings.py`` respectively. Users who use Horizon's custom
- configuration must change the names of those files in
- ``etc/kolla/config/horizon`` as well.
-
- Neutron DNS Domain
- ------------------
- When Designate is enabled and the default Neutron DNS integration has not been
- disabled, ``neutron_dns_domain`` must be configured manually in
- ``kolla/globals.yml``.
-
- The ``neutron_dns_domain`` must end with a period ``.`` e.g. ``example.com.``.
- The domain set should be something that is not use anywhere else such as
- ``internal.compute.example.com.``
-
- The Neutron DNS integration can be disabled by setting
- ``neutron_dns_integration: false`` in ``kolla/globals.yml``
-
- Redis Default User
- ------------------
-
- The ``redis_connection_string`` has changed the username used from ``admin``
- to ``default``. Whilst this does not have any negative impact on services
- that utilise Redis it will feature prominently in any preview of the overcloud
- configuration.
-
- AvailabilityZoneFilter removal
- ------------------------------
-
- Support for the ``AvailabilityZoneFilter`` filter has been dropped in Nova.
- Remove it from any Nova config files before upgrading. It will cause errors in
- Caracal and halt the Nova scheduler.
+ Placeholder
+ -----------

  Known issues
  ============

- * Due to an incorrect default value NGS will attempt to use v3alpha for the api
- path when communicating with etcd3. This isn't possible as in Caracal etcd is
- running a newer version that has dropped support for v3alpha. You can work
- around this in custom config, see the SMS PR for an example:
- https://github.com/stackhpc/smslab-kayobe-config/pull/354
-
- * Due to a `security-related change in the GRUB package on Rocky Linux 9
- <https://access.redhat.com/security/cve/CVE-2023-4001>`__, the operating
- system can become unbootable (boot will stop at a ``grub>`` prompt). Remove
- the ``--root-dev-only`` option from ``/boot/efi/EFI/rocky/grub.cfg`` after
- applying package updates. This will happen automatically as a post hook when
- running the ``kayobe overcloud host package update`` command.
-
- * After upgrading OpenSearch to the latest 2023.1 container image, we have seen
- cluster routing allocation be disabled on some systems. See bug for details:
- https://bugs.launchpad.net/kolla-ansible/+bug/2085943.
- This will cause the "Perform a flush" handler to fail during the 2024.1
- OpenSearch upgrade. To workaround this, you can run the following PUT request
- to enable allocation again:
-
- .. code-block:: console
-
- curl -X PUT "https://<kolla-vip>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '{ "transient" : { "cluster.routing.allocation.enable" : "all" } } '
-
- * Cinder database migrations fail during the upgrade process when the
- ``use_quota`` column is set to ``NULL``, which can be the case on deleted
- volumes and snapshots if OpenStack has been in operation for several
- releases. See `Launchpad bug 2070475
- <https://bugs.launchpad.net/cinder/+bug/2070475>`__ for details. Until the
- `database migrations are fixed
- <https://review.opendev.org/c/openstack/cinder/+/923635>`__, the data can be
- fixed with the following MySQL queries:
-
- .. code-block:: mysql
-
- UPDATE volumes SET use_quota = 1 WHERE use_quota IS NULL AND deleted_at IS NOT NULL;
- UPDATE snapshots SET use_quota = 1 WHERE use_quota IS NULL AND deleted_at IS NOT NULL;
+ * None so far!

  Security baseline
  =================

- As part of the Caracal release we are looking to improve the security
+ As part of the Master release we are looking to improve the security
  baseline of StackHPC OpenStack deployments. If any of the following have not
  been done, they should be completed before the upgrade begins.

@@ -247,61 +84,13 @@ suggestions:
  * Update the deployment to use the latest |previous_release| images and
  configuration.

- RabbitMQ SLURP upgrade
- ----------------------
-
- .. note::
- The upgrade is reliant on recent changes. Make sure you have updated to
- the latest version of kolla ansible and deployed the latest kolla containers
- before proceeding.
-
- Because this is a SLURP upgrade, RabbitMQ must be upgraded manually from 3.11,
- to 3.12, then to 3.13 on Antelope before the Caracal upgrade. This upgrade
- should not cause an API outage (though it should still be considered "at
- risk").
-
- Some errors have been observed in testing when the upgrades are performed
- back-to-back. A 200s delay eliminates this issue. On particularly large or slow
- deployments, consider increasing this timeout.
-
- Additionally errors have been observed at sites with OVS networking where after
- the upgrade, tenant networking is broken and requires a reset of RabbitMQ. This
- can be done by running the rabbitmq-reset playbook.
-
- .. code-block:: bash
-
- kayobe overcloud service configuration generate --node-config-dir /tmp/ignore -kt none
- kayobe kolla ansible run "rabbitmq-upgrade 3.12"
- sleep 200
- kayobe kolla ansible run "rabbitmq-upgrade 3.13"
-
- RabbitMQ quorum queues
+ Ubuntu Noble migration
  ----------------------

- In Caracal, quorum queues are enabled by default for RabbitMQ. This is
- different to Antelope which used HA queues. Before upgrading to Caracal, it is
- strongly recommended that you migrate from HA to quorum queues. The migration
- is automated using a script.
-
- .. warning::
- This migration will stop all services using RabbitMQ and cause an
- extended API outage while queues are migrated. It should only be
- performed in a pre-agreed maintenance window.
-
- Set the following variables in your kolla globals file (i.e.
- ``$KAYOBE_CONFIG_PATH/kolla/globals.yml`` or
- ``$KAYOBE_CONFIG_PATH/environments/$KAYOBE_ENVIRONMENT/kolla/globals.yml``):
-
- .. code-block:: yaml
-
- om_enable_rabbitmq_high_availability: false
- om_enable_rabbitmq_quorum_queues: true
-
- Then execute the migration script:
-
- .. code-block:: bash
-
- $KAYOBE_CONFIG_PATH/../../tools/rabbitmq-quorum-migration.sh
+ Ubuntu Jammy support has been removed from the 2025.1 release onwards. Hosts
+ must be migrated to Ubuntu 24.04 before upgrading OpenStack services. The
+ upgrade process is currently a work in progress.
+ .. TODO: Add link to another page describing how to migrate

  Preparation
  ===========