@sjpb sjpb commented Sep 17, 2025

This PR reorganises configuration of build VM inventory groups and improves image build documentation:

  • For extra builds, the builder groups are now entirely defined using the Packer inventory_groups variable, rather than also having builder as a child of groups in environments/{common,site}/inventory/groups.
  • A group fatimage is defined, and added as a child of groups in the site groups file, to define the default functionality enabled for fat image builds.
  • Group dependencies have been modified so that the dnf_repos group is only added when required.
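
As a rough illustration of the group resolution described above, here is a minimal Python sketch. It is not code from this repository. It assumes that inventory_groups is a comma-separated string, that the build VM is always in builder, and that cuda and doca each pull in dnf_repos; the PARENT_GROUPS mapping and the resolve_group_names helper are hypothetical names chosen for the example.

```python
from collections import deque

# Hypothetical "child -> parents" links, standing in for [parent:children]
# entries in an inventory groups file. These are illustrative assumptions,
# not the appliance's real group definitions.
PARENT_GROUPS = {
    "cuda": ["dnf_repos"],
    "doca": ["dnf_repos"],
}


def resolve_group_names(inventory_groups: str) -> list[str]:
    """Return the sorted groups a build VM ends up in.

    inventory_groups is assumed to be a comma-separated string of group names.
    """
    direct = [g.strip() for g in inventory_groups.split(",") if g.strip()]
    seen = {"builder"}  # assumption: the build VM is always in builder
    queue = deque(direct)
    while queue:
        group = queue.popleft()
        if group in seen:
            continue
        seen.add(group)
        # Membership of a child group implies membership of its parents.
        queue.extend(PARENT_GROUPS.get(group, []))
    return sorted(seen)


print(resolve_group_names("doca,cuda"))
print(resolve_group_names(""))
```

Under these assumptions, dnf_repos only appears when a requested group depends on it, and an empty inventory_groups leaves the build VM in builder alone.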

This fixes a number of issues:

  1. It is clearer which features are active for an extra build, and how to change this.
  2. Extra builds can run with minimal functionality. Specifically:
    a. Ark credentials are not always required (because the dnf_repos role is no longer unconditionally enabled)
    b. The gateway role does not re-run unnecessarily during a non-StackHPC extra build
    c. The sssd, tuned, squid and raid roles do not re-run unnecessarily during a StackHPC extra build
  3. The features a "fat image build" enables are now explicitly defined. By default, a site-specific fat image will have the same features enabled as a StackHPC build, but this is configurable via the site groups file.

@sjpb sjpb force-pushed the feat/minimal-extrabuilds branch from ff87ef8 to 6ab7255 Compare September 17, 2025 15:16

sjpb commented Sep 17, 2025

Fat image build: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/17802381929

failed on

RL8:
==> openstack.openhpc: "No package ceph-common-17.2.7 available."
RL9:
==> openstack.openhpc: "No package ceph-common-18.2.4 available."

maybe repos are messed up?

Base automatically changed from feat/root-mdadm to main September 25, 2025 19:48
@sjpb sjpb closed this Oct 1, 2025
@sjpb sjpb reopened this Oct 1, 2025
@sjpb sjpb force-pushed the feat/minimal-extrabuilds branch from 876787e to 480336e Compare October 2, 2025 10:15
@sjpb sjpb force-pushed the feat/minimal-extrabuilds branch from f90c1da to 876787e Compare October 2, 2025 13:22
…e-slurm-appliance into feat/minimal-extrabuilds

sjpb commented Oct 3, 2025

Inventory groups in extrabuild runs above:
RL8:

==> openstack.openhpc:     "group_names": [
==> openstack.openhpc:         "builder",
==> openstack.openhpc:         "cuda",
==> openstack.openhpc:         "dnf_repos",
==> openstack.openhpc:         "doca",
==> openstack.openhpc:         "slurm_recompile"
==> openstack.openhpc:     ]
==> openstack.openhpc: }

RL9:

==> openstack.openhpc:     "group_names": [
==> openstack.openhpc:         "builder",
==> openstack.openhpc:         "cuda",
==> openstack.openhpc:         "dnf_repos",
==> openstack.openhpc:         "doca",
==> openstack.openhpc:         "lustre",
==> openstack.openhpc:         "slurm_recompile"
==> openstack.openhpc:     ]

Looks OK

@sjpb sjpb marked this pull request as ready for review October 3, 2025 09:35
@sjpb sjpb requested a review from a team as a code owner October 3, 2025 09:35

@bertiethorpe bertiethorpe left a comment

LGTM. This is a much clearer way of doing extra builds imo!

@sjpb sjpb merged commit b11696e into main Oct 3, 2025
41 of 42 checks passed
@sjpb sjpb deleted the feat/minimal-extrabuilds branch October 3, 2025 13:40