then delete the failed volume, select cancelling the build when Packer queries, and then retry. This is [OpenStack bug 1823445](https://bugs.launchpad.net/cinder/+bug/1823445).
86
+
77
87
# Build Process
78
88
79
89
In summary, Packer creates an OpenStack VM, runs Ansible on that, shuts it down, then creates an image from the root disk.
80
90
81
-
Many of the Packer variables defined in `openstack.pkr.hcl` control the definition of the build VM and how to SSH to it to run Ansible, which are generic OpenStack builder options. Packer variables can be set in a file at any convenient path; the above
82
-
example shows the use of the environment variable `$PKR_VAR_environment_root` (which itself sets the Packer variable
91
+
Many of the Packer variables defined in `openstack.pkr.hcl` control the definition of the build VM and how to SSH to it to run Ansible. These are generic OpenStack builder options
92
+
and are not specific to the Slurm Appliance. Packer variables can be set in a file at any convenient path; the build example above
93
+
shows the use of the environment variable `$PKR_VAR_environment_root` (which itself sets the Packer variable
83
94
`environment_root`) to automatically select a variable file from the current environment, but for site-specific builds
84
-
using a path in a "parent" environment is likely to be more appropriate (as builds should not be environment-specific, to allow testing).
95
+
using a path in a "parent" environment is likely to be more appropriate (builds should not be environment-specific, so that an image can be tested before deployment to a production environment).
85
96
86
97
What is specific to the Slurm Appliance is how Ansible is run:
87
-
- The build VM is always added to the `builder` inventory group, which differentiates it from "real" nodes. This allows
88
-
variables to be set differently during Packer builds, e.g. to prevent services starting. The defaults for this are in `environments/common/inventory/group_vars/builder/`, which could be extended or overridden for site-specific fat image builds using `builder` groupvars for the relevant environment. It also runs some builder-specific code (e.g. to ensure Packer's SSH
89
-
keys are removed from the image).
90
-
- The default fat image build also adds the build VM to the "top-level" `compute`, `control` and `login` groups. This ensures
91
-
the Ansible specific to all of these types of nodes runs (other inventory groups are constructed from these by the `environments/common/inventory/groups` file - this is not builder-specific).
92
-
- Which groups the build VM is added to is controlled by the Packer `groups` variable. This can be redefined for builds using the `openhpc-extra` source to add the build VM into specific groups. E.g. with a Packer variable file:
93
-
94
-
source_image_name = {
95
-
RL9 = "openhpc-ofed-RL9-240619-0949-66c0e540"
96
-
}
97
-
groups = {
98
-
openhpc-extra = ["foo"]
99
-
}
100
-
101
-
the build VM uses an existing "fat image" (rather than a 'latest' nightly one) and is added to the `builder` and `foo` groups. This means only code targeting `builder` and `foo` groups runs. In this way an existing image can be extended with site-specific code, without modifying the part of the image which has already been tested in the StackHPC CI.
102
-
103
-
- The playbook `ansible/fatimage.yml` is run which is only a subset of `ansible/site.yml`. This allows restricting the code
104
-
which runs during build for cases where setting `builder` groupvars is not sufficient (e.g. a role always attempts to configure or start services). This may eventually be removed.
98
+
- The build VM is always added to the `builder` inventory group, which differentiates it from nodes in a cluster. This allows
99
+
Ansible variables to be set differently during Packer builds, e.g. to prevent services starting. The defaults for this are in `environments/common/inventory/group_vars/builder/`, which could be extended or overridden for site-specific fat image builds using `builder` groupvars for the relevant environment. It also runs some builder-specific code (e.g. to clean up the image).
100
+
- The default fat image builds also add the build VM to the "top-level" `compute`, `control` and `login` groups. This ensures
101
+
the Ansible specific to all of these types of nodes runs. Note that other inventory groups are constructed from these by the `environments/common/inventory/groups` file; this is not builder-specific.
102
+
- As noted above, for "extra" builds the additional groups can be specified directly. In this way an existing image can be extended with site-specific Ansible, without modifying the
103
+
part of the image which has already been tested in the StackHPC CI.
104
+
- The playbook `ansible/fatimage.yml` is run, which is only a subset of `ansible/site.yml`. This allows restricting the code which runs during build for cases where setting `builder`
105
+
groupvars is not sufficient (e.g. a role always attempts to configure or start services).
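
To illustrate the "extra" build case described in the list above, a Packer variable file might look like the following minimal sketch. The image name is the one quoted in the earlier revision of this section and the group name `foo` is only a placeholder; both are illustrative assumptions rather than recommended values.

```hcl
# Hypothetical variable file for an "extra" build; all values are illustrative.

# Start from an existing, already-tested fat image rather than a nightly one.
source_image_name = {
  RL9 = "openhpc-ofed-RL9-240619-0949-66c0e540"
}

# Groups the build VM joins when using the `openhpc-extra` source.
# `foo` is a placeholder; the build VM is always added to `builder` as well.
groups = {
  openhpc-extra = ["foo"]
}
```

With a file like this, only Ansible targeting the `builder` and `foo` groups would run during the build.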
105
106
106
107
There are some things to be aware of when developing Ansible to run in a Packer build VM:
107
-
- Only some tasks make sense. E.g. any services with a reliance on the network cannot be started, and may not be able to be enabled if when creating an instance with the resulting image the remote service will not be immediately present.
108
+
- Only some tasks make sense. E.g. any services with a reliance on the network cannot be started, and should not be enabled if, when creating an instance with the resulting image, the remote service will not be immediately present.
108
109
- Nothing should be written to the persistent state directory `appliances_state_dir`, as this is on the root filesystem rather than an OpenStack volume.
109
-
- Care should be taken not to leave data on the root filesystem which is not wanted in the final image, (e.g. secrets).
110
+
- Care should be taken not to leave data on the root filesystem which is not wanted in the final image (e.g. secrets).
110
111
- Build VM hostnames are not the same as for equivalent "real" hosts and do not contain `login`, `control` etc. Therefore variables used by the build VM must be defined as groupvars not hostvars.
111
-
- Ansible may need to proxy to real compute nodes. If Packer should not use the same proxy to connect to the
112
-
build VMs (e.g. build happens on a different network), proxy configuration should not be added to the `all` group.
113
-
- Currently two fat image "sources" are defined, with and without CUDA. This simplifies CI configuration by allowing the
114
-
default source images to be defined in the `openstack.pkr.hcl` definition.
112
+
- Ansible may need to use an SSH ProxyJump to reach cluster nodes, which can be defined via Ansible's `ansible_ssh_common_args` variable. If Packer should not use the same proxy
113
+
to connect to build VMs (e.g. because the build happens on a different network), this proxy configuration should not be added to the `all` group.