
Commit 64540f3

Merge pull request #30 from stackhpc/ironic
Add optional baremetal management section & CI to build docs
2 parents 533f771 + 3bd135f commit 64540f3

13 files changed: 352 additions, 3 deletions
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
---
2+
ceph: false
3+
ceph_ansible: false
4+
ceph_managed: false
5+
ironic: false
6+
ironic_automated_cleaning: true
7+
kayobe_manages_physical_network: true
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
---
2+
ceph: true
3+
ceph_ansible: false
4+
ceph_managed: false
5+
ironic: true
6+
ironic_automated_cleaning: false
7+
kayobe_manages_physical_network: false
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
---
2+
ceph: true
3+
ceph_ansible: false
4+
ceph_managed: false
5+
ironic: true
6+
ironic_automated_cleaning: true
7+
kayobe_manages_physical_network: true

.github/workflows/pull_request.yml

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
1+
---
2+
name: Build OpenStack admin guide
3+
on:
4+
- pull_request
5+
jobs:
6+
build:
7+
name: Build OpenStack admin guide
8+
runs-on: ubuntu-latest
9+
strategy:
10+
matrix:
11+
deployment_yaml:
12+
- __default__
13+
- disable-ceph
14+
- enable-ironic
15+
- enable-ironic-no-cleaning-or-physnet
16+
steps:
17+
- uses: actions/checkout@v3
18+
19+
- uses: actions/setup-python@v4
20+
with:
21+
python-version: '3.x'
22+
23+
- name: Install Python dependencies
24+
run: pip3 install -r requirements.txt
25+
26+
- name: Copy deployment.yml into place
27+
run: cp .github/workflows/deployment_yaml/${{ matrix.deployment_yaml }}.yml source/data/deployment.yml
28+
if: matrix.deployment_yaml != '__default__'
29+
30+
- name: Build HTML
31+
run: make html
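The "Copy deployment.yml into place" step above can be read as a small selection rule: every matrix entry except ``__default__`` names a file that replaces ``source/data/deployment.yml`` before the build. A minimal Python sketch of that rule follows; the function name ``deployment_override`` is hypothetical and only illustrates the workflow logic, it is not part of the repository.

```python
from pathlib import Path
from typing import Optional

# Sentinel used in the workflow matrix for "build with the committed file".
DEFAULT = "__default__"


def deployment_override(matrix_entry: str) -> Optional[Path]:
    """Return the file that would be copied over source/data/deployment.yml.

    Returns None for the default entry, mirroring the ``if:`` condition
    on the copy step in the workflow.
    """
    if matrix_entry == DEFAULT:
        return None
    return Path(".github/workflows/deployment_yaml") / f"{matrix_entry}.yml"


if __name__ == "__main__":
    for entry in (
        "__default__",
        "disable-ceph",
        "enable-ironic",
        "enable-ironic-no-cleaning-or-physnet",
    ):
        print(entry, "->", deployment_override(entry))
```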

source/baremetal_management.rst

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
.. include:: vars.rst
2+
3+
======================================
4+
Bare Metal Compute Hardware Management
5+
======================================
6+
7+
.. ifconfig:: deployment['ironic']
8+
9+
The |project_name| cloud includes bare metal compute nodes managed by the
10+
Ironic services. This section describes elements of the configuration of
11+
this service.
12+
13+
.. include:: include/baremetal_management.rst
14+
15+
.. ifconfig:: not deployment['ironic']
16+
17+
The |project_name| cloud does not include bare metal compute nodes managed
18+
by the Ironic services.
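The ``ifconfig`` directives above are provided by ``sphinx.ext.ifconfig``, which evaluates the directive argument as a Python expression against the registered configuration values. The sketch below mimics that evaluation with a plain dictionary. How ``deployment`` is actually loaded and registered in ``conf.py`` is not shown in this change, so the dictionary here is a hypothetical stand-in for the contents of ``source/data/deployment.yml``.

```python
# Hypothetical stand-in for the values read from source/data/deployment.yml.
deployment = {
    "ironic": True,
    "ironic_automated_cleaning": False,
    "kayobe_manages_physical_network": False,
}


def ifconfig_includes(expression: str, deployment: dict) -> bool:
    """Evaluate an ifconfig-style expression against the deployment values.

    This only imitates the behaviour for illustration; it is not the
    Sphinx implementation. Never evaluate untrusted expressions this way.
    """
    return bool(eval(expression, {"deployment": deployment}))


if __name__ == "__main__":
    for expression in ("deployment['ironic']", "not deployment['ironic']"):
        print(expression, "->", ifconfig_includes(expression, deployment))
```

With ``ironic`` set to true, only the first block of the page is rendered; flipping the value renders the second block instead.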

source/ceph_storage.rst

Lines changed: 2 additions & 2 deletions
@@ -21,9 +21,9 @@ Ceph Storage
2121
Ceph Ansible
2222
============
2323

24-
.. include:: ceph_ansible.rst
24+
.. include:: include/ceph_ansible.rst
2525

2626
Ceph Troubleshooting
2727
====================
2828

29-
.. include:: ceph_troubleshooting.rst
29+
.. include:: include/ceph_troubleshooting.rst

source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@
3939
# List of patterns, relative to source directory, that match files and
4040
# directories to ignore when looking for source files.
4141
# This pattern also affects html_static_path and html_extra_path.
42-
exclude_patterns = []
42+
exclude_patterns = ['include/*']
4343

4444

4545
# -- Options for HTML output -------------------------------------------------

source/data/deployment.yml

Lines changed: 9 additions & 0 deletions
@@ -7,3 +7,12 @@ ceph_ansible: false
77

88
# Whether the Ceph deployment is managed by StackHPC
99
ceph_managed: false
10+
11+
# Whether the OpenStack deployment includes Ironic for bare metal compute.
12+
ironic: false
13+
14+
# Whether Ironic automated cleaning is enabled.
15+
ironic_automated_cleaning: true
16+
17+
# Whether Kayobe manages physical network devices.
18+
kayobe_manages_physical_network: true
Lines changed: 268 additions & 0 deletions
@@ -0,0 +1,268 @@
1+
.. _ironic-node-lifecycle:
2+
3+
Ironic node life cycle
4+
----------------------
5+
6+
The deployment process is documented in the `Ironic User Guide <https://docs.openstack.org/ironic/wallaby/user/index.html>`__.
7+
The |project_name| OpenStack deployment uses the
8+
`direct deploy method <https://docs.openstack.org/ironic/wallaby/user/index.html#example-1-pxe-boot-and-direct-deploy-process>`__.
9+
10+
The Ironic state machine can be found `here <https://docs.openstack.org/ironic/latest/user/states.html>`__. The rest of
11+
this documentation refers to these states and assumes that you are familiar with them.
12+
13+
High level overview of state transitions
14+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
15+
16+
This section describes the state transitions for various Ironic operations at a high level.
17+
It focuses on the steps where dynamic switch reconfiguration is triggered.
18+
For a more detailed overview, refer to the :ref:`ironic-node-lifecycle` section.
19+
20+
Provisioning
21+
~~~~~~~~~~~~
22+
23+
Provisioning starts when an instance is created in Nova using a bare metal flavor.
24+
25+
- Node starts in the available state (available)
26+
- User provisions an instance (deploying)
27+
- Ironic will switch the node onto the provisioning network (deploying)
28+
- Ironic will power on the node and will await a callback (wait-callback)
29+
- Ironic will image the node with an operating system using the image provided at creation (deploying)
30+
- Ironic switches the node onto the tenant network(s) via neutron (deploying)
31+
- Node transitions to the active state (active)
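The provisioning list above can be written down as data, which makes it easy to see both the order of states and the points at which the switch is reconfigured. This is a minimal sketch: the ``Step`` type and helper names are hypothetical, and the state labels are copied from the list above rather than taken from the Ironic API.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    state: str                 # state label as written in the list above
    action: str                # what happens during this step
    reconfigures_switch: bool  # True where the switch port changes network


PROVISIONING: List[Step] = [
    Step("available", "node starts in the available state", False),
    Step("deploying", "user provisions an instance", False),
    Step("deploying", "node is switched onto the provisioning network", True),
    Step("wait-callback", "node is powered on and Ironic awaits a callback", False),
    Step("deploying", "node is imaged with the operating system", False),
    Step("deploying", "node is switched onto the tenant network(s)", True),
    Step("active", "node transitions to the active state", False),
]


def state_sequence(steps: List[Step]) -> List[str]:
    """Collapse consecutive steps that share a state into one entry."""
    sequence: List[str] = []
    for step in steps:
        if not sequence or sequence[-1] != step.state:
            sequence.append(step.state)
    return sequence


def switch_reconfigurations(steps: List[Step]) -> List[str]:
    """Return the actions during which the switch is reconfigured."""
    return [step.action for step in steps if step.reconfigures_switch]


if __name__ == "__main__":
    print(" -> ".join(state_sequence(PROVISIONING)))
    for action in switch_reconfigurations(PROVISIONING):
        print("switch change:", action)
```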
32+
33+
.. _baremetal-management-deprovisioning:
34+
35+
Deprovisioning
36+
~~~~~~~~~~~~~~
37+
38+
Deprovisioning starts when an instance created in Nova using a bare metal flavor is destroyed.
39+
40+
.. ifconfig:: deployment['ironic_automated_cleaning']
41+
42+
Automated cleaning is enabled, and occurs when nodes are deprovisioned.
43+
44+
- Node starts in active state (active)
45+
- User deletes instance (deleting)
46+
- Ironic will remove the node from any tenant network(s) (deleting)
47+
- Ironic will switch the node onto the cleaning network (deleting)
48+
- Ironic will power on the node and will await a callback (clean-wait)
49+
- Node boots into Ironic Python Agent and issues callback, Ironic starts cleaning (cleaning)
50+
- Ironic removes node from cleaning network (cleaning)
51+
- Node transitions to available (available)
52+
53+
.. ifconfig:: not deployment['ironic_automated_cleaning']
54+
55+
Automated cleaning is currently disabled.
56+
57+
- Node starts in active state (active)
58+
- User deletes instance (deleting)
59+
- Ironic will remove the node from any tenant network(s) (deleting)
60+
- Node transitions to available (available)
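The two deprovisioning variants differ only in whether the cleaning states are visited. A short sketch of that difference, assuming the state labels used in the lists above (the function name ``deprovisioning_states`` is hypothetical):

```python
from typing import List


def deprovisioning_states(automated_cleaning: bool) -> List[str]:
    """Return the distinct states a node passes through when deprovisioned.

    The sequence follows the lists above: the cleaning states are only
    visited when automated cleaning is enabled.
    """
    states = ["active", "deleting"]
    if automated_cleaning:
        states += ["clean-wait", "cleaning"]
    states.append("available")
    return states


if __name__ == "__main__":
    for enabled in (True, False):
        print(f"automated cleaning={enabled}:", " -> ".join(deprovisioning_states(enabled)))
```

The flag corresponds to ``ironic_automated_cleaning`` in ``source/data/deployment.yml``.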
61+
62+
Cleaning
63+
~~~~~~~~
64+
65+
Manual cleaning is not part of the regular state transitions when using Nova; however, nodes can be manually cleaned by administrators.
66+
67+
- Node starts in the manageable state (manageable)
68+
- User triggers cleaning with API (cleaning)
69+
- Ironic will switch the node onto the cleaning network (cleaning)
70+
- Ironic will power on the node and will await a callback (clean-wait)
71+
- Node boots into Ironic Python Agent and issues callback, Ironic starts cleaning (cleaning)
72+
- Ironic removes node from cleaning network (cleaning)
73+
- Node transitions back to the manageable state (manageable)
74+
75+
.. ifconfig:: deployment['ironic_automated_cleaning']
76+
77+
See :ref:`baremetal-management-deprovisioning` for information about
78+
automated cleaning.
79+
80+
Rescuing
81+
~~~~~~~~
82+
83+
The rescue feature is not used. The required rescue network is not currently configured.
84+
85+
Baremetal networking
86+
--------------------
87+
88+
Baremetal networking with the Neutron Networking Generic Switch ML2 driver requires a combination of static and dynamic switch configuration.
89+
90+
.. _static-switch-config:
91+
92+
Static switch configuration
93+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
94+
95+
.. ifconfig:: deployment['kayobe_manages_physical_network']
96+
97+
Static physical network configuration is managed via Kayobe.
98+
99+
.. TODO: Fill in the switch configuration
100+
101+
- Some initial switch configuration is required before networking generic switch can take over the management of an interface.
102+
First, LACP must be configured on the switch ports attached to the baremetal node, e.g.:
103+
104+
.. code-block:: shell
105+
106+
The interface is then partially configured:
107+
108+
.. code-block:: shell
109+
110+
For :ref:`ironic-node-discovery` to work, you need to manually switch the port to the provisioning network:
111+
112+
.. code-block:: shell
113+
114+
**NOTE**: You only need to do this if Ironic isn't aware of the node.
115+
116+
Configuration with kayobe
117+
^^^^^^^^^^^^^^^^^^^^^^^^^
118+
119+
Kayobe can be used to apply the :ref:`static-switch-config`.
120+
121+
- Upstream documentation can be found `here <https://docs.openstack.org/kayobe/latest/configuration/reference/physical-network.html>`__.
122+
- Kayobe does all the switch configuration that isn't :ref:`dynamically updated using Ironic <dynamic-switch-configuration>`.
123+
- Optionally switches the node onto the provisioning network (when using ``--enable-discovery``)
124+
125+
+ NOTE: This is a dangerous operation as it can wipe out the dynamic VLAN configuration applied by neutron/ironic.
126+
You should only run this when initially enrolling a node. It is possible to restrict the operation to specific interfaces using ``--interface-description-limit``. For example:
127+
128+
.. code-block::
129+
130+
kayobe physical network configure --interface-description-limit <description> --group switches --display --enable-discovery
131+
132+
In this example, ``--display`` is used to preview the switch configuration without applying it.
133+
134+
.. TODO: Fill in information about how switches are configured in kayobe-config, with links
135+
136+
- Configuration is done using a combination of ``group_vars`` and ``host_vars``
137+
138+
.. ifconfig:: not deployment['kayobe_manages_physical_network']
139+
140+
.. TODO: Fill in details about how physical network configuration is managed.
141+
142+
Static physical network configuration is not managed via Kayobe.
143+
144+
.. _dynamic-switch-configuration:
145+
146+
Dynamic switch configuration
147+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
148+
149+
Ironic dynamically configures the switches using the Neutron `Networking Generic Switch <https://docs.openstack.org/networking-generic-switch/latest/>`_ ML2 driver.
150+
151+
- Used to toggle the baremetal nodes onto different networks
152+
153+
+ Can use any VLAN network defined in OpenStack, providing that the VLAN has been trunked to the controllers
154+
as this is required for DHCP to function.
155+
+ See :ref:`ironic-node-lifecycle`, which illustrates when switch reconfigurations happen.
156+
157+
- Only configures VLAN membership of the switch interfaces or port groups. To prevent conflicts with the static switch configuration,
158+
the convention used is: after the node is in service in Ironic, VLAN membership should not be manually adjusted and
159+
should be left to be controlled by Ironic, i.e. *don't* use ``--enable-discovery`` without a limit when configuring the
160+
switches with kayobe.
161+
- Ironic is configured to use the neutron networking driver.
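The convention above (never use ``--enable-discovery`` without a limit) can be enforced by a small wrapper that builds the command line. The sketch below only assembles arguments from the flags shown in the kayobe example earlier in this section; the function name and the guard are hypothetical local tooling, not a kayobe feature.

```python
import shlex
from typing import List, Optional


def physical_network_command(
    group: str,
    description_limit: Optional[str] = None,
    enable_discovery: bool = False,
    display: bool = True,
) -> List[str]:
    """Build a ``kayobe physical network configure`` argument list.

    Refuses to combine --enable-discovery with no interface limit, since
    that could wipe the VLAN configuration applied by Neutron and Ironic.
    """
    if enable_discovery and not description_limit:
        raise ValueError("--enable-discovery requires an interface description limit")
    command = ["kayobe", "physical", "network", "configure", "--group", group]
    if description_limit:
        command += ["--interface-description-limit", description_limit]
    if display:
        # Preview only: show the switch configuration without applying it.
        command.append("--display")
    if enable_discovery:
        command.append("--enable-discovery")
    return command


if __name__ == "__main__":
    # "example-node" is a placeholder interface description.
    args = physical_network_command("switches", "example-node", enable_discovery=True)
    print(shlex.join(args))
```

Defaulting ``display`` to true keeps the sketch on the safe side: the generated command previews the configuration unless the caller opts out.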
162+
163+
.. _ngs-commands:
164+
165+
Commands that NGS will execute
166+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
167+
168+
Networking Generic Switch is mainly concerned with toggling the ports onto different VLANs. It
169+
cannot fully configure the switch.
170+
171+
.. TODO: Fill in the switch configuration
172+
173+
- Switching the port onto the provisioning network
174+
175+
.. code-block:: shell
176+
177+
- Switching the port onto the tenant network.
178+
179+
.. code-block:: shell
180+
181+
- When deleting the instance, the VLANs are removed from the port. Using:
182+
183+
.. code-block:: shell
184+
185+
NGS will save the configuration after each reconfiguration (by default).
186+
187+
Ports managed by NGS
188+
^^^^^^^^^^^^^^^^^^^^
189+
190+
The command below extracts a list of port UUID, node UUID and switch port information.
191+
192+
.. code-block:: bash
193+
194+
admin# openstack baremetal port list --field uuid --field node_uuid --field local_link_connection --format value
195+
196+
NGS will manage VLAN membership for ports when the ``local_link_connection`` fields match one of the switches in ``ml2_conf.ini``.
197+
The rest of the switch configuration is static.
198+
The switch configuration that NGS will apply to these ports is detailed in :ref:`dynamic-switch-configuration`.
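As a rough illustration of the matching rule above, the sketch below compares the ``switch_info`` value of each port's ``local_link_connection`` with the switch names found in ``ml2_conf.ini``. Several details are assumptions: that switches appear as ``[genericswitch:<name>]`` sections, that the port list is exported with ``--format json`` rather than ``value``, and that the JSON keys are ``UUID`` and ``Local Link Connection``. The sample data is invented, and matching by switch MAC address is not covered.

```python
import configparser
import json
from typing import Dict, List, Set

# Invented sample data for illustration only.
SAMPLE_ML2_CONF = """
[ml2]
mechanism_drivers = genericswitch

[genericswitch:switch-a]
device_type = example_device_type
"""

SAMPLE_PORTS_JSON = """
[
  {"UUID": "port-1", "Node UUID": "node-1",
   "Local Link Connection": {"switch_info": "switch-a", "port_id": "eth1"}},
  {"UUID": "port-2", "Node UUID": "node-2",
   "Local Link Connection": {"switch_info": "switch-z", "port_id": "eth7"}}
]
"""


def ngs_switch_names(ml2_conf_text: str) -> Set[str]:
    """Return switch names from sections of the form [genericswitch:<name>]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ml2_conf_text)
    prefix = "genericswitch:"
    return {name[len(prefix):] for name in parser.sections() if name.startswith(prefix)}


def ngs_managed_ports(ports: List[Dict], switches: Set[str]) -> List[str]:
    """Return UUIDs of ports whose switch_info names a configured switch."""
    managed = []
    for port in ports:
        link = port.get("Local Link Connection") or {}
        if link.get("switch_info") in switches:
            managed.append(port["UUID"])
    return managed


if __name__ == "__main__":
    switches = ngs_switch_names(SAMPLE_ML2_CONF)
    print(ngs_managed_ports(json.loads(SAMPLE_PORTS_JSON), switches))
```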
199+
200+
.. _ironic-node-discovery:
201+
202+
Ironic node discovery
203+
---------------------
204+
205+
Discovery is the process of PXE booting the nodes into the Ironic Python Agent (IPA) ramdisk. This ramdisk will collect hardware and networking configuration from the node in a process known as introspection. This data is used to populate the baremetal node object in Ironic. The series of steps you need to take to enrol a new node is as follows:
206+
207+
- Configure credentials on the |bmc|. These are needed for Ironic to be able to perform power control actions.
208+
209+
- Controllers should have network connectivity with the target |bmc|.
210+
211+
.. ifconfig:: deployment['kayobe_manages_physical_network']
212+
213+
- Add any additional switch configuration to kayobe config.
214+
The minimal switch configuration that kayobe needs to know about is described in :ref:`tor-switch-configuration`.
215+
216+
- Apply any :ref:`static switch configuration <static-switch-config>`. This performs the initial
217+
setup of the switchports that is needed before Ironic can take over. The static configuration
218+
will not be modified by Ironic, so it should be safe to reapply at any point. See :ref:`ngs-commands`
219+
for details about the switch configuration that Networking Generic Switch will apply.
220+
221+
.. ifconfig:: deployment['kayobe_manages_physical_network']
222+
223+
- Put the node onto the provisioning network by using the ``--enable-discovery`` flag. See :ref:`static-switch-config`.
224+
225+
* This is only necessary to initially discover the node. Once the node is registered in Ironic,
226+
Ironic will take over control of the VLAN membership. See :ref:`dynamic-switch-configuration`.
227+
228+
* This provides ethernet connectivity with the controllers over the `workload provisioning` network
229+
230+
.. ifconfig:: not deployment['kayobe_manages_physical_network']
231+
232+
- Put the node onto the provisioning network.
233+
234+
.. TODO: link to the relevant file in kayobe config
235+
236+
- Add node to the kayobe inventory.
237+
238+
.. TODO: Fill in details about necessary BIOS & RAID config
239+
240+
- Apply any necessary BIOS & RAID configuration.
241+
242+
.. TODO: Fill in details about how to trigger a PXE boot
243+
244+
- PXE boot the node.
245+
246+
.. _tor-switch-configuration:
247+
248+
Top of Rack (ToR) switch configuration
249+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
250+
251+
Networking Generic Switch must be aware of the Top-of-Rack switch connected to the new node.
252+
Switches managed by NGS are configured in ``ml2_conf.ini``.
253+
254+
.. TODO: Fill in details about how switches are added to NGS config in kayobe-config
255+
256+
After adding switches to the NGS configuration, Neutron must be redeployed.
257+
258+
Considerations when booting baremetal compared to VMs
259+
------------------------------------------------------
260+
261+
- You can only use networks of type: vlan
262+
- Without using trunk ports, it is only possible to directly attach one network to each port or port group of an instance.
263+
264+
* To access other networks you can use routers
265+
* You can still attach floating IPs
266+
267+
- Instances take much longer to provision (expect at least 15 mins)
268+
- When booting an instance, use a flavor that maps to a baremetal node via the ``RESOURCE_CLASS`` configured on the flavor.
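The last point relies on the flavor requesting a custom resource class that matches the node's ``resource_class``. As documented upstream for Ironic and Nova, the node's resource class is upper-cased, non-alphanumeric characters are replaced with underscores, and ``CUSTOM_`` is prepended. The sketch below derives the flavor property name under that rule; the function name is hypothetical, and the sketch does not handle names that already start with ``CUSTOM_``.

```python
import re


def flavor_resource_property(resource_class: str) -> str:
    """Return the flavor property name that requests one bare metal node.

    Example: a node resource class of "baremetal.GOLD" corresponds to the
    flavor property "resources:CUSTOM_BAREMETAL_GOLD" (set to 1).
    """
    normalised = re.sub(r"[^A-Z0-9]", "_", resource_class.upper())
    return f"resources:CUSTOM_{normalised}"


if __name__ == "__main__":
    print(flavor_resource_property("baremetal.GOLD") + "=1")
```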
File renamed without changes.
