Commit 1588ebd

Merge pull request #109 from stackhpc/upstream/2023.1-2024-01-08
Synchronise 2023.1 with upstream
2 parents 7212aa9 + 54acc8a

40 files changed: +1468 −573 lines

doc/source/admin/ovn/external_ports.rst

Lines changed: 155 additions & 57 deletions
@@ -10,7 +10,7 @@ ML2/OVN leverages the use of OVN's `external ports
 feature.
 
 What is it
-~~~~~~~~~~
+----------
 
 The external ports feature in OVN allows for setting up a port that lives
 externally to the instance and is reponsible for replying to ARP requests
@@ -28,18 +28,107 @@ following VNICs:
 * macvtap
 * baremetal
 
-Also, ports of the type ``external`` will be scheduled on the gateway
-nodes (controller or networker nodes) in HA mode by the OVN Neutron
-driver. Check the `OVN Database information`_ section for more
-information.
+These ports can be listed in OVN with following command:
 
-OVN Database information
-~~~~~~~~~~~~~~~~~~~~~~~~
+.. code-block:: bash
+
+   $ ovn-nbctl find Logical_Switch_Port type=external
+   _uuid               : 105e83ae-252d-401b-a1a7-8d28ec28a359
+   ha_chassis_group    : [43047e7b-4c78-4984-9788-6263fcc69885]
+   type                : external
+   ...
+
+.. end
+
+The next section will talk more about the different configurations for
+scheduling these ports and how they are represented in the OVN database.
+
+Scheduling and database information
+-----------------------------------
+
+Ports of the type ``external`` will be scheduled on nodes
+marked to host these type of ports via the `ovn-cms-options
+<http://www.ovn.org/support/dist-docs/ovn-controller.8.html>`_
+configuration. There are two supported configurations for these nodes:
+
+1. ``enable-chassis-as-extport-host``
+2. ``enable-chassis-as-gw``
+
+These options can be set by running the following command locally on each
+node that will act as a candidate to host these ports:
+
+.. code-block:: bash
 
-the ML2/OVN driver identifies a gateway node by the
-``ovn-cms-options=enable-chassis-as-gw`` and ``ovn-bridge-mappings``
-options in the external_ids column from the ``Chassis`` table in the
-OVN Southbound database:
+   $ ovs-vsctl set Open_vSwitch . external-ids:ovn-cms-options=\"enable-chassis-as-extport-host\"
+
+   $ ovs-vsctl set Open_vSwitch . external-ids:ovn-cms-options=\"enable-chassis-as-gw\"
+
+.. end
+
+The sections below will explain the differences between the two
+configuration values.
+
+Configuration: ``enable-chassis-as-extport-host``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When nodes in the cluster are marked with the
+``enable-chassis-as-extport-host`` configuration, the ML2/OVN driver
+will schedule the external ports onto these nodes. This configuration
+takes precedence over ``enable-chassis-as-gw``.
+
+With this configuration, the ML2/OVN driver will create one
+``HA_Chassis_Group`` per external port and it will be named as
+``neutron-extport-<Neutron Port UUID>``. For example:
+
+.. code-block:: bash
+
+   $ ovn-sbctl list Chassis
+   _uuid               : fa24d475-9664-4a62-bb1c-52a6fa4966f7
+   external_ids        : {ovn-cms-options=enable-chassis-as-extport-host, ...}
+   hostname            : compute-0
+   name                : "6fd9cef6-4e9d-4bde-ab82-016c2461957b"
+   ...
+   _uuid               : a29ee8f6-5301-45f5-b280-a43e533d4d65
+   external_ids        : {ovn-cms-options=enable-chassis-as-extport-host, ...}
+   hostname            : compute-1
+   name                : "4fa76c10-c6ea-4ae9-b31c-bc69103fe6f9"
+   ...
+
+.. end
+
+.. code-block:: bash
+
+   $ ovn-nbctl list HA_Chassis_Group neutron-extport-392a77f9-7c48-4ad0-bd06-8b55bba00bd1
+   _uuid               : 1249b761-24e3-414e-ae10-7e880e9d3cf8
+   external_ids        : {"neutron:availability_zone_hints"=""}
+   ha_chassis          : [0d6b9718-7718-45d2-a838-1deb40131442, ae6e64e7-f948-49b3-a171-c9cfb58c8b31]
+   name                : neutron-extport-392a77f9-7c48-4ad0-bd06-8b55bba00bd1
+
+.. end
+
+Also, for HA, there will be a limit of five Chassis per
+``HA_Chassis_Group``, meaning that even if there are more nodes marked
+with the ``enable-chassis-as-extport-host`` option, each group will
+contain up to five members. This limit has been imposed because OVN uses
+BFD to monitor the connectivity of each member in the group, and having
+an unlimited number of members can potentially put a lot of stress on OVN.
+
+In general, this option is used when there are specific requirements
+for ``external`` ports and they can not be scheduled on controllers or
+gateway nodes. The next configuration does the opposite and uses the
+nodes marked as gateway to schedule the ``external`` ports.
+
+Configuration: ``enable-chassis-as-gw``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For the majority of use cases where there are no special requirements
+for the ``external`` ports and they can be co-located with gateway ports,
+this configuration should be used.
+
+Gateway nodes are identified by the
+``enable-chassis-as-gw`` and `ovn-bridge-mappings
+<http://www.ovn.org/support/dist-docs/ovn-controller.8.html>`_
+configurations:
 
 .. code-block:: bash
 
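The per-port group naming and the five-member cap described in the added text can be illustrated with a minimal Python sketch. This is not the actual ML2/OVN scheduler code; the helper names `extport_group_name` and `pick_ha_chassis` are hypothetical:

```python
# Illustrative sketch of the rules described in the hunk above:
# one HA_Chassis_Group per external port, named after the Neutron port
# UUID, with group membership capped at five chassis.

MAX_GROUP_MEMBERS = 5  # OVN monitors members via BFD, hence the cap


def extport_group_name(port_uuid):
    # Groups are named neutron-extport-<Neutron Port UUID>
    return "neutron-extport-%s" % port_uuid


def pick_ha_chassis(candidates):
    # Even if more nodes carry enable-chassis-as-extport-host,
    # only up to five become members of a given group
    return candidates[:MAX_GROUP_MEMBERS]


members = pick_ha_chassis(["chassis-%d" % i for i in range(8)])
print(extport_group_name("392a77f9-7c48-4ad0-bd06-8b55bba00bd1"))
print(len(members))  # 5, despite eight candidate nodes
```

The slicing stands in for whatever ordering the driver actually applies when choosing members; only the cap and the naming scheme are taken from the text above.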
@@ -52,67 +141,81 @@ OVN Southbound database:
 
 .. end
 
-For more information about both of these options, please
-take a look at the `ovn-controller documentation
-<http://www.ovn.org/support/dist-docs/ovn-controller.8.html>`_.
+As mentioned in the `What is it`_ section, every time a Neutron port
+with a certain VNIC is created the OVN driver will create a port of the
+type ``external`` in the OVN Northbound database.
 
-These options can be set by running the following command locally on each
-gateway node (note, the ``ovn-bridge-mappings`` will need to be adapted
-to your environment):
+When the ``enable-chassis-as-gw`` configuration is used, the ML2/OVN
+driver will create one ``HA_Chassis_Group`` per network (instead
+of one per external port in the previous case) and it will be named as
+``neutron-<Neutron Network UUID>``.
+
+All ``external`` ports belonging to this network will share the same
+``HA_Chassis_Group`` and the group is also limited to a maximum of five
+members for HA.
 
 .. code-block:: bash
 
-   $ ovs-vsctl set Open_vSwitch . external-ids:ovn-cms-options=\"enable-chassis-as-gw\" external-ids:ovn-bridge-mappings=\"public:br-ex\"
+   $ ovn-nbctl list HA_Chassis_Group
+   _uuid               : 43047e7b-4c78-4984-9788-6263fcc69885
+   external_ids        : {"neutron:availability_zone_hints"=""}
+   ha_chassis          : [3005bf84-fc95-4361-866d-bfa1c980adc8, 72c7671e-dd48-4100-9741-c47221672961]
+   name                : neutron-4b2944ca-c7a3-4cf6-a9c8-6aa541a20535
 
 .. end
 
-As mentioned in the `What is it`_ section, every time a Neutron port
-with a certain VNIC is created the OVN driver will create a port of the
-type ``external`` in the OVN Northbound database. These ports can be
-found by issuing the following command:
+High availability
+-----------------
+
+As hinted above, the ML2/OVN driver does provide high availability to the
+``external`` ports. This is done via the ``HA_Chassis_Group`` mechanism
+from OVN.
+
+On every ``external`` port there will be a column called
+``ha_chassis_group`` which points to the ``HA_Chassis_Group`` that the
+port belongs to:
 
 .. code-block:: bash
 
-   $ ovn-nbctl find Logical_Switch_Port type=external
-   _uuid               : 105e83ae-252d-401b-a1a7-8d28ec28a359
-   ha_chassis_group    : [43047e7b-4c78-4984-9788-6263fcc69885]
-   type                : external
-   ...
+   $ ovn-nbctl find logical_switch_port type=external
+   ha_chassis_group    : 924fd0fe-3e84-4eaa-aa1d-41103ec511e5
+   name                : "287040d6-0936-4363-ae0a-2d5a239e55fa"
+   type                : external
+   ...
 
 .. end
 
-The ``ha_chassis_group`` column indicates which HA Chassis Group that
-port belongs to, to find that group do:
+In the ``HA_Chassis_Group``, the members of each group are listed in the
+``ha_chassis`` column:
 
 .. code-block:: bash
 
-   # The UUID is the one from the ha_chassis_group column from
-   # the Logical_Switch_Port table
-   $ ovn-nbctl list HA_Chassis_Group 43047e7b-4c78-4984-9788-6263fcc69885
-   _uuid               : 43047e7b-4c78-4984-9788-6263fcc69885
-   external_ids        : {}
-   ha_chassis          : [3005bf84-fc95-4361-866d-bfa1c980adc8, 72c7671e-dd48-4100-9741-c47221672961]
-   name                : neutron-4b2944ca-c7a3-4cf6-a9c8-6aa541a20535
+   $ ovn-nbctl list HA_Chassis_Group 924fd0fe-3e84-4eaa-aa1d-41103ec511e5
+   _uuid               : 924fd0fe-3e84-4eaa-aa1d-41103ec511e5
+   external_ids        : {"neutron:availability_zone_hints"=""}
+   ha_chassis          : [3005bf84-fc95-4361-866d-bfa1c980adc8, 72c7671e-dd48-4100-9741-c47221672961]
+   name                : neutron-extport-287040d6-0936-4363-ae0a-2d5a239e55fa
 
 .. end
 
 .. note::
-   The external ports will be placed on a HA Chassis Group for the
-   network that the port belongs to. Those HA Chassis Groups are named as
-   ``neutron-<Neutron Network UUID>``, as seeing in the output above. You
-   can also use this "name" with the ``ovn-nbctl list`` command when
-   searching for a specific HA Chassis Group.
-
-   The chassis that are members of the HA Chassis Group are listed in
-   the ``ha_chassis`` column. Those are the gateway nodes (controller
-   or networker nodes) in the deployment and it's where the ``external``
-   ports will be scheduled. In order to find which gateway node the external
-   ports are scheduled on use the following command:
+
+   There will be a maximum of five members for each group, this limit
+   has been imposed because OVN uses BFD to monitor the connectivity of
+   each member in the group, and having an unlimited number of members
+   can potentially put a lot of stress on OVN.
+
+.. end
+
+When listing the members of a group there will be a column called
+``priority`` that contains a numerical value, the member with the highest
+``priority`` is the chassis where the ports will be scheduled on. OVN
+will monitor each member via BFD protocol, and if the chassis that is
+hosting the ports goes down, the ports will be automatically scheduled
+on the next chassis with the highest priority that is alive.
 
 .. code-block:: bash
 
-   # The UUIDs are the UUID members of the HA Chassis Group
-   # (ha_chassis column from the HA_Chassis_Group table)
    $ ovn-nbctl list HA_Chassis 3005bf84-fc95-4361-866d-bfa1c980adc8 72c7671e-dd48-4100-9741-c47221672961
    _uuid               : 3005bf84-fc95-4361-866d-bfa1c980adc8
    chassis_name        : "1a462946-ccfd-46a6-8abf-9dca9eb558fb"
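The priority-based failover rule described in the hunk above (ports live on the alive member with the highest ``priority``; on BFD-detected failure they move to the next-highest alive member) can be sketched in Python. The data layout and the `schedule_external_port` helper are hypothetical stand-ins, not the OVN or Neutron API:

```python
# Sketch of the HA_Chassis_Group failover rule: the external ports are
# hosted on the alive member with the highest priority value.

def schedule_external_port(ha_chassis):
    """Return the chassis_name that hosts the ports, or None if all are down."""
    alive = [c for c in ha_chassis if c["alive"]]
    if not alive:
        return None
    return max(alive, key=lambda c: c["priority"])["chassis_name"]


group = [
    {"chassis_name": "gw-0", "priority": 32767, "alive": True},
    {"chassis_name": "gw-1", "priority": 32766, "alive": True},
]
print(schedule_external_port(group))  # gw-0
group[0]["alive"] = False             # BFD reports gw-0 as down
print(schedule_external_port(group))  # gw-1
```

The ``alive`` flag models what OVN derives from BFD sessions; in the real database the liveness is not a column on ``HA_Chassis``.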
@@ -126,11 +229,6 @@ ports are scheduled on use the following command:
 
 .. end
 
-Note the ``priority`` column from the previous command, the chassis with
-the highest ``priority`` from that list is the chassis that will have
-the external ports scheduled on it. In our example above, the chassis
-with the UUID ``1a462946-ccfd-46a6-8abf-9dca9eb558fb`` is the one.
-
-Whenever the chassis with the highest priority goes down, the ports will
-be automatically scheduled on the next chassis with the highest priority
-which is alive. So, the external ports are HA out of the box.
+In the example above, the Chassis with the UUID
+``1a462946-ccfd-46a6-8abf-9dca9eb558fb`` is the one that is hosting the
+external port ``287040d6-0936-4363-ae0a-2d5a239e55fa``.

neutron/agent/linux/external_process.py

Lines changed: 22 additions & 8 deletions
@@ -87,7 +87,19 @@ def enable(self, cmd_callback=None, reload_cfg=False, ensure_active=False):
         if not self.active:
             if not cmd_callback:
                 cmd_callback = self.default_cmd_callback
-            cmd = cmd_callback(self.get_pid_file_name())
+            # Always try and remove the pid file, as it's existence could
+            # stop the process from starting
+            pid_file = self.get_pid_file_name()
+            try:
+                utils.delete_if_exists(pid_file, run_as_root=self.run_as_root)
+            except Exception as e:
+                LOG.error("Could not delete file %(pid_file)s, %(service)s "
+                          "could fail to start. Exception: %(exc)s",
+                          {'pid_file': pid_file,
+                           'service': self.service,
+                           'exc': e})
+
+            cmd = cmd_callback(pid_file)
 
             ip_wrapper = ip_lib.IPWrapper(namespace=self.namespace)
             ip_wrapper.netns.execute(cmd, addl_env=self.cmd_addl_env,
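The hunk above removes a possibly-stale pid file before spawning, and only logs on failure so the start is still attempted. A standalone sketch of that pattern follows; `delete_if_exists` and `prepare_pid_file` here are simplified stand-ins for neutron's helpers, not the actual implementation:

```python
# Sketch: best-effort removal of a stale pid file before starting a daemon.
# A leftover pid file could otherwise stop the process from starting.
import logging
import os

LOG = logging.getLogger(__name__)


def delete_if_exists(path):
    # Treat "already gone" as success: absence is the desired end state
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def prepare_pid_file(pid_file):
    # Failure to delete is logged, not raised, so startup can proceed
    try:
        delete_if_exists(pid_file)
    except Exception as e:
        LOG.error("Could not delete file %s, service could fail to "
                  "start. Exception: %s", pid_file, e)


prepare_pid_file("/tmp/example-sketch.pid")  # a no-op if the file is absent
```

Note the real change also passes `run_as_root`, since the pid files may be owned by root inside a namespace.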
@@ -99,12 +111,14 @@ def enable(self, cmd_callback=None, reload_cfg=False, ensure_active=False):
 
     def reload_cfg(self):
         if self.custom_reload_callback:
-            self.disable(get_stop_command=self.custom_reload_callback)
+            self.disable(get_stop_command=self.custom_reload_callback,
+                         delete_pid_file=False)
         else:
-            self.disable('HUP')
+            self.disable('HUP', delete_pid_file=False)
 
-    def disable(self, sig='9', get_stop_command=None):
+    def disable(self, sig='9', get_stop_command=None, delete_pid_file=True):
         pid = self.pid
+        delete_pid_file = delete_pid_file or sig == '9'
 
         if self.active:
             if get_stop_command:
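The new `disable()` keeps the pid file for reload-style calls (`delete_pid_file=False`) but a hard kill (`sig='9'`) always wins and removes it. The decision rule from the hunk above, isolated into a small sketch (the function name is ours, not neutron's):

```python
# The pid-file deletion rule from the new disable() signature:
# delete_pid_file = delete_pid_file or sig == '9'

def should_delete_pid_file(sig='9', delete_pid_file=True):
    # reload_cfg() passes delete_pid_file=False so the running process
    # keeps its pid file across a SIGHUP; SIGKILL ('9') always deletes
    return delete_pid_file or sig == '9'


print(should_delete_pid_file('HUP', delete_pid_file=False))  # False
print(should_delete_pid_file('9', delete_pid_file=False))    # True
```

This explains why `reload_cfg()` had to change in the same hunk: without `delete_pid_file=False`, a config reload would have removed the pid file of a process that is still running.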
@@ -118,10 +132,10 @@ def disable(self, sig='9', get_stop_command=None):
             utils.execute(cmd, addl_env=self.cmd_addl_env,
                           run_as_root=self.run_as_root,
                           privsep_exec=True)
-            # In the case of shutting down, remove the pid file
-            if sig == '9':
-                utils.delete_if_exists(self.get_pid_file_name(),
-                                       run_as_root=self.run_as_root)
+
+            if delete_pid_file:
+                utils.delete_if_exists(self.get_pid_file_name(),
+                                       run_as_root=self.run_as_root)
         elif pid:
             LOG.debug('%(service)s process for %(uuid)s pid %(pid)d is stale, '
                       'ignoring signal %(signal)s',

neutron/agent/linux/keepalived.py

Lines changed: 0 additions & 12 deletions
@@ -427,15 +427,6 @@ def _output_config_file(self):
 
         return config_path
 
-    @staticmethod
-    def _safe_remove_pid_file(pid_file):
-        try:
-            os.remove(pid_file)
-        except OSError as e:
-            if e.errno != errno.ENOENT:
-                LOG.error("Could not delete file %s, keepalived can "
-                          "refuse to start.", pid_file)
-
     def get_vrrp_pid_file_name(self, base_pid_file):
         return '%s-vrrp' % base_pid_file
 
@@ -516,9 +507,6 @@ def callback(pid_file):
             if vrrp_pm.active:
                 vrrp_pm.disable()
 
-            self._safe_remove_pid_file(pid_file)
-            self._safe_remove_pid_file(self.get_vrrp_pid_file_name(pid_file))
-
             cmd = ['keepalived', '-P',
                    '-f', config_path,
                    '-p', pid_file,

neutron/agent/metadata/driver.py

Lines changed: 8 additions & 3 deletions
@@ -269,7 +269,13 @@ def spawn_monitored_metadata_proxy(cls, monitor, ns_name, port, conf,
         pm = cls._get_metadata_proxy_process_manager(uuid, conf,
                                                      ns_name=ns_name,
                                                      callback=callback)
-        pm.enable()
+        try:
+            pm.enable()
+        except exceptions.ProcessExecutionError as exec_err:
+            LOG.error("Encountered process execution error %(err)s while "
+                      "starting process in namespace %(ns)s",
+                      {"err": exec_err, "ns": ns_name})
+            return
         monitor.register(uuid, METADATA_SERVICE_NAME, pm)
         cls.monitors[router_id] = pm
 
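The hunk above guards the spawn so that a metadata proxy which failed to start is never registered with the process monitor. A generic sketch of that catch-log-return pattern; `ProcessExecutionError` here is a local stand-in and `spawn_and_register` is our name, not neutron's:

```python
# Sketch: only register a process with its monitor if it actually started.
import logging

LOG = logging.getLogger(__name__)


class ProcessExecutionError(Exception):
    """Stand-in for neutron_lib's exception of the same name."""


def spawn_and_register(pm, monitor, uuid, ns_name):
    try:
        pm.enable()
    except ProcessExecutionError as exec_err:
        LOG.error("Encountered process execution error %s while starting "
                  "process in namespace %s", exec_err, ns_name)
        return False  # bail out: do not register a process that never started
    monitor.register(uuid, "metadata-proxy", pm)
    return True
```

Returning early keeps the monitor's view consistent: before this change, a failed spawn would still be registered and the monitor would try to manage a non-existent process.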

@@ -288,9 +294,8 @@ def destroy_monitored_metadata_proxy(cls, monitor, uuid, conf, ns_name):
                   pm.pid, SIGTERM_TIMEOUT)
         pm.disable(sig=str(int(signal.SIGKILL)))
 
-        # Delete metadata proxy config and PID files.
+        # Delete metadata proxy config.
         HaproxyConfigurator.cleanup_config_file(uuid, cfg.CONF.state_path)
-        linux_utils.delete_if_exists(pm.get_pid_file_name(), run_as_root=True)
 
         cls.monitors.pop(uuid, None)