
Commit 85908fc

Merge pull request #41167 from skrthomas/pr38580
OSDOCS-2681: Fixes added to #38580
2 parents f5b91b6 + e943527 commit 85908fc

30 files changed

+1036
-64
lines changed

_topic_maps/_topic_map.yml

Lines changed: 6 additions & 0 deletions
@@ -1121,8 +1121,14 @@ Topics:
11211121
File: metallb-operator-install
11221122
- Name: Configuring MetalLB address pools
11231123
File: metallb-configure-address-pools
1124+
- Name: Configuring MetalLB BGP peers
1125+
File: metallb-configure-bgp-peers
1126+
- Name: Configuring MetalLB BFD profiles
1127+
File: metallb-configure-bfd-profiles
11241128
- Name: Configuring services to use MetalLB
11251129
File: metallb-configure-services
1130+
- Name: MetalLB troubleshooting and support
1131+
File: metallb-troubleshoot-support
11261132
- Name: Associating secondary interfaces metrics to network attachments
11271133
File: associating-secondary-interfaces-metrics-to-network-attachments
11281134
---

images/209_OpenShift_BGP_0122.png

111 KB
Lines changed: 49 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,15 @@
1+
12
:_content-type: CONCEPT
3+
// Module included in the following assemblies:
4+
//
5+
// * networking/metallb/metallb-configure-address-pools.adoc
26
[id="nw-metallb-addresspool-cr_{context}"]
37
= About the address pool custom resource
48

59
The fields for the address pool custom resource are described in the following table.
610

711
.MetalLB address pool custom resource
8-
[cols="1,1,3", options="header"]
12+
[cols="1,1,3a", options="header"]
913
|===
1014

1115
|Field
@@ -26,7 +30,7 @@ Specify the same namespace that the MetalLB Operator uses.
2630
|`spec.protocol`
2731
|`string`
2832
|Specifies the protocol for announcing the load balancer IP address to peer nodes.
29-
The only supported value is `layer2`.
33+
Specify `layer2` or `bgp`.
3034

3135
|`spec.autoAssign`
3236
|`boolean`
@@ -40,31 +44,48 @@ The default value is `true`.
4044
You can specify multiple ranges in a single pool.
4145
Specify each range in CIDR notation or as starting and ending IP addresses separated with a hyphen.
4246

47+
|`spec.bgpAdvertisements`
48+
|`object`
49+
|Optional: By default, BGP mode advertises each allocated load-balancer IP address to the configured peers with no additional BGP attributes.
50+
The peer routers receive one `/32` route for each service IP address, with the BGP local preference set to zero and no BGP communities.
51+
Use this field to create custom advertisements.
52+
53+
|===
54+
55+
The fields for the `bgpAdvertisements` object are defined in the following table:
56+
57+
.BGP advertisements configuration
58+
[cols="1,1,3a", options="header"]
4359
|===
4460

45-
////
46-
.Address pool object
47-
[source,yaml]
48-
----
49-
apiVersion: metallb.io/v1alpha1
50-
kind: AddressPool
51-
metadata:
52-
name: <pool_name> <.>
53-
namespace: metallb-system <.>
54-
spec:
55-
protocol: <protocol_type> <.>
56-
autoAssign: true <.>
57-
addresses: <.>
58-
- <range_or_CIDR>
59-
...
60-
----
61-
<.> Specify the name for the address pool. When you add a service, you can specify this pool name in the `metallb.universe.tf/address-pool` annotation to select an IP address from a specific pool.
62-
63-
<.> Specify the namespace for the address pool.
64-
65-
<.> Specify the protocol for announcing the load balancer IP address to peer nodes. The only supported value is `layer2`.
66-
67-
<.> Optional: Specify whether MetalLB automatically assigns IP addresses from this pool. Specify `false` if you want explicitly request an IP address from this pool with the `metallb.universe.tf/address-pool` annotation. The default value is `true`.
68-
69-
<.> Specify a list of IP addresses for MetalLB to assign to services. You can specify multiple ranges in a single pool. Specify each range in CIDR notation or as starting and ending IP addresses separated with a hyphen.
70-
////
61+
|Field
62+
|Type
63+
|Description
64+
65+
|`aggregationLength`
66+
|`integer`
67+
|Optional: Specifies the number of bits to include in a 32-bit CIDR mask.
68+
To aggregate the routes that the speaker advertises to BGP peers, the mask is applied to the routes for several service IP addresses and the speaker advertises the aggregated route.
69+
For example, with an aggregation length of `24`, the speaker can aggregate several `10.0.1.x/32` service IP addresses and advertise a single `10.0.1.0/24` route.
70+
71+
|`aggregationLengthV6`
72+
|`integer`
73+
|Optional: Specifies the number of bits to include in a 128-bit CIDR mask.
74+
For example, with an aggregation length of `124`, the speaker can aggregate several `fc00:f853:0ccd:e799::x/128` service IP addresses and advertise a single `fc00:f853:0ccd:e799::0/124` route.
75+
76+
|`communities`
77+
|`array`
78+
|Optional: Specifies one or more BGP communities.
79+
Each community is specified as two 16-bit values separated by the colon character.
80+
Well-known communities must also be specified as pairs of 16-bit values, not by name:
81+
82+
* `NO_EXPORT`: `65535:65281`
83+
* `NO_ADVERTISE`: `65535:65282`
84+
* `NO_EXPORT_SUBCONFED`: `65535:65283`
85+
86+
|`localPref`
87+
|`integer`
88+
|Optional: Specifies the local preference for this advertisement.
89+
This BGP attribute applies to BGP sessions within the Autonomous System.
90+
91+
|===
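The fields in the preceding table can be combined in a single advertisement. The following is a minimal sketch, not a recommended configuration: the pool name, address range, and local preference value are hypothetical, and the `communities` key follows the spelling used in the YAML example in the BGP limitations module.

```yaml
apiVersion: metallb.io/v1beta1
kind: AddressPool
metadata:
  name: doc-example-bgp-adv        # hypothetical pool name
  namespace: metallb-system
spec:
  protocol: bgp
  addresses:
  - 10.0.1.0/24                    # hypothetical range, matching the table example
  bgpAdvertisements:
  - aggregationLength: 24          # advertise one 10.0.1.0/24 route instead of many /32 routes
    localPref: 100                 # hypothetical local preference value
    communities:
    - 65535:65281                  # NO_EXPORT, written as two 16-bit values
```

With this sketch, service IP addresses allocated from the pool would be advertised as a single aggregated route that carries the `NO_EXPORT` community.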
Lines changed: 82 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,82 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * networking/metallb/metallb-configure-bfd-profiles.adoc
4+
5+
[id="nw-metallb-bfdprofile-cr_{context}"]
6+
= About the BFD profile custom resource
7+
8+
The fields for the BFD profile custom resource are described in the following table.
9+
10+
.BFD profile custom resource
11+
[cols="1,1,3a",options="header"]
12+
|===
13+
14+
|Field
15+
|Type
16+
|Description
17+
18+
|`metadata.name`
19+
|`string`
20+
|Specifies the name for the BFD profile custom resource.
21+
22+
|`metadata.namespace`
23+
|`string`
24+
|Specifies the namespace for the BFD profile custom resource.
25+
26+
|`spec.detectMultiplier`
27+
|`integer`
28+
|Specifies the detection multiplier to determine packet loss.
29+
The remote transmission interval is multiplied by this value to determine the connection loss detection timer.
30+
31+
For example, when the local system has the detect multiplier set to `3` and the remote system has the transmission interval set to `300` ms, the local system detects failures only after `900` ms without receiving packets.
32+
33+
The range is `2` to `255`.
34+
The default value is `3`.
35+
36+
|`spec.echoMode`
37+
|`boolean`
38+
|Specifies the echo transmission mode.
39+
If you are not using distributed BFD, echo transmission mode works only when the peer is also FRR.
40+
The default value is `false` and echo transmission mode is disabled.
41+
42+
When echo transmission mode is enabled, consider increasing the transmission interval of control packets to reduce bandwidth usage.
43+
For example, consider increasing the transmit interval to `2000` ms.
44+
45+
|`spec.echoInterval`
46+
|`integer`
47+
|Specifies the minimum transmission interval, less jitter, that this system uses to send and receive echo packets.
48+
The range is `10` to `60000`.
49+
The default value is `50` ms.
50+
51+
|`spec.minimumTtl`
52+
|`integer`
53+
|Specifies the minimum expected TTL for an incoming control packet.
54+
This field applies to multi-hop sessions only.
55+
56+
The purpose of setting a minimum TTL is to make the packet validation requirements more stringent and avoid receiving control packets from other sessions.
57+
58+
The default value is `254` and indicates that the system expects only one hop between this system and the peer.
59+
60+
|`spec.passiveMode`
61+
|`boolean`
62+
|Specifies whether a session is marked as active or passive.
63+
A passive session does not attempt to start the connection.
64+
Instead, a passive session waits for control packets from a peer before it begins to reply.
65+
66+
Marking a session as passive is useful when you have a router that acts as the central node of a star network and you want to avoid sending unnecessary control packets.
67+
68+
The default value is `false` and marks the session as active.
69+
70+
|`spec.receiveInterval`
71+
|`integer`
72+
|Specifies the minimum interval at which this system is capable of receiving control packets.
73+
The range is `10` to `60000`.
74+
The default value is `300` ms.
75+
76+
|`spec.transmitInterval`
77+
|`integer`
78+
|Specifies the minimum transmission interval, less jitter, that this system uses to send control packets.
79+
The range is `10` to `60000`.
80+
The default value is `300` ms.
81+
82+
|===
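As an illustration of how the fields in the preceding table fit together, the following sketch defines a BFD profile. The `apiVersion` and `kind` values are assumptions that are not stated in this module, and the profile name is hypothetical. The interval and multiplier values repeat the defaults from the table.

```yaml
apiVersion: metallb.io/v1beta1     # assumed API version
kind: BFDProfile                   # assumed kind name
metadata:
  name: doc-example-bfd-profile    # hypothetical profile name
  namespace: metallb-system
spec:
  receiveInterval: 300             # ms, default from the table
  transmitInterval: 300            # ms, default from the table
  detectMultiplier: 3              # with a 300 ms remote interval, failure is detected after 900 ms
  echoMode: false                  # default: echo transmission mode disabled
  passiveMode: true                # wait for control packets from the peer before replying
  minimumTtl: 254                  # default: expect one hop (multi-hop sessions only)
```

Only `passiveMode` departs from the defaults here. The other fields are spelled out to show where each table entry goes in the `spec` stanza.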
Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * networking/metallb/about-metallb.adoc
4+
5+
[id="nw-metallb-bgp-limitations_{context}"]
6+
= Limitations for BGP mode
7+
8+
[id="nw-metallb-bgp-limitations-break-connections_{context}"]
9+
== Node failure can break all active connections
10+
11+
MetalLB shares a limitation that is common to BGP-based load balancing.
12+
When a BGP session terminates, such as when a node fails or when a `speaker` pod restarts, the session termination might result in resetting all active connections.
13+
End users can experience a `Connection reset by peer` message.
14+
15+
The consequence of a terminated BGP session is implementation-specific for each router manufacturer.
16+
However, you can anticipate that a change in the number of `speaker` pods affects the number of BGP sessions and that active connections with BGP peers will break.
17+
18+
To avoid or reduce the likelihood of a service interruption, you can specify a node selector when you add a BGP peer.
19+
When you limit the number of nodes that start BGP sessions, a fault on a node that does not have a BGP session has no effect on connections to the service.
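The node selector approach might look like the following sketch. Apart from `spec.myASN`, the field names (`peerAddress`, `peerASN`, `nodeSelectors`), the `apiVersion`, the addresses, the ASN values, and the hostnames are assumptions for illustration and are not confirmed by this module.

```yaml
apiVersion: metallb.io/v1beta1     # assumed API version
kind: BGPPeer
metadata:
  name: doc-example-nodesel        # hypothetical peer name
  namespace: metallb-system
spec:
  peerAddress: 10.0.0.1            # hypothetical router address
  peerASN: 64501                   # hypothetical peer ASN
  myASN: 64500                     # hypothetical ASN for MetalLB
  nodeSelectors:                   # assumed field: only matching nodes start BGP sessions
  - matchExpressions:
    - key: kubernetes.io/hostname
      operator: In
      values: [compute-1.example.com, compute-2.example.com]
```

In this sketch, only the two named nodes would hold BGP sessions with the router, so a failure on any other node would not terminate a session.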
20+
21+
[id="nw-metallb-bgp-limitations-communities-values_{context}"]
22+
== Communities are specified as 16-bit values
23+
24+
Communities are specified as part of an address pool custom resource and are specified as 16-bit values separated by a colon.
25+
For example, to specify that load balancer IP addresses are advertised with the well-known `NO_ADVERTISE` community attribute, use notation like the following:
26+
27+
[source,yaml]
28+
----
29+
apiVersion: metallb.io/v1beta1
30+
kind: AddressPool
31+
metadata:
32+
name: doc-example-no-advertise
33+
namespace: metallb-system
34+
spec:
35+
protocol: bgp
36+
addresses:
37+
- 192.168.1.100-192.168.1.255
38+
bgpAdvertisements:
39+
- communities:
40+
- 65535:65282
41+
----
42+
43+
This restriction to 16-bit values differs from the community-supported implementation of MetalLB, which supports a `bgp-communities` field and readable names for BGP communities.
44+
45+
[id="nw-metallb-bgp-limitations-single-asn_{context}"]
46+
== Support for a single ASN and a single router ID only
47+
48+
When you add a BGP peer custom resource, you specify the `spec.myASN` field to identify the Autonomous System Number (ASN) that MetalLB belongs to.
49+
{product-title} uses an implementation of BGP with MetalLB that requires MetalLB to belong to a single ASN.
50+
If you attempt to add a BGP peer and specify a different value for `spec.myASN` than an existing BGP peer custom resource, you receive an error.
51+
52+
Similarly, when you add a BGP peer custom resource, the `spec.routerID` field is optional.
53+
If you specify a value for this field, you must specify the same value for all other BGP peer custom resources that you add.
54+
55+
Supporting only a single ASN and a single router ID differs from the community-supported implementation of MetalLB.
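To make the constraint concrete, the following sketch shows two BGP peer custom resources that share the same `spec.myASN` and `spec.routerID` values. The `apiVersion`, the `peerAddress` and `peerASN` fields, and all names, addresses, and ASN values are assumptions for illustration.

```yaml
apiVersion: metallb.io/v1beta1     # assumed API version
kind: BGPPeer
metadata:
  name: doc-example-peer-a         # hypothetical peer name
  namespace: metallb-system
spec:
  peerAddress: 10.0.0.1            # hypothetical router address
  peerASN: 64501                   # hypothetical peer ASN
  myASN: 64500                     # must match every other BGP peer custom resource
  routerID: 10.10.10.10            # optional, but must match if set on any peer
---
apiVersion: metallb.io/v1beta1     # assumed API version
kind: BGPPeer
metadata:
  name: doc-example-peer-b         # hypothetical peer name
  namespace: metallb-system
spec:
  peerAddress: 10.0.0.2            # a different hypothetical router
  peerASN: 64502                   # peers can belong to different Autonomous Systems
  myASN: 64500                     # same value as doc-example-peer-a
  routerID: 10.10.10.10            # same value as doc-example-peer-a
```

Changing `myASN` in only one of the two resources is the situation that the text describes as producing an error.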

modules/nw-metallb-bgp.adoc

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * networking/metallb/about-metallb.adoc
4+
5+
[id="nw-metallb-bgp_{context}"]
6+
= MetalLB concepts for BGP mode
7+
8+
In BGP mode, each `speaker` pod advertises the load balancer IP address for a service to each BGP peer.
9+
BGP peers are commonly network routers that are configured to use the BGP protocol.
10+
When a router receives traffic for the load balancer IP address, the router picks one of the nodes with a `speaker` pod that advertised the IP address.
11+
The router sends the traffic to that node.
12+
After traffic enters the node, the service proxy for the CNI network provider distributes the traffic to all the pods for the service.
13+
14+
The directly-connected router on the same layer 2 network segment as the cluster nodes can be configured as a BGP peer.
15+
If the directly-connected router is not configured as a BGP peer, you need to configure your network so that packets for load balancer IP addresses are routed between the BGP peers and the cluster nodes that run the `speaker` pods.
16+
17+
Each time a router receives new traffic for the load balancer IP address, it creates a new connection to a node.
18+
Each router manufacturer has an implementation-specific algorithm for choosing which node to initiate the connection with.
19+
However, the algorithms are commonly designed to distribute traffic across the available nodes for the purpose of balancing the network load.
20+
21+
If a node becomes unavailable, the router initiates a new connection with another node that has a `speaker` pod that advertises the load balancer IP address.
22+
23+
.MetalLB topology diagram for BGP mode
24+
image::209_OpenShift_BGP_0122.png["Speaker pods on host network 10.0.1.0/24 use BGP to advertise the load balancer IP address, 203.0.113.200, to a router."]
25+
26+
The preceding graphic shows the following concepts related to MetalLB:
27+
28+
* An application is available through a service that has an IPv4 cluster IP on the `172.130.0.0/16` subnet.
29+
That IP address is accessible from inside the cluster.
30+
The service also has an external IP address that MetalLB assigned to the service, `203.0.113.200`.
31+
32+
* Nodes 2 and 3 have a pod for the application.
33+
34+
* The `speaker` daemon set runs a pod on each node.
35+
The MetalLB Operator starts these pods.
36+
You can configure MetalLB to specify which nodes run the `speaker` pods.
37+
38+
* Each `speaker` pod is a host-networked pod.
39+
The IP address for the pod is identical to the IP address for the node on the host network.
40+
41+
* Each `speaker` pod starts a BGP session with all BGP peers and advertises the load balancer IP addresses or aggregated routes to the BGP peers.
42+
The `speaker` pods advertise that they are part of Autonomous System 65010.
43+
The diagram shows a router, R1, as a BGP peer within the same Autonomous System.
44+
However, you can configure MetalLB to start BGP sessions with peers that belong to other Autonomous Systems.
45+
46+
* All the nodes with a `speaker` pod that advertises the load balancer IP address can receive traffic for the service.
47+
48+
** If the external traffic policy for the service is set to `Cluster`, then all the `speaker` pods advertise the `203.0.113.200` load balancer IP address and all the nodes with a `speaker` pod can receive traffic for the service.
49+
At least one endpoint for the service must be in the `Ready` condition. The host prefix is advertised to the router peer only if the external traffic policy is set to `Cluster`.
50+
51+
** If the external traffic policy for the service is set to `Local`, then only the `speaker` pods that are on the same node as an endpoint in the `Ready` condition for the service can advertise the load balancer IP address. Only those nodes can receive traffic for the service.
52+
In the preceding graphic, nodes 2 and 3 would advertise `203.0.113.200`.
53+
54+
* You can configure MetalLB to control which `speaker` pods start BGP sessions with specific BGP peers by specifying a node selector when you add a BGP peer custom resource.
55+
56+
* Any routers, such as R1, that are configured to use BGP can be set as BGP peers.
57+
58+
* Client traffic is routed to one of the nodes on the host network.
59+
After traffic enters the node, the service proxy sends the traffic to the application pod on the same node or another node according to the external traffic policy that you set for the service.
60+
61+
* If a node becomes unavailable, the router detects the failure and initiates a new connection with another node.
62+
You can configure MetalLB to use a Bidirectional Forwarding Detection (BFD) profile for BGP peers.
63+
BFD provides faster link failure detection so that routers can initiate new connections earlier than without BFD.
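The external traffic policy discussed in the preceding list is set on the service itself. The following is a minimal sketch of a service that requests a load balancer IP address and restricts advertisement to nodes with a ready endpoint. The service name, selector, ports, and pool name are hypothetical. The Kubernetes API spells the policy values `Cluster` and `Local`.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: doc-example-service                        # hypothetical service name
  annotations:
    metallb.universe.tf/address-pool: doc-example  # hypothetical pool name
spec:
  type: LoadBalancer               # MetalLB assigns the external IP address
  externalTrafficPolicy: Local     # only nodes with a Ready endpoint advertise the IP address
  selector:
    app: doc-example               # hypothetical pod label
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
```

In the topology that the diagram describes, a service like this would be advertised by nodes 2 and 3 only. Setting the policy to `Cluster` would let every node with a `speaker` pod advertise it.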
