RHEL-97088: set slaves down before bond reactivation #5229
Conversation
Some Cisco switches disable peer ports upon receiving LACP packets with agg=0 during negotiation, disrupting the network. Cisco attributes this behavior to ambiguity in the IEEE spec. Such packets are sent when slaves are removed during NetworkManager bond reactivation. To prevent this, set slaves down before reactivation. This is a safe change, as NetworkManager bond reactivation will re-add and bring them up. Signed-off-by: Xin Long <[email protected]>
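For context, a minimal sketch of what "set slaves down before bond reactivation" could amount to in a shell script. The device/profile name `bond0` is hypothetical and this is not the actual PR diff:

```bash
#!/bin/bash
# Hypothetical sketch, not the PR's code. "bond0" is an example name.
bond=bond0

# List the kernel interfaces currently enslaved to the bond.
slaves=$(ip -o link show master "$bond" | awk -F': ' '{print $2}')

# Bring each slave link down first, so that when NM later detaches
# the port during reactivation, the kernel cannot transmit the
# LACPDU with agg=0 that trips the affected switches.
for slave in $slaves; do
  ip link set dev "$slave" down
done

# Reactivate the bond; NM re-adds the slaves and brings them up.
nmcli connection up "$bond"
```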
@lxin: This pull request references RHEL-97088, which is a valid jira issue. Warning: the referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.20.0" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Hi @lxin. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: lxin. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
So throughout the existing code there is a very clear intention not to deactivate bond slaves. The reason is that we have had to deal with races in NetworkManager when we inadvertently caused multiple slaves to recycle at the same time and the bond ended up inactive. Forcefully deactivating a slave, while achievable with a one-line change as shown in this PR, makes much of what is said and done in the surrounding code meaningless:
Other comments that I have:
- This feels like a hack (much like what we already have in place, though that makes me even less inclined to change it further) to fix a specific problem; it might have an overall impact and does not consider the general approach of the surrounding code.
- We have other approaches available for customers to configure their own network (instead of relying on this script) that seem more appropriate for customer-specific circumstances. The primary reason we are providing that alternative is to avoid overloading this script, which has clearly reached its limits and which we are looking to phase out. Continuing to address specific circumstances as if we could actually manage them for a customer here is not our preferred way forward either.
- I know that we have some test coverage for bonds, but I am not sure how good it is or how we would go about verifying this type of change. That is another reason I don't feel inclined to change this just to address a specific manufacturer issue. But @rbbratta knows more about this so I will let him speak up.

So my recap is:
cc @cybertron
Hi, I'm not very familiar with this script; I am describing what I have found and discussed with Xin, and bringing the NM perspective. Sorry for the long comment, but I wanted to try to explain clearly how (I think) the patch works.

What happens without this patch is the following. Initially there is e.g. a bond with an ethernet port attached. Eventually, the script asks NM to reactivate the bond (for example when the OVS bridge is activated). When NM reactivates the bond, it detaches the ethernet port, and the kernel sends a LACP packet with agg=0 for it. As mentioned by Xin, the switch reacts to the LACP packet by completely disabling the port, which is a behavior we want to prevent.

So the proposed solution is this. Before (re-)activating the OVS bridge (and so, also the bond and its port), we manually set the port link down with iproute2. Note that this doesn't have any effect on NM. The old profile stays up according to NM; it's only a change that affects the kernel link. The only difference is that when NM reactivates the bond and detaches the ethernet port, the kernel doesn't send the LACP packet because the interface is down. The rest of the procedure is the same, i.e. NM brings up the port and reattaches it to the bond.
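To make the separation between the kernel link state and NM's profile state concrete, here is a small hypothetical check (the device name `eno1` is an example):

```bash
# Hypothetical illustration: bring the kernel link down directly.
ip link set dev eno1 down

# The kernel now reports the link as DOWN...
ip -br link show dev eno1

# ...but NM still shows the old profile as active on the device;
# only the kernel link changed, no profile was deactivated.
nmcli -f GENERAL.STATE,GENERAL.CONNECTION device show eno1
```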
NM is a high-level tool that allows users to configure interfaces in terms of connection profiles. When doing so, it performs the needed low-level operations (e.g. netlink, sysfs) in a predefined order. It doesn't allow users to control the order of such operations like iproute2 does.
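As an illustration of that difference (interface names hypothetical): with iproute2 the caller chooses the exact order of every step, whereas with nmcli the desired end state is declared and NM sequences the low-level operations itself.

```bash
# iproute2: the caller controls the order explicitly.
ip link set dev eno1 down           # bonding requires the slave down
ip link set dev eno1 master bond0   # then enslave it
ip link set dev eno1 up             # then bring the link up

# NM: declare the profile; NM decides the order of netlink calls.
nmcli connection add type ethernet ifname eno1 master bond0
```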
This is a good question. NM doesn't bring the link back automatically. The patch expects that some action later in the script brings the link back up.
However, given my limited understanding of the script, I am not sure the two scenarios above cover all the possibilities. @jcaamano, @cybertron, do you know? Are all slave connections supposed to be brought up again in the function? If not, the patch can possibly break things because the link will stay down forever.
As I said before, bringing down the interface with iproute2 doesn't have any effect on NM.
See the previous reply, there is no change of profile.
Thanks for chiming in @bengal.
We are not really sure of the state of a connection when the change in this PR brings the link down. Let's say it can be in any state (not active yet but to be activated, activating, active); you don't foresee any internal issues with NM, regardless of the state NM might be in with respect to this connection, when the link is set down?
Master would be inactive when this script runs on boot, but the script could be run at any other time when master is already active, and then this wouldn't happen. As previously mentioned, the general tone of this script is to be as minimally disruptive to NM as possible, because we have faced issues in the past that were difficult for us to understand and fix.
Are you referring to us re-activating the bond after the activation of the OVS bridge has already activated it? This shouldn't be happening. Or are you referring to the activation of a bond with a different profile that happens when the OVS bridge is activated?
If the master is going to be activated later and has "autoconnect-slaves=yes", I don't foresee any issues. The master will always bring up the slave, no matter what state it is in. However, from the following discussion, it seems that there is no guarantee that the master will go through another activation if it's already connected. I think this can make the patch unreliable, because the slave will stay with the link down.
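For reference, the property being discussed can be inspected and set with nmcli; the profile name `bond0` is hypothetical:

```bash
# Check whether activating the master also activates its slaves
# (-1 = use the global default, 0 = no, 1 = yes).
nmcli -g connection.autoconnect-slaves connection show bond0

# Make the master always bring its slave connections up when it
# is (re)activated.
nmcli connection modify bond0 connection.autoconnect-slaves yes
```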
Yes, I think that's a problem.
Aside from the problems here, this script might not be the only place where the state of the bond is changed. A user could use knmstate to reconfigure different aspects of the bridge interface (DNS, MTU, ...) that could potentially cause a similar issue. Another problem that I thought about: what if one of the slaves ends up not being active for whatever other reason? How would the external switch notice if that packet is not being sent?
Why does this not trigger the kernel to send the agg=0 packet?
I don't think we need to worry about that. We don't allow knmstate modification of anything associated with br-ex unless the NMState custom br-ex feature is used, in which case configure-ovs does not run. |
I think I can answer my own question. Based on this, I'm guessing that setting the link down with ip doesn't actually detach it from the bond. Which I suppose raises the question of whether the switch will correctly route traffic, but since the switch seems to be misbehaving when we do tell it the interface is no longer part of the bond, I guess it's a wash?
When the port is detached from the bond, the kernel sends a LACPDU frame to indicate that it is no longer part of the aggregation. If instead the port is brought down, it immediately stops transmitting frames, without sending any LACPDU.
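If one wanted to observe that difference on the wire: LACPDUs use the IEEE "slow protocols" ethertype 0x8809, so they can be captured while comparing the two operations. The interface name `eno1` is hypothetical:

```bash
# Watch for LACPDUs (IEEE "slow protocols", ethertype 0x8809).
tcpdump -e -i eno1 ether proto 0x8809

# In another terminal, compare the two operations:
ip link set dev eno1 nomaster   # detach: a final LACPDU goes out
ip link set dev eno1 down       # link down: transmission just stops
```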
Closing this, as a better fix has been created: #5274