🐛 Reconcile target groups and listeners as their own entities #5004
If a load balancer is created but the calls to create a target group or listener fail (for example, due to a rate limit), creating the group or listener is never retried. This results in a load balancer that exists but has nothing attached to it, making it useless. Signed-off-by: Nolan Brubaker <[email protected]>
|
Skipping CI for Draft Pull Request. |
|
/test ? |
|
/test pull-cluster-api-provider-aws-e2e |
|
/retitle 🐛 Reconcile target groups and listeners as their own entities |
|
/assign @damdo |
Signed-off-by: Nolan Brubaker <[email protected]>
|
/test pull-cluster-api-provider-aws-e2e |
* reconcileTargetGroupsAndListeners needs both the desired specification as well as the ARN assigned by the AWS API calls.
* Rename `spec` in reconcileV2LB to `desiredLB` to clarify what data is actually held in it.
Signed-off-by: Nolan Brubaker <[email protected]>
|
/test pull-cluster-api-provider-aws-e2e
@r4f4 I clarified some of the variables and updated the arguments for |
|
LGTM. I'll give it a try again. |
|
Got a successful install this time. |
|
@r4f4: changing LGTM is restricted to collaborators |
Mostly LGTM, just a question/comment.
    if len(existingTargetGroups.TargetGroups) == len(existingListeners.Listeners) && len(existingListeners.Listeners) == len(spec.ELBListeners) {
        return existingTargetGroups.TargetGroups, existingListeners.Listeners, nil
    }
Could we add a comment on why this is necessary? Is this an optimization?
Also, is this condition always a guarantee that there is nothing to do?
Especially looking at `len(existingTargetGroups.TargetGroups) == len(existingListeners.Listeners)`: could there be a case where the count is off and matching on names (`g.TargetGroupName == ln.TargetGroup.Name`) would be better?
Yeah, it was an attempt to optimize so we don't go through the full loop every time.
Matching on the name might be better. I suppose it's possible that a user could create more target groups in AWS than CAPA knows about, though I'm not sure why they'd do that.
In any case, if we check that the target group names line up between AWS and our spec, we're still repeating the loop every time. The main benefit there would be that we could potentially skip doing more API calls within a single iteration.
One reason we saw the LB creation fail was because of API call rate limiting in very busy envs, so avoiding extra calls when possible would be nice.
Thinking this through some more, I think the only numeric comparison we can do that's valid all the time is `if len(existingTargetGroups) < len(desiredTargetGroups) { reconcile }`. Even if the lists are equal in length, it doesn't guarantee that they're the exact same names.
I'll look through this and see if I can do any further optimizations, but I'll at least remove this length check.
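To illustrate the point about equal lengths, here is a minimal sketch of a name-based comparison. It is not the code from this PR: `missingNames` and the target group names are hypothetical, and plain strings stand in for the AWS SDK and `infrav1` types the controller actually uses.

```go
package main

import "fmt"

// missingNames returns the desired names that are absent from existing.
// Hypothetical helper: both slices hold plain target group names rather
// than the richer types used by the real controller.
func missingNames(desired, existing []string) []string {
	seen := make(map[string]struct{}, len(existing))
	for _, name := range existing {
		seen[name] = struct{}{}
	}
	var missing []string
	for _, name := range desired {
		if _, ok := seen[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Illustrative names only.
	desired := []string{"apiserver-6443", "ignition-22623"}
	existing := []string{"apiserver-6443", "user-created"}

	// The lengths match, yet one desired target group is still missing.
	fmt.Println(len(desired) == len(existing)) // true
	fmt.Println(missingNames(desired, existing)) // [ignition-22623]
}
```

A check like this still walks both lists on every reconcile, but it would let the loop skip create calls for anything that already exists, which matters when rate limiting is the failure mode.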
Thanks for the productive back-and-forth here.
Please consider a future PR for optimizations, because this PR is already complex (although the new tests are very helpful, they themselves are complex). 🙏
|
/milestone v2.5.1 |
    spec func(spec infrav1.LoadBalancer) infrav1.LoadBalancer
    }{
        {
            name: "main create flow",
Consider making this more descriptive
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dlipovetsky |
|
/milestone v2.6.0 |
When the optimization was enabled, we were missing a bunch of creation calls that need to get defined. Add some partial matcher support in order to allow for generated names. Signed-off-by: Nolan Brubaker <[email protected]>
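The commit message does not show what the partial matcher looks like, so the following is only a guess at the general idea: a matcher that accepts any name with a known prefix, so that a generated suffix does not break the expectation. The type `namePrefixMatcher` and the example names are hypothetical. The two methods follow the shape of gomock's `Matcher` interface (`Matches` and `String`), but the sketch does not import gomock.

```go
package main

import (
	"fmt"
	"strings"
)

// namePrefixMatcher is a hypothetical partial matcher: it accepts any string
// argument that starts with prefix and ignores whatever follows it.
type namePrefixMatcher struct {
	prefix string
}

// Matches reports whether x is a string beginning with the expected prefix.
func (m namePrefixMatcher) Matches(x any) bool {
	name, ok := x.(string)
	return ok && strings.HasPrefix(name, m.prefix)
}

// String describes the matcher for failure messages.
func (m namePrefixMatcher) String() string {
	return fmt.Sprintf("has name prefix %q", m.prefix)
}

func main() {
	m := namePrefixMatcher{prefix: "apiserver-target-"}
	fmt.Println(m.Matches("apiserver-target-1700000000")) // true
	fmt.Println(m.Matches("other-target-1700000000"))     // false
}
```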
/lgtm
/hold
@nrb feel free to unhold once happy with it merging
|
/test pull-cluster-api-provider-aws-e2e |
|
/unhold |
Signed-off-by: Nolan Brubaker [email protected]
What type of PR is this?
/kind bug
What this PR does / why we need it:
If a load balancer is created but the calls to create a target group or
listener fail (for example, due to a rate limit), creating the group or
listener is never retried. This results in a load balancer that exists
but has nothing attached to it, making it useless.
This change reconciles the target groups and listeners as their own entities, so that even if they fail to create initially, we'll keep retrying.
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #5002
Special notes for your reviewer:
Checklist:
Release note: