-
Notifications
You must be signed in to change notification settings - Fork 132
Better support for non-deterministic external-names by updating critical annotations #850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f4fb5c6 to
15df832
Compare
|
@twobiers, we are interested in progressing the solution you propose in this PR. Are you currently able to give this attention? If so, would you mind resolving the conflicts and pushing up an updated version? If you aren't currently able to give this attention would you be happy for us to take it over and get it over the line (you'll be credited with the contribution)? |
Signed-off-by: twobiers <[email protected]>
As the UpdateCriticialAnnotations function is now not exclusively called in the creation process, we have to ensure no other fields like the spec are updated, so we don't interfer with the normal LateInitialize logic Signed-off-by: twobiers <[email protected]>
Signed-off-by: twobiers <[email protected]>
Signed-off-by: twobiers <[email protected]>
15df832 to
c14e6da
Compare
|
@jeanduplessis Thanks for your kind comment. I'm able to work on this and have rebased the branch onto master. |
|
Thanks, @twobiers. @erhancagirici will be doing a thorough review in the next few days, with a specific focus on making sure it's compatible with Upjet's async approach. |
|
Closing and reopening to kick the CI |
Signed-off-by: twobiers <[email protected]>
Signed-off-by: twobiers <[email protected]>
erhancagirici
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I've left some comments regarding the mechanics, otherwise LGTM conceptually. As you mentioned in the doc comments, this will be sort of a band-aid until a more async-aware implementation.
| // after the creation, but later as part of an asynchronous process. | ||
| // When Crossplane supports asynchronous creation of resources natively, this logic | ||
| // might not be needed anymore and can be revisited. | ||
| if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a thing to consider (possible nit):
most reconciliation loops will enter this codepath regardless of a need to update critical annotations. I wonder if this one is no-op in terms of k8s API access or brings some extra load on the apiserver.
Especially in async creations with long-running creation times, currently (with no native async support) the external clients return observation.ResourceExists as true to avoid further actions for the observations during creation of the resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should make writing these annotations optional. I'd suggest making them opt-out since that's the safest path. The option would be specified by the provider author - so they can disable these annotations if they know for sure they're not needed (i.e. naming is deterministic, API is strongly consistent). I remember discussing this with someone recently, but can't find a tracking issue.
(If we do make them optional, I think we could do it pretty easily by injecting a no-op implementation of the annotation updater.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should also be straightforward to keep a state in the reconciler whether a critical annotation was added/changed and depending on that update the annotations or perform a no-op.
On the other hand I'm wondering whether this kind of optimization is needed for "Critical" annotations. Shouldn't the priority be here to add those annotations? Otherwise it would leave room for errors.
| return err | ||
| } | ||
|
|
||
| err = u.client.Patch(ctx, o, client.RawPatch(types.MergePatchType, patchData), client.FieldOwner(fieldOwnerAPISimpleRefResolver), client.ForceOwnership) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we patch all annotations here.
Just wondering how does this interact with the field ownerships of annotations that XP does not manage (like a custom annotation set by a user or some other controllers, tooling etc), and whether it is a thing to worry about regarding server-side apply etc. Would we cause a race regarding field ownerships?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ulucinar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @twobiers for working on this issue. Left some comments for us to discuss.
| err := retry.OnError(retry.DefaultRetry, func(err error) bool { | ||
| return !errors.Is(err, context.Canceled) | ||
| }, func() error { | ||
| err := u.client.Update(ctx, o) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this update operation is being replaced by SSA.
| return err | ||
| } | ||
|
|
||
| err = u.client.Patch(ctx, o, client.RawPatch(types.MergePatchType, patchData), client.FieldOwner(fieldOwnerAPISimpleRefResolver), client.ForceOwnership) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had better use a different manager name than managed.crossplane.io/api-simple-reference-resolver as the annotations have nothing to do with the API resolver. Maybe something like: managed.crossplane.io/critical-annotation-updater?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense for the MR reconciler to just use one manager name, regardless of operation?
Possibly too late if we're already using api-simple-reference-resolver for some.
| err := u.client.Update(ctx, o) | ||
| patchMap := map[string]interface{}{ | ||
| "metadata": map[string]any{ | ||
| "annotations": a, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be more selective in what's being patched here, i.e., not all the annotations on the MR are the critical ones and we would only like to manage the critical annotations by this manager? This will probably not be an issue as this manager will not have an opinion on "non-critical" annotations and their respective managers will dictate their values. But one potential issue is when the other manager (who should really be owning the annotation) actually wants to delete the annotation. The managed.crossplane.io/critical-annotation-updater will still be owning the annotation. So the critical-annotation-updater had better not own non-critical ones...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be a reasonable addition for this function to have knowledge about what is a "critical annotation". Currently that comes from the execution order of the reconciler, this way we would make it explicit.
From what I can see, the ciritcal annotations should be:
crossplane.io/external-namecrossplane.io/external-create-pendingcrossplane.io/external-create-succeededcrossplane.io/external-create-failed
Am I missing any?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a very naive approach in 11e2136
| if observation.ResourceExists { | ||
| // When a resource exists or is just created, it might have received | ||
| // a non-deterministic external name after its creation, which we need to persist. | ||
| // We do this by updating the critical annotations. | ||
| // This is needed because some resources might not receive an external-name directly | ||
| // after the creation, but later as part of an asynchronous process. | ||
| // When Crossplane supports asynchronous creation of resources natively, this logic | ||
| // might not be needed anymore and can be revisited. | ||
| if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil { | ||
| log.Debug(errUpdateManagedAnnotations, "error", err) | ||
| record.Event(managed, event.Warning(reasonCannotUpdateManaged, errors.Wrap(err, errUpdateManagedAnnotations))) | ||
| return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedAnnotations) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @lsviben explains here, upjet should not be relying on setting ResourceLateInitialized to get the critical annotations updated in the first place. The async mode implemented by upjet breaks the assumptions of the managed reconciler. I believe we need to first address this discrepancy between upjet and the managed reconciler...
Signed-off-by: twobiers <[email protected]>
WalkthroughThe changes refactor the critical annotation update mechanism to use JSON merge patch operations for annotation-only updates rather than modifying entire objects. A new reconciliation block calls this mechanism after resource creation or existence detection. Test scaffolding is extended to mock patch operations across multiple test suites. Changes
Sequence DiagramsequenceDiagram
participant Reconciler
participant UpdateCritical as UpdateCriticalAnnotations
participant Client
participant APIServer
Reconciler->>UpdateCritical: UpdateCriticalAnnotations(obj)
UpdateCritical->>UpdateCritical: Build map of critical<br/>annotations from object
alt No annotations present
UpdateCritical-->>Reconciler: Early return
else Annotations found
UpdateCritical->>Client: Patch(MergePatchType,<br/>metadata.annotations)
alt Patch succeeds
Client->>APIServer: Apply JSON merge patch
APIServer-->>Client: Updated object
Client-->>UpdateCritical: Success
UpdateCritical-->>Reconciler: Return
else Conflict error
Client-->>UpdateCritical: Conflict
UpdateCritical->>Client: Get(latest object)
Client->>APIServer: Fetch latest
APIServer-->>Client: Latest object
Client-->>UpdateCritical: Object data
UpdateCritical->>UpdateCritical: Reapply critical<br/>annotations
UpdateCritical->>Client: Patch(MergePatchType)
Client->>APIServer: Retry merge patch
APIServer-->>Client: Updated
Client-->>UpdateCritical: Success
UpdateCritical-->>Reconciler: Return
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~28 minutes Areas requiring extra attention:
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/reconciler/managed/api_test.go (1)
610-623: The "Success" test case appears to be misconfigured - it expects an error.The test case named "Success" at lines 610-623 is configured with
MockPatch: test.NewMockPatchFn(errBoom)which returns an error, and the expected result iserr: errors.Wrap(errBoom, errUpdateCriticalAnnotations). This contradicts the test name and reason which states "We should return without error if we successfully update our annotations."This appears to be a copy-paste issue from the "UpdateError" test case above it.
Apply this diff to fix the test:
"Success": { reason: "We should return without error if we successfully update our annotations", c: &test.MockClient{ MockGet: test.NewMockGetFn(nil, setAnnotations), - MockPatch: test.NewMockPatchFn(errBoom), + MockPatch: test.NewMockPatchFn(nil), }, args: args{ o: &fake.LegacyManaged{}, }, want: want{ - err: errors.Wrap(errBoom, errUpdateCriticalAnnotations), + err: nil, o: &fake.LegacyManaged{}, }, },
🧹 Nitpick comments (2)
pkg/reconciler/managed/api.go (1)
58-65: Global variable is intentional for configuration, but consider using constants or a function.The
gochecknoglobalslint warning is expected here. Since this list is read-only and defines critical configuration, it's acceptable as a package-level variable. The list correctly matches the annotations defined inpkg/meta/meta.go.The
gofumptformatting issue on line 58 should be addressed - likely just needs a blank line adjustment or formatting fix.Consider running
gofumpt -won this file to fix the formatting, or alternatively convert this to a function to satisfy the linter:-var ( - criticalAnnotations = []string{ - meta.AnnotationKeyExternalCreateFailed, - meta.AnnotationKeyExternalCreatePending, - meta.AnnotationKeyExternalCreateSucceeded, - meta.AnnotationKeyExternalName, - } -) +func criticalAnnotationKeys() []string { + return []string{ + meta.AnnotationKeyExternalCreateFailed, + meta.AnnotationKeyExternalCreatePending, + meta.AnnotationKeyExternalCreateSucceeded, + meta.AnnotationKeyExternalName, + } +}pkg/reconciler/managed/reconciler.go (1)
1411-1424: Approve the critical annotation persistence logic, with a minor error-handling concern.The new block correctly ensures critical annotations are persisted for resources that may receive non-deterministic external names asynchronously. The comment is helpful in explaining the rationale and future considerations.
However, on line 1422, the error wrapping logic has an issue: if the status update fails, the return statement wraps the status update error with
errUpdateManagedAnnotations, but the annotation update error (err) is lost. Consider either:
- Returning the original annotation error and letting the status update be best-effort, or
- Logging both errors
Additionally, the
gciformatter indicates line 1415 has formatting issues.Consider adjusting the error handling to preserve the original error:
if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil { log.Debug(errUpdateManagedAnnotations, "error", err) record.Event(managed, event.Warning(reasonCannotUpdateManaged, errors.Wrap(err, errUpdateManagedAnnotations))) - return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedAnnotations) + // Best-effort status update; return the original annotation update error + if statusErr := r.client.Status().Update(ctx, managed); statusErr != nil { + log.Debug(errUpdateManagedStatus, "error", statusErr) + } + return reconcile.Result{Requeue: true}, errors.Wrap(err, errUpdateManagedAnnotations) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
pkg/reconciler/managed/api.go(2 hunks)pkg/reconciler/managed/api_test.go(5 hunks)pkg/reconciler/managed/reconciler.go(1 hunks)pkg/reconciler/managed/reconciler_legacy_test.go(40 hunks)pkg/reconciler/managed/reconciler_modern_test.go(49 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not change function/method signatures of exported Go APIs without the 'breaking-change' label
Applied to files:
pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not rename exported Go APIs without the 'breaking-change' label
Applied to files:
pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Avoid behavior changes to exported Go APIs that could break existing users unless labeled 'breaking-change'
Applied to files:
pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not remove exported Go APIs (functions, types, methods, or fields) without the 'breaking-change' label
Applied to files:
pkg/reconciler/managed/api_test.go
🧬 Code graph analysis (5)
pkg/reconciler/managed/api_test.go (3)
pkg/resource/interfaces.go (2)
Object(193-196)LegacyManaged(219-224)pkg/resource/fake/mocks.go (2)
Object(319-322)LegacyManaged(398-405)pkg/test/fake.go (2)
NewMockGetFn(85-95)NewMockPatchFn(163-173)
pkg/reconciler/managed/api.go (2)
pkg/meta/meta.go (4)
AnnotationKeyExternalCreateFailed(56-56)AnnotationKeyExternalCreatePending(42-42)AnnotationKeyExternalCreateSucceeded(51-51)AnnotationKeyExternalName(36-36)pkg/resource/interfaces.go (1)
Object(193-196)
pkg/reconciler/managed/reconciler_legacy_test.go (1)
pkg/test/fake.go (1)
NewMockPatchFn(163-173)
pkg/reconciler/managed/reconciler.go (2)
pkg/event/event.go (2)
Event(39-44)Warning(60-70)pkg/errors/errors.go (1)
Wrap(103-105)
pkg/reconciler/managed/reconciler_modern_test.go (1)
pkg/test/fake.go (1)
NewMockPatchFn(163-173)
🪛 golangci-lint (2.5.0)
pkg/reconciler/managed/api.go
[error] 59-59: criticalAnnotations is a global variable
(gochecknoglobals)
[error] 58-58: File is not properly formatted
(gofumpt)
pkg/reconciler/managed/reconciler.go
[error] 1415-1415: File is not properly formatted
(gci)
🔇 Additional comments (5)
pkg/reconciler/managed/api.go (1)
291-330: LGTM! Patch-based approach is more targeted and addresses the core issue.The refactored implementation correctly:
- Filters to only critical annotations (lines 292-297)
- Short-circuits when no critical annotations exist (lines 299-302)
- Uses JSON merge patch for targeted updates (lines 307-318)
- Handles conflicts by re-fetching and re-applying (lines 319-325)
The field owner name
fieldOwnerAPISimpleRefResolveris reused per the discussion in past review comments where negz suggested using one manager name regardless of operation. This is acceptable.One minor observation: the conflict handling fetches the latest object and calls
meta.AddAnnotations, but then returns the original conflict error. This allows the retry logic to work correctly since the error is propagated up.pkg/reconciler/managed/api_test.go (2)
546-551: LGTM!The helper function correctly renamed from
setLabelstosetAnnotationsto match the new implementation that operates on annotations rather than labels.
562-580: LGTM!Test cases correctly updated to use
MockPatchinstead ofMockUpdate, aligning with the implementation change to use patch-based updates for critical annotations.pkg/reconciler/managed/reconciler_legacy_test.go (1)
333-334: LGTM!The
MockPatchfield additions across all test cases correctly provide the necessary mock infrastructure to support the patch-based critical annotation updates introduced in the implementation. The pattern is consistently applied throughout the test file.pkg/reconciler/managed/reconciler_modern_test.go (1)
100-2697: LGTM! Test scaffolding correctly extended to support patch operations.The mechanical addition of
MockPatch: test.NewMockPatchFn(nil)across the test suite properly supports the new critical annotation update mechanism that uses JSON merge patch operations. The changes are consistent and align with the PR's objective of persisting critical annotations independently of spec late-initialization.Note: The tests "ObserveAndLateInitializePolicy" (lines 2017-2037) and "ObserveUpdateAndLateInitializePolicy" (lines 2042-2066) don't include MockPatch. This may be intentional if these code paths don't exercise patch operations, but you may want to verify this is correct.
|
I just had the idea that a function in the pipeline could also ensure those annotations are persisted. It would not be an ideal solution, but at least a possible hotfix while async is not fully supported. |
Description of your changes
This PR ensures that critical annotations are stored when a resource is created. Some resources have non-deterministic external-names, which are never updated when the
mangementPoliciesdon't containLateInitialize.So far, there is an implicit contract that an Observation updating the external-name will be eventually stored as part of the LateInitialize process. However, that should only affect updates to the
specand not the annotations like external-name.More Context can be found here: crossplane/crossplane#5918
I'm not sure how a unit test can be expressed, as I'm not really familiar with the setup. If the change is approved, I think it would make sense to backport it aswell.
Fixes:
Maybe fixes aswell:
I have:
earthly +reviewableto ensure this PR is ready for review.Added or updated unit tests.Linked a PR or a docs tracking issue to document this change.backport release-x.ylabels to auto-backport this PR.Need help with this checklist? See the cheat sheet.
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.