Skip to content

Conversation

@twobiers
Copy link

@twobiers twobiers commented Jun 25, 2025

Description of your changes

This PR ensures that critical annotations are stored when a resource is created. Some resources have non-deterministic external-names, which are never updated when the mangementPolicies don't contain LateInitialize.
So far, there is an implicit contract that an Observation updating the external-name will be eventually stored as part of the LateInitialize process. However, that should only affect updates to the spec and not the annotations like external-name.
More Context can be found here: crossplane/crossplane#5918

I'm not sure how a unit test can be expressed, as I'm not really familiar with the setup. If the change is approved, I think it would make sense to backport it aswell.

Fixes:

Maybe fixes aswell:

I have:

Need help with this checklist? See the cheat sheet.

Summary by CodeRabbit

  • Bug Fixes

    • Improved persistence of critical annotations during reconciliation and API retries by using targeted patch operations instead of full object updates, reducing unintended side effects.
    • Enhanced conflict handling to ensure critical annotations remain intact when API server retries occur.
  • Tests

    • Extended test infrastructure to support patch operation mocking across reconciliation test scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@twobiers twobiers requested a review from a team as a code owner June 25, 2025 17:44
@twobiers twobiers requested a review from phisco June 25, 2025 17:44
@jeanduplessis
Copy link
Contributor

@twobiers, we are interested in progressing the solution you propose in this PR. Are you currently able to give this attention? If so, would you mind resolving the conflicts and pushing up an updated version? If you aren't currently able to give this attention would you be happy for us to take it over and get it over the line (you'll be credited with the contribution)?

As the UpdateCriticialAnnotations function is now not exclusively called in the creation process, we have to ensure no other fields like the spec are updated, so we don't interfer with the normal LateInitialize logic

Signed-off-by: twobiers <[email protected]>
@twobiers
Copy link
Author

twobiers commented Sep 4, 2025

@jeanduplessis Thanks for your kind comment. I'm able to work on this and have rebased the branch onto master.

@jeanduplessis
Copy link
Contributor

Thanks, @twobiers. @erhancagirici will be doing a thorough review in the next few days, with a specific focus on making sure it's compatible with Upjet's async approach.

@bobh66
Copy link
Contributor

bobh66 commented Oct 8, 2025

Closing and reopening to kick the CI

@bobh66 bobh66 closed this Oct 8, 2025
@bobh66 bobh66 reopened this Oct 8, 2025
@bobh66 bobh66 moved this to In Review in Crossplane Roadmap Oct 24, 2025
@bobh66 bobh66 added this to the v2.1 milestone Oct 24, 2025
Copy link
Contributor

@erhancagirici erhancagirici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I've left some comments regarding the mechanics, otherwise LGTM conceptually. As you mentioned in the doc comments, this will be sort of a band-aid until a more async-aware implementation.

// after the creation, but later as part of an asynchronous process.
// When Crossplane supports asynchronous creation of resources natively, this logic
// might not be needed anymore and can be revisited.
if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a thing to consider (possible nit):
most reconciliation loops will enter this codepath regardless of a need to update critical annotations. I wonder if this one is no-op in terms of k8s API access or brings some extra load on the apiserver.

Especially in async creations with long-running creation times, currently (with no native async support) the external clients return observation.ResourceExists as true to avoid further actions for the observations during creation of the resource.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should make writing these annotations optional. I'd suggest making them opt-out since that's the safest path. The option would be specified by the provider author - so they can disable these annotations if they know for sure they're not needed (i.e. naming is deterministic, API is strongly consistent). I remember discussing this with someone recently, but can't find a tracking issue.

(If we do make them optional, I think we could do it pretty easily by injecting a no-op implementation of the annotation updater.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should also be straightforward to keep a state in the reconciler whether a critical annotation was added/changed and depending on that update the annotations or perform a no-op.

On the other hand I'm wondering whether this kind of optimization is needed for "Critical" annotations. Shouldn't the priority be here to add those annotations? Otherwise it would leave room for errors.

return err
}

err = u.client.Patch(ctx, o, client.RawPatch(types.MergePatchType, patchData), client.FieldOwner(fieldOwnerAPISimpleRefResolver), client.ForceOwnership)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we patch all annotations here.
Just wondering how does this interact with the field ownerships of annotations that XP does not manage (like a custom annotation set by a user or some other controllers, tooling etc), and whether it is a thing to worry about regarding server-side apply etc. Would we cause a race regarding field ownerships?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@ulucinar ulucinar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @twobiers for working on this issue. Left some comments for us to discuss.

err := retry.OnError(retry.DefaultRetry, func(err error) bool {
return !errors.Is(err, context.Canceled)
}, func() error {
err := u.client.Update(ctx, o)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this update operation is being replaced by SSA.

return err
}

err = u.client.Patch(ctx, o, client.RawPatch(types.MergePatchType, patchData), client.FieldOwner(fieldOwnerAPISimpleRefResolver), client.ForceOwnership)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had better use a different manager name than managed.crossplane.io/api-simple-reference-resolver as the annotations have nothing to do with the API resolver. Maybe something like: managed.crossplane.io/critical-annotation-updater?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for the MR reconciler to just use one manager name, regardless of operation?

Possibly too late if we're already using api-simple-reference-resolver for some.

err := u.client.Update(ctx, o)
patchMap := map[string]interface{}{
"metadata": map[string]any{
"annotations": a,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be more selective in what's being patched here, i.e., not all the annotations on the MR are the critical ones and we would only like to manage the critical annotations by this manager? This will probably not be an issue as this manager will not have an opinion on "non-critical" annotations and their respective managers will dictate their values. But one potential issue is when the other manager (who should really be owning the annotation) actually wants to delete the annotation. The managed.crossplane.io/critical-annotation-updater will still be owning the annotation. So the critical-annotation-updater had better not own non-critical ones...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a reasonable addition for this function to have knowledge about what is a "critical annotation". Currently that comes from the execution order of the reconciler, this way we would make it explicit.
From what I can see, the ciritcal annotations should be:

  • crossplane.io/external-name
  • crossplane.io/external-create-pending
  • crossplane.io/external-create-succeeded
  • crossplane.io/external-create-failed

Am I missing any?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a very naive approach in 11e2136

Comment on lines +1411 to +1424
if observation.ResourceExists {
// When a resource exists or is just created, it might have received
// a non-deterministic external name after its creation, which we need to persist.
// We do this by updating the critical annotations.
// This is needed because some resources might not receive an external-name directly
// after the creation, but later as part of an asynchronous process.
// When Crossplane supports asynchronous creation of resources natively, this logic
// might not be needed anymore and can be revisited.
if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil {
log.Debug(errUpdateManagedAnnotations, "error", err)
record.Event(managed, event.Warning(reasonCannotUpdateManaged, errors.Wrap(err, errUpdateManagedAnnotations)))
return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedAnnotations)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @lsviben explains here, upjet should not be relying on setting ResourceLateInitialized to get the critical annotations updated in the first place. The async mode implemented by upjet breaks the assumptions of the managed reconciler. I believe we need to first address this discrepancy between upjet and the managed reconciler...

@jbw976 jbw976 removed this from the v2.1 milestone Oct 31, 2025
@coderabbitai
Copy link

coderabbitai bot commented Dec 3, 2025

Walkthrough

The changes refactor the critical annotation update mechanism to use JSON merge patch operations for annotation-only updates rather than modifying entire objects. A new reconciliation block calls this mechanism after resource creation or existence detection. Test scaffolding is extended to mock patch operations across multiple test suites.

Changes

Cohort / File(s) Summary
Core annotation update logic
pkg/reconciler/managed/api.go
Introduces package-level variable criticalAnnotations and updates RetryingCriticalAnnotationUpdater.UpdateCriticalAnnotations to use JSON merge patch strategy. Builds a map of critical annotations present on the object, short-circuits if none exist, and applies patch via client.Patch with MergePatchType. On conflict, fetches latest object and reapplies annotations while preserving existing changes.
Core annotation tests
pkg/reconciler/managed/api_test.go
Updates test helper from setLabels to setAnnotations and replaces MockUpdate with MockPatch mocks to align with annotation-based patch operations.
Reconciler integration
pkg/reconciler/managed/reconciler.go
Adds new blocks to call UpdateCriticalAnnotations after resource creation or existence detection. Logs debug messages and emits warning events on update failures, with status requeue.
Test scaffolding for legacy reconciler
pkg/reconciler/managed/reconciler_legacy_test.go
Extends mock client setup across numerous test cases by adding MockPatch: test.NewMockPatchFn(nil) to enable patch operation handling in test scenarios.
Test scaffolding for modern reconciler
pkg/reconciler/managed/reconciler_modern_test.go
Adds MockPatch field with test.NewMockPatchFn(nil) across multiple test case configurations and applies minor cosmetic formatting adjustments to align MockGet entries.

Sequence Diagram

sequenceDiagram
    participant Reconciler
    participant UpdateCritical as UpdateCriticalAnnotations
    participant Client
    participant APIServer

    Reconciler->>UpdateCritical: UpdateCriticalAnnotations(obj)
    UpdateCritical->>UpdateCritical: Build map of critical<br/>annotations from object
    alt No annotations present
        UpdateCritical-->>Reconciler: Early return
    else Annotations found
        UpdateCritical->>Client: Patch(MergePatchType,<br/>metadata.annotations)
        alt Patch succeeds
            Client->>APIServer: Apply JSON merge patch
            APIServer-->>Client: Updated object
            Client-->>UpdateCritical: Success
            UpdateCritical-->>Reconciler: Return
        else Conflict error
            Client-->>UpdateCritical: Conflict
            UpdateCritical->>Client: Get(latest object)
            Client->>APIServer: Fetch latest
            APIServer-->>Client: Latest object
            Client-->>UpdateCritical: Object data
            UpdateCritical->>UpdateCritical: Reapply critical<br/>annotations
            UpdateCritical->>Client: Patch(MergePatchType)
            Client->>APIServer: Retry merge patch
            APIServer-->>Client: Updated
            Client-->>UpdateCritical: Success
            UpdateCritical-->>Reconciler: Return
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Areas requiring extra attention:

  • pkg/reconciler/managed/api.go: Review merge patch JSON construction logic, conflict retry mechanism, and field ownership handling in the Patch call—verify that only metadata.annotations are updated and that conflict recovery preserves concurrent changes.
  • pkg/reconciler/managed/reconciler.go: Verify placement of the new UpdateCriticalAnnotations calls ensures critical annotations are persisted after both creation and existence detection without introducing unintended requeue loops.
  • pkg/reconciler/managed/reconciler_legacy_test.go and reconciler_modern_test.go: Ensure MockPatch additions align with the new patch-based code paths and that test expectations accurately reflect the modified reconciliation flow.

Poem

🐰 Patches now whisper soft and true,
Annotations dance in merge-patch brew,
No conflicts here—we fetch, we mend,
Critical notes on which we depend! 🪵

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Better support for non-deterministic external-names by updating critical annotations' accurately summarizes the main change: implementing critical annotation updates to support non-deterministic external names.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/reconciler/managed/api_test.go (1)

610-623: The "Success" test case appears to be misconfigured - it expects an error.

The test case named "Success" at lines 610-623 is configured with MockPatch: test.NewMockPatchFn(errBoom) which returns an error, and the expected result is err: errors.Wrap(errBoom, errUpdateCriticalAnnotations). This contradicts the test name and reason which states "We should return without error if we successfully update our annotations."

This appears to be a copy-paste issue from the "UpdateError" test case above it.

Apply this diff to fix the test:

 		"Success": {
 			reason: "We should return without error if we successfully update our annotations",
 			c: &test.MockClient{
 				MockGet:   test.NewMockGetFn(nil, setAnnotations),
-				MockPatch: test.NewMockPatchFn(errBoom),
+				MockPatch: test.NewMockPatchFn(nil),
 			},
 			args: args{
 				o: &fake.LegacyManaged{},
 			},
 			want: want{
-				err: errors.Wrap(errBoom, errUpdateCriticalAnnotations),
+				err: nil,
 				o:   &fake.LegacyManaged{},
 			},
 		},
🧹 Nitpick comments (2)
pkg/reconciler/managed/api.go (1)

58-65: Global variable is intentional for configuration, but consider using constants or a function.

The gochecknoglobals lint warning is expected here. Since this list is read-only and defines critical configuration, it's acceptable as a package-level variable. The list correctly matches the annotations defined in pkg/meta/meta.go.

The gofumpt formatting issue on line 58 should be addressed - likely just needs a blank line adjustment or formatting fix.

Consider running gofumpt -w on this file to fix the formatting, or alternatively convert this to a function to satisfy the linter:

-var (
-	criticalAnnotations = []string{
-		meta.AnnotationKeyExternalCreateFailed,
-		meta.AnnotationKeyExternalCreatePending,
-		meta.AnnotationKeyExternalCreateSucceeded,
-		meta.AnnotationKeyExternalName,
-	}
-)
+func criticalAnnotationKeys() []string {
+	return []string{
+		meta.AnnotationKeyExternalCreateFailed,
+		meta.AnnotationKeyExternalCreatePending,
+		meta.AnnotationKeyExternalCreateSucceeded,
+		meta.AnnotationKeyExternalName,
+	}
+}
pkg/reconciler/managed/reconciler.go (1)

1411-1424: Approve the critical annotation persistence logic, with a minor error-handling concern.

The new block correctly ensures critical annotations are persisted for resources that may receive non-deterministic external names asynchronously. The comment is helpful in explaining the rationale and future considerations.

However, on line 1422, the error wrapping logic has an issue: if the status update fails, the return statement wraps the status update error with errUpdateManagedAnnotations, but the annotation update error (err) is lost. Consider either:

  1. Returning the original annotation error and letting the status update be best-effort, or
  2. Logging both errors

Additionally, the gci formatter indicates line 1415 has formatting issues.

Consider adjusting the error handling to preserve the original error:

 		if err := r.managed.UpdateCriticalAnnotations(ctx, managed); err != nil {
 			log.Debug(errUpdateManagedAnnotations, "error", err)
 			record.Event(managed, event.Warning(reasonCannotUpdateManaged, errors.Wrap(err, errUpdateManagedAnnotations)))
-			return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedAnnotations)
+			// Best-effort status update; return the original annotation update error
+			if statusErr := r.client.Status().Update(ctx, managed); statusErr != nil {
+				log.Debug(errUpdateManagedStatus, "error", statusErr)
+			}
+			return reconcile.Result{Requeue: true}, errors.Wrap(err, errUpdateManagedAnnotations)
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 756e2d0 and 11e2136.

📒 Files selected for processing (5)
  • pkg/reconciler/managed/api.go (2 hunks)
  • pkg/reconciler/managed/api_test.go (5 hunks)
  • pkg/reconciler/managed/reconciler.go (1 hunks)
  • pkg/reconciler/managed/reconciler_legacy_test.go (40 hunks)
  • pkg/reconciler/managed/reconciler_modern_test.go (49 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not change function/method signatures of exported Go APIs without the 'breaking-change' label

Applied to files:

  • pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not rename exported Go APIs without the 'breaking-change' label

Applied to files:

  • pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Avoid behavior changes to exported Go APIs that could break existing users unless labeled 'breaking-change'

Applied to files:

  • pkg/reconciler/managed/api_test.go
📚 Learning: 2025-10-16T17:11:48.360Z
Learnt from: CR
Repo: crossplane/crossplane-runtime PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2025-10-16T17:11:48.360Z
Learning: Applies to **/!(*_test).go : Do not remove exported Go APIs (functions, types, methods, or fields) without the 'breaking-change' label

Applied to files:

  • pkg/reconciler/managed/api_test.go
🧬 Code graph analysis (5)
pkg/reconciler/managed/api_test.go (3)
pkg/resource/interfaces.go (2)
  • Object (193-196)
  • LegacyManaged (219-224)
pkg/resource/fake/mocks.go (2)
  • Object (319-322)
  • LegacyManaged (398-405)
pkg/test/fake.go (2)
  • NewMockGetFn (85-95)
  • NewMockPatchFn (163-173)
pkg/reconciler/managed/api.go (2)
pkg/meta/meta.go (4)
  • AnnotationKeyExternalCreateFailed (56-56)
  • AnnotationKeyExternalCreatePending (42-42)
  • AnnotationKeyExternalCreateSucceeded (51-51)
  • AnnotationKeyExternalName (36-36)
pkg/resource/interfaces.go (1)
  • Object (193-196)
pkg/reconciler/managed/reconciler_legacy_test.go (1)
pkg/test/fake.go (1)
  • NewMockPatchFn (163-173)
pkg/reconciler/managed/reconciler.go (2)
pkg/event/event.go (2)
  • Event (39-44)
  • Warning (60-70)
pkg/errors/errors.go (1)
  • Wrap (103-105)
pkg/reconciler/managed/reconciler_modern_test.go (1)
pkg/test/fake.go (1)
  • NewMockPatchFn (163-173)
🪛 golangci-lint (2.5.0)
pkg/reconciler/managed/api.go

[error] 59-59: criticalAnnotations is a global variable

(gochecknoglobals)


[error] 58-58: File is not properly formatted

(gofumpt)

pkg/reconciler/managed/reconciler.go

[error] 1415-1415: File is not properly formatted

(gci)

🔇 Additional comments (5)
pkg/reconciler/managed/api.go (1)

291-330: LGTM! Patch-based approach is more targeted and addresses the core issue.

The refactored implementation correctly:

  1. Filters to only critical annotations (lines 292-297)
  2. Short-circuits when no critical annotations exist (lines 299-302)
  3. Uses JSON merge patch for targeted updates (lines 307-318)
  4. Handles conflicts by re-fetching and re-applying (lines 319-325)

The field owner name fieldOwnerAPISimpleRefResolver is reused per the discussion in past review comments where negz suggested using one manager name regardless of operation. This is acceptable.

One minor observation: the conflict handling fetches the latest object and calls meta.AddAnnotations, but then returns the original conflict error. This allows the retry logic to work correctly since the error is propagated up.

pkg/reconciler/managed/api_test.go (2)

546-551: LGTM!

The helper function correctly renamed from setLabels to setAnnotations to match the new implementation that operates on annotations rather than labels.


562-580: LGTM!

Test cases correctly updated to use MockPatch instead of MockUpdate, aligning with the implementation change to use patch-based updates for critical annotations.

pkg/reconciler/managed/reconciler_legacy_test.go (1)

333-334: LGTM!

The MockPatch field additions across all test cases correctly provide the necessary mock infrastructure to support the patch-based critical annotation updates introduced in the implementation. The pattern is consistently applied throughout the test file.

pkg/reconciler/managed/reconciler_modern_test.go (1)

100-2697: LGTM! Test scaffolding correctly extended to support patch operations.

The mechanical addition of MockPatch: test.NewMockPatchFn(nil) across the test suite properly supports the new critical annotation update mechanism that uses JSON merge patch operations. The changes are consistent and align with the PR's objective of persisting critical annotations independently of spec late-initialization.

Note: The tests "ObserveAndLateInitializePolicy" (lines 2017-2037) and "ObserveUpdateAndLateInitializePolicy" (lines 2042-2066) don't include MockPatch. This may be intentional if these code paths don't exercise patch operations, but you may want to verify this is correct.

@twobiers
Copy link
Author

twobiers commented Dec 4, 2025

I just had the idea that a function in the pipeline could also ensure those annotations are persisted. It would not be an ideal solution, but at least a possible hotfix while async is not fully supported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

7 participants