Skip to content

Conversation

@akafazov
Copy link
Contributor

@akafazov akafazov commented Nov 4, 2025

Add gates in the featuretoggles subroutine thus preventing exponential backoff when the kcp artefacts are not ready yet. Also checks for secret before trying to apply KCP manifests.

refers to #92

…tProxy in Process method

On-behalf-of: @SAP [email protected]
Signed-off-by: Angel Kafazov <[email protected]>
@akafazov akafazov force-pushed the fix/featuretoggle-kcp-gates branch from cf4549c to f410bc2 Compare November 4, 2025 11:26
@akafazov akafazov marked this pull request as draft November 4, 2025 13:10
@github-actions github-actions bot added the fix label Nov 4, 2025
@akafazov akafazov changed the title feat(featuretoggles): add readiness checks for KCP RootShard and FrontProxy in Process method fix(featuretoggles): add readiness checks for KCP RootShard and FrontProxy in Process method Nov 4, 2025
…ets from Kind test suite

On-behalf-of: @SAP [email protected]
Signed-off-by: Angel Kafazov <[email protected]>
@akafazov akafazov marked this pull request as ready for review November 4, 2025 15:03
@akafazov
Copy link
Contributor Author

akafazov commented Nov 6, 2025

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 6, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ntnn
Copy link
Member

ntnn commented Nov 6, 2025

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 6, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Nov 6, 2025

Walkthrough

Loads operator configuration and gates the GettingStarted flow on KCP readiness: verifies KCP RootShard and FrontProxy unstructured resources have an Available condition and verifies the KCP admin secret exists; missing or not-ready resources cause a requeue. Updates FeatureGettingStarted signature to accept operatorCfg. Tests mock unstructured Get calls and status conditions and adjust secret lookup. E2E kind setup no longer creates GitHub/registry Docker config secrets. CI Taskfile removes GH_TOKEN printing and adds an explicit exit on failure.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding readiness checks for KCP RootShard and FrontProxy in the featuretoggles Process method, which aligns with the primary modifications in the changeset.
Description check ✅ Passed The description is directly related to the changeset, explaining the purpose of adding gates to prevent exponential backoff and secret checks before applying KCP manifests.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/subroutines/featuretoggles_test.go (1)

57-60: Populate OperatorConfig with required KCP configuration values.

The test creates an empty OperatorConfig, but the code under test now requires KCP-specific configuration fields (KCP.RootShardName, KCP.FrontProxyName, KCP.Namespace, KCP.ClusterAdminSecretName). Without these values, the test queries resources with empty names and namespaces, which does not validate the actual runtime behavior.

Apply this diff to populate the required configuration:

 func (s *FeaturesTestSuite) TestProcess() {
-	operatorCfg := config.OperatorConfig{}
+	operatorCfg := config.OperatorConfig{
+		KCP: config.KCPConfig{
+			RootShardName:            "root-shard",
+			FrontProxyName:           "front-proxy",
+			Namespace:                "kcp-system",
+			ClusterAdminSecretName:   "kcp-admin-kubeconfig",
+		},
+	}

And update the secret mock expectation at lines 63-67:

 	// Mock the kubeconfig secret lookup
 	s.clientMock.EXPECT().
 		Get(mock.Anything, types.NamespacedName{
-			Name:      "",
-			Namespace: "",
+			Name:      "kcp-admin-kubeconfig",
+			Namespace: "kcp-system",
 		}, mock.AnythingOfType("*v1.Secret")).
🧹 Nitpick comments (3)
pkg/subroutines/featuretoggles.go (2)

67-67: Consider passing operator configuration to avoid redundant context lookups.

The operator configuration is loaded from context twice: once in Process (line 67) and again in FeatureGettingStarted (line 112). While not a critical issue, this introduces minor overhead.

Consider passing the loaded config as a parameter to FeatureGettingStarted:

-func (r *FeatureToggleSubroutine) FeatureGettingStarted(ctx context.Context, inst *corev1alpha1.PlatformMesh) (ctrl.Result, errors.OperatorError) {
+func (r *FeatureToggleSubroutine) FeatureGettingStarted(ctx context.Context, inst *corev1alpha1.PlatformMesh, operatorCfg config.OperatorConfig) (ctrl.Result, errors.OperatorError) {
 	log := logger.LoadLoggerFromContext(ctx).ChildLogger("subroutine", r.GetName())
 
 	// Implement the logic to enable the getting started feature
 	log.Info().Msg("Getting started feature enabled")
 
-	operatorCfg := pmconfig.LoadConfigFromContext(ctx).(config.OperatorConfig)
-

And update the call site at line 97:

-return r.FeatureGettingStarted(ctx, inst)
+return r.FeatureGettingStarted(ctx, inst, operatorCfg)

Also applies to: 112-112


67-89: Separate error handling from condition checking for improved observability.

The readiness gates at lines 75 and 86 combine API Get errors with condition mismatches in a single log statement, making it difficult to distinguish between retrieval failures (missing resources, RBAC issues) and condition availability. The MatchesCondition function is accessible within the same package.

Apply the following improvement to both RootShard and FrontProxy gates:

 // Gate on KCP RootShard readiness
 rootShard := &unstructured.Unstructured{}
 rootShard.SetGroupVersionKind(schema.GroupVersionKind{Group: "operator.kcp.io", Version: "v1alpha1", Kind: "RootShard"})
-if err := r.client.Get(ctx, types.NamespacedName{
+if err := r.client.Get(ctx, types.NamespacedName{
 	Name:      operatorCfg.KCP.RootShardName,
 	Namespace: operatorCfg.KCP.Namespace,
-}, rootShard); err != nil || !MatchesCondition(rootShard, "Available") {
-	log.Info().Msg("RootShard is not ready.. Retry in 5 seconds")
+}, rootShard); err != nil {
+	log.Info().Err(err).Str("name", operatorCfg.KCP.RootShardName).Msg("Failed to get RootShard.. Retry in 5 seconds")
+	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
+}
+if !MatchesCondition(rootShard, "Available") {
+	log.Info().Str("name", operatorCfg.KCP.RootShardName).Msg("RootShard Available condition not met.. Retry in 5 seconds")
 	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
 }
pkg/subroutines/featuretoggles_test.go (1)

88-104: Consider distinguishing between RootShard and FrontProxy mocks for test clarity.

The unstructured mock returns the same response for all unstructured Get calls, which includes both RootShard and FrontProxy lookups. While this works for a happy path test, explicitly distinguishing between the two resource types would improve test clarity and make failures easier to diagnose.

Consider using separate mock expectations with MatchedBy to distinguish resource types:

// Mock RootShard lookup
s.clientMock.EXPECT().Get(mock.Anything, 
	types.NamespacedName{Name: "root-shard", Namespace: "kcp-system"},
	mock.MatchedBy(func(obj client.Object) bool {
		u, ok := obj.(*unstructured.Unstructured)
		return ok && u.GetKind() == "RootShard"
	})).
	RunAndReturn(func(ctx context.Context, nn types.NamespacedName, obj client.Object, opts ...client.GetOption) error {
		unstructuredObj := obj.(*unstructured.Unstructured)
		unstructuredObj.Object = map[string]interface{}{
			"status": map[string]interface{}{
				"conditions": []interface{}{
					map[string]interface{}{"type": "Available", "status": "True"},
				},
			},
		}
		return nil
	})

// Similar mock for FrontProxy
📜 Review details

Configuration used: Repository: platform-mesh/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7edbfdb and c15a50f.

📒 Files selected for processing (4)
  • Taskfile.yml (1 hunks)
  • pkg/subroutines/featuretoggles.go (3 hunks)
  • pkg/subroutines/featuretoggles_test.go (3 hunks)
  • test/e2e/kind/suite_kind_test.go (0 hunks)
💤 Files with no reviewable changes (1)
  • test/e2e/kind/suite_kind_test.go
🔇 Additional comments (2)
Taskfile.yml (1)

86-91: Correct removal of GitHub token handling from CI test flow.

The CI test command modification appropriately removes GH_TOKEN dependency while preserving test execution logic and error diagnostics. The error handler structure is correct, with explicit exit code and kubectl diagnostics retained.

This aligns with the PR objective to remove GitHub registry authentication from test setup procedures.

pkg/subroutines/featuretoggles.go (1)

114-124: The GetSecret function has been verified to exist at pkg/subroutines/subroutine_helpers.go:76. Since the function is defined within the same package, it is accessible to featuretoggles.go without requiring any additional imports. The code usage is correct.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Repository: platform-mesh/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c15a50f and 8f10c1e.

📒 Files selected for processing (2)
  • pkg/subroutines/featuretoggles.go (3 hunks)
  • pkg/subroutines/featuretoggles_test.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/subroutines/featuretoggles.go (3)
internal/config/config.go (1)
  • OperatorConfig (4-38)
pkg/subroutines/kcpsetup.go (1)
  • MatchesCondition (143-160)
pkg/subroutines/subroutine_helpers.go (1)
  • GetSecret (76-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pipe / dockerBuild / docker
  • GitHub Check: pipe / testSource / test
  • GitHub Check: Analyze (go)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/subroutines/featuretoggles.go (1)

70-98: Consider extracting duplicate gating logic into a helper function.

The RootShard and FrontProxy readiness checks follow an identical pattern, differing only in resource names and log messages. Extracting this logic into a helper function would reduce duplication and improve maintainability.

Example helper function:

func (r *FeatureToggleSubroutine) checkKCPResourceReady(
	ctx context.Context,
	gvk schema.GroupVersionKind,
	name, namespace string,
) (bool, errors.OperatorError) {
	log := logger.LoadLoggerFromContext(ctx).ChildLogger("subroutine", r.GetName())
	
	resource := &unstructured.Unstructured{}
	resource.SetGroupVersionKind(gvk)
	
	if err := r.client.Get(ctx, types.NamespacedName{
		Name:      name,
		Namespace: namespace,
	}, resource); err != nil {
		if kerrors.IsNotFound(err) {
			log.Info().Str("name", name).Str("kind", gvk.Kind).Msg("Resource not found yet.. Retry in 5 seconds")
			return false, nil
		}
		return false, errors.NewOperatorError(errors.Wrap(err, "Failed to get "+gvk.Kind), true, true)
	}
	
	if !MatchesCondition(resource, "Available") {
		log.Info().Str("name", name).Str("kind", gvk.Kind).Msg("Resource Available condition not met.. Retry in 5 seconds")
		return false, nil
	}
	
	return true, nil
}

Then use it in Process:

// Gate on KCP RootShard readiness
ready, err := r.checkKCPResourceReady(
	ctx,
	schema.GroupVersionKind{Group: "operator.kcp.io", Version: "v1alpha1", Kind: "RootShard"},
	operatorCfg.KCP.RootShardName,
	operatorCfg.KCP.Namespace,
)
if err != nil {
	return ctrl.Result{}, err
}
if !ready {
	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}

// Gate on KCP FrontProxy readiness
ready, err = r.checkKCPResourceReady(
	ctx,
	schema.GroupVersionKind{Group: "operator.kcp.io", Version: "v1alpha1", Kind: "FrontProxy"},
	operatorCfg.KCP.FrontProxyName,
	operatorCfg.KCP.Namespace,
)
if err != nil {
	return ctrl.Result{}, err
}
if !ready {
	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}
📜 Review details

Configuration used: Repository: platform-mesh/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f10c1e and 2bc7abc.

📒 Files selected for processing (1)
  • pkg/subroutines/featuretoggles.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/subroutines/featuretoggles.go (2)
internal/config/config.go (1)
  • OperatorConfig (4-38)
pkg/subroutines/kcpsetup.go (1)
  • MatchesCondition (143-160)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pipe / dockerBuild / docker
  • GitHub Check: pipe / testSource / test
  • GitHub Check: Analyze (go)
🔇 Additional comments (1)
pkg/subroutines/featuretoggles.go (1)

121-135: Well-implemented secret existence check.

The secret check correctly addresses the past review comment by calling r.client.Get directly, which allows kerrors.IsNotFound to work properly without error-wrapping interference. The error handling appropriately distinguishes between expected NotFound scenarios (requeue with 5s delay) and actual errors (return operator error).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Repository: platform-mesh/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bc7abc and 732eb68.

📒 Files selected for processing (1)
  • pkg/subroutines/featuretoggles.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/subroutines/featuretoggles.go (2)
internal/config/config.go (1)
  • OperatorConfig (4-38)
pkg/subroutines/kcpsetup.go (1)
  • MatchesCondition (143-160)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pipe / dockerBuild / docker
  • GitHub Check: pipe / testSource / test
  • GitHub Check: Analyze (go)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants