
using resource id as the work metadata name #434

Merged
clyang82 merged 1 commit into openshift-online:main from skeeey:res_id
Jan 14, 2026

Conversation


@skeeey skeeey commented Dec 11, 2025


coderabbitai bot commented Dec 11, 2025

Walkthrough

Adds a helper to align payload.metadata.name with Resource.ID in the CloudEvents codec, expands codec unit tests, refactors the gRPC broker to use the codec for encode/decode, introduces dual-source end-to-end tests and helpers, and updates docs/example JSONs to reflect name and status-field changes.

Changes

  • CloudEvents codec (pkg/client/cloudevents/codec.go): adds the non-exported helper resetPayloadMetadataNameWithResID(res *api.Resource) error, invokes it from Encode, and propagates helper errors.
  • Codec unit tests (pkg/client/cloudevents/codec_test.go): adds comprehensive tests for codec creation, data type reporting, Encode (multiple payload scenarios), Decode (success and error paths), helper behavior, and encode/decode roundtrips.
  • gRPC broker refactor (cmd/maestro/server/grpc_broker.go): replaces manual CloudEvent extension extraction/construction with codec-based Decode/Encode using a package-level source = "maestro" codec; removes per-field extension wiring.
  • End-to-end tests & helpers (test/e2e/pkg/sourceclient_test.go): adds a dual-source monitoring e2e test scenario, duplicate watcher contexts, and the new helpers AssertReplicas() and NewReadonlyWork().
  • Docs & example resources (docs/maestro.md, docs/resources/appliedmanifestwork.json, docs/resources/cloudevents.spec.maestro.json, docs/resources/cloudevents.status.maestro.json): documents that Maestro uses the Resource ID as the ManifestWork name; updates example JSONs (metadata.name changes) and adds the new status fields sequenceid and statushash to the status sample.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: 1 passed, 2 not passed (1 warning, 1 inconclusive)

❌ Failed checks
  • Docstring Coverage (⚠️ Warning): docstring coverage is 13.33%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
  • Description check (❓ Inconclusive): no pull request description was provided, only a title with no supporting context or details. Resolution: add a description explaining why the resource ID should be used as the metadata name, what problem this solves, and the impact on the system.

✅ Passed checks
  • Title check: the title directly describes the main change, using resource ID as the metadata name in work objects, which is the core functional change across all modified files.


Comment @coderabbitai help to get the list of available commands and usage tips.


skeeey commented Dec 11, 2025

/hold


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
pkg/client/cloudevents/codec.go (1)

120-135: Consider defensive nil check for res.Payload.

The function assumes res.Payload is non-nil. While the caller context (Encode) appears to have the same assumption, adding a defensive check could prevent panics from unexpected nil values.

 func resetPayloadMetadataNameWithResID(res *api.Resource) error {
+	if res.Payload == nil {
+		return nil
+	}
+
 	metadata, ok := res.Payload[cetypes.ExtensionWorkMeta]
 	if !ok {
pkg/client/cloudevents/codec_test.go (3)

130-141: Test case doesn't fully verify the metadata name reset.

The validation function logs the metadata extension but doesn't assert that the metadata name was actually reset to the resource ID. Consider adding an explicit assertion.

 			validateEvent: func(t *testing.T, evt *cloudevents.Event) {
 				ext := evt.Extensions()
 				// Check resource version - CloudEvents stores numeric extensions as int32
 				if version, ok := ext[cetypes.ExtensionResourceVersion].(int32); !ok || version != 2 {
 					t.Errorf("expected resourceVersion 2 but got: %v (type: %T)", ext[cetypes.ExtensionResourceVersion], ext[cetypes.ExtensionResourceVersion])
 				}
-				// Verify that metadata name was reset to resource ID
-				if meta, ok := ext[cetypes.ExtensionWorkMeta]; ok {
-					t.Logf("Work metadata extension found: %v", meta)
+				// Verify that metadata name was reset to resource ID in payload
+				if meta, ok := ext[cetypes.ExtensionWorkMeta].(map[string]interface{}); ok {
+					if meta["name"] != resourceID {
+						t.Errorf("expected metadata name to be %s but got: %v", resourceID, meta["name"])
+					}
 				}
 			},

304-311: Simplify error prefix matching with strings.HasPrefix.

The current logic for checking if an error message starts with the expected prefix is complex and error-prone.

+import "strings"
+
 // Use Contains check for error messages since some include dynamic IDs
-if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-	if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-		t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-	}
-} else {
-	t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
+if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+	t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
 }

340-407: Missing test case for invalid metadata type.

The resetPayloadMetadataNameWithResID function returns an error when metadata is not a map[string]interface{}, but this error path is not tested.

Add a test case to cover the error scenario:

{
    name: "resource with invalid metadata type",
    resource: &api.Resource{
        Meta: api.Meta{
            ID: resourceID,
        },
        Payload: datatypes.JSONMap{
            cetypes.ExtensionWorkMeta: "invalid-string-type",
            "data": map[string]interface{}{
                "manifests": []interface{}{},
            },
        },
    },
    expectedError: true,
},

Then update the test runner to handle the expectedError field.

test/e2e/pkg/sourceclient_test.go (2)

10-10: Consider using standard library or existing dependencies for Contains.

Importing github.com/bxcodec/faker/v3/support/slice solely for slice.Contains introduces an unnecessary dependency. Consider using a simple helper function or k8s.io/apimachinery/pkg/util/sets which is already imported.

-import (
-	"github.com/bxcodec/faker/v3/support/slice"
-)

// Then replace slice.Contains(appliedWorkNames, firstWorkUID) with:
// sets.NewString(appliedWorkNames...).Has(firstWorkUID)

936-946: Fragile JSON string matching for replicas assertion.

Using strings.Contains with an escaped JSON pattern is brittle and may fail if JSON formatting changes. Consider parsing the JSON properly.

 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
+			for _, value := range manifest.StatusFeedbacks.Values {
+				if value.Name == "readyReplicas" && value.Value.Integer != nil {
+					if *value.Value.Integer == int64(replicas) {
+						return nil
+					}
+				}
 			}
 		}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 7f29bb6 and 84921e3.

📒 Files selected for processing (4)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
pkg/client/cloudevents/codec_test.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
🔇 Additional comments (7)
test/e2e/pkg/suite_test.go (1)

16-25: LGTM!

The added imports for grpc and klog/v2 are correctly used in the file—grpc for the client connection type and klog for structured logging in test reporting and debug output.

pkg/client/cloudevents/codec.go (1)

35-40: LGTM!

The integration of resetPayloadMetadataNameWithResID ensures the payload metadata name is consistently set to the resource ID before encoding, which aligns with the PR objective of using resource ID as the work metadata name.

pkg/client/cloudevents/codec_test.go (3)

18-29: LGTM!

Constructor test properly validates that the codec is created with the expected sourceID.


41-103: LGTM!

Comprehensive test case for encoding resources without metadata, including proper validation of event source, type, and all CloudEvents extensions.


409-479: LGTM!

The roundtrip test comprehensively validates that encoding and decoding preserves resource identity and status information.

test/e2e/pkg/sourceclient_test.go (2)

680-795: LGTM!

The test comprehensively validates that two independent source work clients can monitor the same workload with separate ManifestWorks, and properly handles the lifecycle including creation, status updates, and deletion without interfering with each other.


995-1048: LGTM!

NewReadonlyWork correctly constructs a read-only ManifestWork with appropriate feedback rules for monitoring deployment status without modifying it.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
test/e2e/pkg/sourceclient_test.go (1)

639-639: Typo in test description.

The word "fist" should be "first".

Apply this diff:

-			By("create fist work client", func() {
+			By("create first work client", func() {
🧹 Nitpick comments (1)
test/e2e/pkg/sourceclient_test.go (1)

922-949: Consider using structured JSON parsing instead of string matching.

The current implementation uses strings.Contains(string(feedbackJson), fmt.Sprintf(...)) which could have false positives. For more robust validation, consider unmarshaling the JSON and checking the field value directly.

Example refactor:

 func AssertReplicas(watchedWorks []*workv1.ManifestWork, name string, replicas int32) error {
 	var latestWatchedWork *workv1.ManifestWork
 	for i := len(watchedWorks) - 1; i >= 0; i-- {
 		if watchedWorks[i].Name == name {
 			latestWatchedWork = watchedWorks[i]
 			break
 		}
 	}
 
 	if latestWatchedWork == nil {
 		return fmt.Errorf("the work %s not watched", name)
 	}
 
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
+			type Feedback struct {
+				ReadyReplicas *int32 `json:"readyReplicas"`
+			}
+			var feedbacks []Feedback
+			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
 			if err != nil {
 				return err
 			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
+			if err := json.Unmarshal(feedbackJson, &feedbacks); err != nil {
+				return err
+			}
+			for _, fb := range feedbacks {
+				if fb.ReadyReplicas != nil && *fb.ReadyReplicas == replicas {
+					return nil
+				}
+			}
-				return nil
-			}
 		}
 	}
 
 	return fmt.Errorf("the expected replicas %d is not found from feedback", replicas)
 }
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 84921e3 and 7f860cf.

📒 Files selected for processing (4)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/pkg/suite_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/client/cloudevents/codec_test.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (1)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
  • GitHub Check: upgrade
🔇 Additional comments (5)
pkg/client/cloudevents/codec.go (2)

35-39: LGTM! Pre-conversion step correctly ensures payload metadata name.

The pre-conversion step properly calls resetPayloadMetadataNameWithResID and propagates any errors before encoding the resource. This ensures the payload metadata name is unique on the agent side.


120-135: LGTM! Helper function correctly resets payload metadata name.

The helper function correctly:

  • Handles the no-op case when metadata is absent
  • Validates the metadata type and returns an appropriate error
  • Sets the metadata name to the resource ID
  • Mutates the resource in place
test/e2e/pkg/sourceclient_test.go (2)

590-797: LGTM! Comprehensive test for two independent source work clients.

The new test properly validates that two source work clients can independently monitor a single workload, including:

  • Creating two independent ManifestWorks with distinct UIDs
  • Verifying both watchers receive updates
  • Confirming deletion of one work doesn't affect the other
  • Ensuring the underlying deployment UID remains stable

995-1048: LGTM! Well-structured helper for creating read-only ManifestWork.

The NewReadonlyWork function properly constructs a ManifestWork with:

  • Appropriate labels for test identification
  • Read-only update strategy
  • Feedback rules to capture deployment status
pkg/client/cloudevents/codec_test.go (1)

1-482: LGTM! Comprehensive test coverage for codec functionality.

The test suite provides excellent coverage:

  • TestNewCodec validates the constructor
  • TestEventDataType verifies the data type configuration
  • TestEncode covers encoding with and without metadata, with deletion timestamps, and validates all extensions
  • TestDecode covers valid and invalid decoding scenarios including missing extensions and mismatched source IDs
  • TestResetPayloadMetadataNameWithResID validates the helper function behavior for both with and without metadata cases
  • TestEncodeDecodeRoundtrip ensures the full encode/decode cycle maintains data integrity

The error handling in TestResetPayloadMetadataNameWithResID (lines 401-404) correctly checks for unexpected errors.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
test/upgrade/pkg/upgrade_test.go (1)

46-57: Consider separating resource creation from readiness polling.

The pattern of wrapping both Get and Create operations inside Eventually blocks means that non-transient creation errors (e.g., validation failures, permission denials) will be retried for the full 5-minute timeout. This can mask real issues and delay failure feedback.

Recommended approach:

  1. Perform the conditional Get/Create once outside Eventually
  2. Use Eventually only to poll for resource readiness/availability after creation

This separates transient readiness delays from permanent creation failures, providing faster feedback when something is misconfigured.

Example for the first block:

-		By("create a deployment for readonly work", func() {
-			Eventually(func() error {
-				_, err := deployClient.Get(ctx, deployReadonlyName, metav1.GetOptions{})
-				if errors.IsNotFound(err) {
-					deploy := utils.NewDeployment(namespace, deployReadonlyName, 0)
-					_, newErr := deployClient.Create(ctx, deploy, metav1.CreateOptions{})
-					return newErr
-				}
-				if err != nil {
-					return err
-				}
-				return nil
-			}).WithTimeout(timeout).WithPolling(polling).ShouldNot(HaveOccurred())
+		By("create a deployment for readonly work", func() {
+			_, err := deployClient.Get(ctx, deployReadonlyName, metav1.GetOptions{})
+			if errors.IsNotFound(err) {
+				deploy := utils.NewDeployment(namespace, deployReadonlyName, 0)
+				_, err = deployClient.Create(ctx, deploy, metav1.CreateOptions{})
+				Expect(err).NotTo(HaveOccurred())
+			} else {
+				Expect(err).NotTo(HaveOccurred())
+			}
+			// If needed, add Eventually block here to wait for deployment readiness

Also applies to: 64-80, 86-102, 108-124

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 76186ff and e7536d7.

📒 Files selected for processing (2)
  • test/upgrade/pkg/upgrade_test.go (12 hunks)
  • test/upgrade/script/run.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
🔇 Additional comments (3)
test/upgrade/script/run.sh (1)

17-17: LGTM: Timeout increase aligns with test robustness improvements.

The 6x timeout increase (5m → 30m) is appropriate given the new polling mechanisms in the Go tests, which introduce multiple 5-minute Eventually blocks that run sequentially during setup.

test/upgrade/pkg/upgrade_test.go (2)

31-34: LGTM: Standardized timeout and polling constants.

The 5-minute timeout and 10-second polling interval provide consistent, reasonable values for eventual consistency checks across all test scenarios.


145-145: LGTM: Explicit timeout and polling improve test reliability.

Consistently applying explicit .WithTimeout(timeout).WithPolling(polling) to all Eventually assertions makes test behavior predictable and reduces flakiness.

Also applies to: 175-175, 194-194, 204-204, 218-218, 239-239, 250-250, 261-261, 290-290, 304-304


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between e7536d7 and 079335a.

📒 Files selected for processing (2)
  • test/e2e/istio/test.sh (1 hunks)
  • test/upgrade/script/run.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e
  • GitHub Check: upgrade
🔇 Additional comments (2)
test/upgrade/script/run.sh (1)

76-76: Good use of explicit job resource for log retrieval.

The change from label-based log selection to explicit job resource (jobs/maestro-upgrade-tests) is more targeted and aligns with the PR objective. Both success and failure paths now retrieve logs consistently.

Also applies to: 82-82

test/e2e/istio/test.sh (1)

84-84: Good use of explicit job resource for log retrieval.

The success path correctly retrieves logs from the specific job resource with the kubeconfig flag, aligning with the PR objective.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
cmd/maestro/server/grpc_broker.go (2)

25-33: Importing the shared codec and introducing a source constant improves consistency

Using github.com/openshift-online/maestro/pkg/client/cloudevents plus a single const source = "maestro" keeps encode/decode aligned and makes the CloudEvents source explicit in one place in this file. If "maestro" is (or becomes) used in multiple packages, consider lifting it to a shared constant to avoid accidental drift, but it’s fine as-is here.


308-318: encodeResourceSpec refactor aligns spec encoding with shared Codec behavior

Switching encodeResourceSpec to codec.Encode(source, eventType, resource) keeps spec encoding and status decoding in sync through the same abstraction and reduces duplication of CloudEvent construction details. If the codec is stateless, you could optionally reuse a single instance instead of creating it per call, but that’s a micro-optimization and not required.
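The single-shared-instance suggestion can be sketched with toy stand-ins for the real Codec (the actual type lives in pkg/client/cloudevents and produces full CloudEvents; the struct, method body, and event-type string below are illustrative only):

```go
package main

import "fmt"

// Codec is a toy stand-in: stateless apart from its sourceID, so a single
// package-level instance can serve every encode call in the broker.
type Codec struct{ sourceID string }

func NewCodec(sourceID string) *Codec { return &Codec{sourceID: sourceID} }

func (c *Codec) Encode(source, eventType, payload string) string {
	return fmt.Sprintf("event[source=%s type=%s data=%s]", source, eventType, payload)
}

const source = "maestro"

// One shared instance instead of calling NewCodec(source) inside each function.
var codec = NewCodec(source)

func encodeResourceSpec(payload string) string {
	return codec.Encode(source, "create_request", payload)
}

func main() {
	fmt.Println(encodeResourceSpec(`{"manifests":[]}`))
}
```

Because the codec holds only its source ID, sharing one instance is safe for concurrent calls and avoids a small per-call allocation.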

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 2439a89 and 40ecafb.

📒 Files selected for processing (1)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/resource_types.go (1)
  • Resource (14-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
🔇 Additional comments (1)
cmd/maestro/server/grpc_broker.go (1)

302-306: decodeResourceStatus delegation to Codec is a clean simplification

Having decodeResourceStatus do nothing but construct a codec and call Decode removes local parsing details and ensures status decoding logic lives in one place (the codec). This should make it easier to evolve the CloudEvent schema without touching the broker.

@skeeey skeeey force-pushed the res_id branch 2 times, most recently from 4cdf287 to a11abc6 on December 12, 2025 at 04:14

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/e2e/pkg/sourceclient_test.go (1)

800-832: Data race: WatchedResult.WatchedWorks is mutated in a goroutine and read concurrently.
This will flake and should fail under -race. Protect the slice (mutex) or serialize access (channel).

 type WatchedResult struct {
-    WatchedWorks []*workv1.ManifestWork
+    mu          sync.RWMutex
+    WatchedWorks []*workv1.ManifestWork
 }

 func StartWatch(ctx context.Context, watcher watch.Interface) *WatchedResult {
     result := &WatchedResult{WatchedWorks: []*workv1.ManifestWork{}}
     go func() {
         ch := watcher.ResultChan()
@@
                 switch event.Type {
                 case watch.Modified:
                     if work, ok := event.Object.(*workv1.ManifestWork); ok {
-                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Lock()
+                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Unlock()
                     }
                 case watch.Deleted:
                     if work, ok := event.Object.(*workv1.ManifestWork); ok {
-                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Lock()
+                        result.WatchedWorks = append(result.WatchedWorks, work)
+                        result.mu.Unlock()
                     }
                 }
             }
         }
     }()
@@
 }

Then, in readers (e.g., AssertReplicas, AssertWatchResult, Consistently blocks), take an RLock()/RUnlock() or copy a snapshot under lock.

♻️ Duplicate comments (1)
test/e2e/pkg/sourceclient_test.go (1)

639-652: “fist” typo is back (and the message loses useful context).
The log should read “first”, and it’s helpful to include which sourceID is being used to debug auth/routing issues. This was flagged before—looks like a regression.

-            By("create fist work client", func() {
+            By(fmt.Sprintf("create first work client (source=%s)", firstSourceID), func() {
                 var err error
                 firstSourceWorkClient, err = grpcsource.NewMaestroGRPCSourceWorkClient(
                     ctx,
                     logger,
                     apiClient,
                     grpcOptions,
                     firstSourceID,
                 )
🧹 Nitpick comments (6)
pkg/client/cloudevents/codec.go (1)

120-135: Note: Input mutation occurs.

This helper mutates res.Payload in place. Since Encode is a public method and callers may not expect their *api.Resource to be modified, consider documenting this side effect in the Encode method's comment, or working on a copy if preservation of the original is important.
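If preserving the caller's resource matters, the copy-on-write alternative mentioned here could look like the following sketch on plain maps (the real payload type is datatypes.JSONMap; names are illustrative):

```go
package main

import "fmt"

// withResetName returns a shallow copy of payload in which only the
// metadata map is deep-copied and its "name" replaced, leaving the
// caller's original maps untouched.
func withResetName(payload map[string]interface{}, resID string) map[string]interface{} {
	out := make(map[string]interface{}, len(payload))
	for k, v := range payload {
		out[k] = v
	}
	if meta, ok := out["metadata"].(map[string]interface{}); ok {
		metaCopy := make(map[string]interface{}, len(meta))
		for k, v := range meta {
			metaCopy[k] = v
		}
		metaCopy["name"] = resID
		out["metadata"] = metaCopy
	}
	return out
}

func main() {
	orig := map[string]interface{}{"metadata": map[string]interface{}{"name": "old"}}
	copied := withResetName(orig, "res-id-123")
	fmt.Println(orig["metadata"].(map[string]interface{})["name"])   // old
	fmt.Println(copied["metadata"].(map[string]interface{})["name"]) // res-id-123
}
```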

cmd/maestro/server/grpc_broker.go (1)

304-317: Consider reusing the codec instance.

Both decodeResourceStatus and encodeResourceSpec create a new Codec instance on each call. Since the codec is stateless (only stores the sourceID), consider creating a package-level codec instance or caching it to avoid repeated allocations.

+var codecInstance = cloudevents.NewCodec(source)
+
 // decodeResourceStatus translates a CloudEvent into a resource containing the status JSON map.
 func decodeResourceStatus(evt *ce.Event) (*api.Resource, error) {
-	codec := cloudevents.NewCodec(source)
-	return codec.Decode(evt)
+	return codecInstance.Decode(evt)
 }

 // encodeResourceSpec translates a resource spec JSON map into a CloudEvent.
 func encodeResourceSpec(resource *api.Resource) (*ce.Event, error) {
 	eventType := types.CloudEventsType{
 		CloudEventsDataType: workpayload.ManifestBundleEventDataType,
 		SubResource:         types.SubResourceSpec,
 		Action:              types.EventAction("create_request"),
 	}

-	codec := cloudevents.NewCodec(source)
-	return codec.Encode(source, eventType, resource)
+	return codecInstance.Encode(source, eventType, resource)
 }
pkg/client/cloudevents/codec_test.go (1)

304-311: Simplify error message validation.

The current error prefix checking logic using string slicing is complex and could panic if the error message is shorter than expected. Use strings.HasPrefix or strings.Contains instead.

+import "strings"
+
 // Use Contains check for error messages since some include dynamic IDs
-if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-	if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-		t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-	}
-} else {
-	t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-}
+if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+	t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+}
test/setup/env_setup.sh (1)

72-72: Clarify why build step is disabled.

The build step is commented out without explanation. Consider adding a comment explaining why (e.g., "Images are pre-built in CI" or "Uncomment to build images locally").

test/e2e/pkg/sourceclient_test.go (2)

10-10: Avoid github.com/bxcodec/faker/v3/support/slice in prod tests; prefer stdlib slices.Contains.
This extra dep is surprising here; if Go version allows, slices.Contains(appliedWorkNames, firstWorkUID) is clearer and avoids pulling faker utils into e2e tests.


922-949: AssertReplicas is brittle (substring match on marshaled JSON).
This can break with JSON ordering/escaping/schema changes and can produce false positives. Prefer decoding manifest.StatusFeedbacks into a structured form and checking the specific field.
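A structured check could look like the following, using hypothetical stand-in types that mirror the shape of workv1 feedback values (the field names are illustrative, not the exact API):

```go
package main

import "fmt"

// Stand-ins mirroring the shape of ManifestWork status feedback values.
type FieldValue struct{ Integer *int64 }
type FeedbackValue struct {
	Name  string
	Value FieldValue
}

// hasReadyReplicas inspects decoded feedback values directly instead of
// substring-matching marshaled JSON, so escaping or key-ordering changes
// cannot produce false positives or negatives.
func hasReadyReplicas(values []FeedbackValue, want int64) bool {
	for _, v := range values {
		if v.Name == "readyReplicas" && v.Value.Integer != nil && *v.Value.Integer == want {
			return true
		}
	}
	return false
}

func main() {
	n := int64(3)
	values := []FeedbackValue{{Name: "readyReplicas", Value: FieldValue{Integer: &n}}}
	fmt.Println(hasReadyReplicas(values, 3)) // true
}
```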

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 40ecafb and 4cdf287.

📒 Files selected for processing (7)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
  • test/setup/env_setup.sh (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • test/e2e/pkg/suite_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: upgrade
🔇 Additional comments (3)
pkg/client/cloudevents/codec.go (1)

35-39: LGTM - Ensures unique work metadata name.

The pre-encoding step to reset payload metadata name to resource ID ensures uniqueness on the agent side. Error handling is appropriate.

cmd/maestro/server/grpc_broker.go (1)

32-33: LGTM - Source constant for codec.

Defining the source as a constant improves maintainability and consistency across encode/decode operations.

pkg/client/cloudevents/codec_test.go (1)

412-482: LGTM - Good roundtrip test coverage.

The roundtrip test effectively validates the encode/decode flow by simulating a realistic scenario where a resource is encoded, transformed to a status update, and decoded back.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/maestro.md (1)

38-42: Complete steps 3–5 and fix step 1–2 grammar (currently incomplete/unclear).
Steps 3–5 are placeholders (“The Maestro server…”, “The Maestro agent…”) and don’t describe actions/outcomes; steps 1–2 also need minor grammar fixes (“uses”, “to create”, “publishes”).

Proposed patch:

-1. The Consumer (e.g. ClustersService) use Maestro GRPCSourceClient create a ManifestWork
-2. The GRPCSourceClient publish the ManifestWork using CloudEvents
-3. The Maestro server 
-4. The Maestro server 
-5. The Maestro agent
+1. The Consumer (e.g., ClustersService) uses Maestro GRPCSourceClient to create a ManifestWork.
+2. The GRPCSourceClient publishes the ManifestWork as a CloudEvent over MQTT.
+3. The Maestro server receives the CloudEvent, validates it, and persists/enqueues the ManifestWork for processing.
+4. The Maestro server schedules the work to the appropriate agent/cluster, tracks progress, and updates ManifestWork status.
+5. The Maestro agent receives assigned ManifestWork, applies the manifests to the target cluster, and reports status back to the Maestro server.
pkg/client/cloudevents/codec_test.go (1)

105-141: Missing assertion: “encode resource with metadata” doesn’t verify the name reset.
Right now the test only logs the WorkMeta extension, so it won’t fail if the metadata name is not rewritten to resourceID.

 		{
 			name:      "encode resource with metadata",
 			source:    "test-source",
 			eventType: cetypes.CloudEventsType{CloudEventsDataType: workpayload.ManifestBundleEventDataType, SubResource: cetypes.SubResourceSpec, Action: "create"},
 			resource: &api.Resource{
@@
 			},
 			validateEvent: func(t *testing.T, evt *cloudevents.Event) {
 				ext := evt.Extensions()
 				// Check resource version - CloudEvents stores numeric extensions as int32
 				if version, ok := ext[cetypes.ExtensionResourceVersion].(int32); !ok || version != 2 {
 					t.Errorf("expected resourceVersion 2 but got: %v (type: %T)", ext[cetypes.ExtensionResourceVersion], ext[cetypes.ExtensionResourceVersion])
 				}
-				// Verify that metadata name was reset to resource ID
-				if meta, ok := ext[cetypes.ExtensionWorkMeta]; ok {
-					t.Logf("Work metadata extension found: %v", meta)
-				}
+				// Verify that metadata name was reset to resource ID
+				metaAny, ok := ext[cetypes.ExtensionWorkMeta]
+				if !ok {
+					t.Fatalf("expected %q extension to be set", cetypes.ExtensionWorkMeta)
+				}
+				meta, ok := metaAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected %q to be map[string]interface{} but got %T", cetypes.ExtensionWorkMeta, metaAny)
+				}
+				// payload shape here uses: { ..., "metadata": {"name": ...} }
+				mdAny, ok := meta["metadata"]
+				if !ok {
+					t.Fatalf("expected %q to contain \"metadata\" but got: %v", cetypes.ExtensionWorkMeta, meta)
+				}
+				md, ok := mdAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected \"metadata\" to be map[string]interface{} but got %T", mdAny)
+				}
+				if md["name"] != resourceID {
+					t.Errorf("expected metadata.name %q but got: %v", resourceID, md["name"])
+				}
 			},
 		},
🧹 Nitpick comments (1)
pkg/client/cloudevents/codec_test.go (1)

294-312: Brittle error assertion in TestDecode (comment says “Contains”, code does prefix slicing).
This is hard to read and easy to get wrong—prefer strings.HasPrefix (or strings.Contains) directly.

+	"strings"
@@
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+				}
 				return
 			}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4cdf287 and fd380f0.

📒 Files selected for processing (6)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/sourceclient_test.go
  • cmd/maestro/server/grpc_broker.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
🔇 Additional comments (2)
test/e2e/pkg/suite_test.go (1)

16-18: Import changes look consistent with usage (grpc + klog).
grpc.ClientConn and klog are both referenced in this file, so the import adjustments look fine.

Also applies to: 25-25

pkg/client/cloudevents/codec_test.go (1)

340-410: Good: TestResetPayloadMetadataNameWithResID now fails fast on unexpected errors.
Catching the returned error makes the test more reliable.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/client/cloudevents/codec_test.go (1)

106-143: Fix metadata-name reset assertion to avoid false-green subtests
Your check only runs if ext[cetypes.ExtensionWorkMeta] is a string; if it’s a map[string]any (or something else), the assertion is silently skipped (Lines 137-142). This can let regressions pass. Also, strings.Contains is a weak signal vs asserting the actual metadata.name.

Since this is the same area as the earlier “verify metadata name reset” feedback, I’m marking it duplicate, but the current implementation still has a correctness gap.

 			validateEvent: func(t *testing.T, evt *cloudevents.Event) {
 				ext := evt.Extensions()
 				// Check resource version - CloudEvents stores numeric extensions as int32
 				if version, ok := ext[cetypes.ExtensionResourceVersion].(int32); !ok || version != 2 {
 					t.Errorf("expected resourceVersion 2 but got: %v (type: %T)", ext[cetypes.ExtensionResourceVersion], ext[cetypes.ExtensionResourceVersion])
 				}
-				// Verify that metadata name was reset to resource ID
-				if meta, ok := ext[cetypes.ExtensionWorkMeta].(string); ok {
-					if !strings.Contains(meta, resourceID) {
-						t.Errorf("expected metadata name %s, but got: %v", resourceID, meta)
-					}
-				}
+				// Verify that metadata name was reset to resource ID (fail if the shape is unexpected)
+				metaAny, ok := ext[cetypes.ExtensionWorkMeta]
+				if !ok {
+					t.Fatalf("expected %q extension to be set", cetypes.ExtensionWorkMeta)
+				}
+				metaMap, ok := metaAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected %q extension to be map[string]interface{} but got %T", cetypes.ExtensionWorkMeta, metaAny)
+				}
+				mdAny, ok := metaMap["metadata"]
+				if !ok {
+					t.Fatalf("expected %q.metadata to be set", cetypes.ExtensionWorkMeta)
+				}
+				mdMap, ok := mdAny.(map[string]interface{})
+				if !ok {
+					t.Fatalf("expected %q.metadata to be map[string]interface{} but got %T", cetypes.ExtensionWorkMeta, mdAny)
+				}
+				if mdMap["name"] != resourceID {
+					t.Errorf("expected metadata name %q but got: %v", resourceID, mdMap["name"])
+				}
 			},
🧹 Nitpick comments (2)
pkg/client/cloudevents/codec_test.go (2)

301-314: Replace manual prefix slicing with strings.HasPrefix (and align the comment)
The comment says “Contains check” but the code does a manual prefix compare via slicing (Lines 306-313), which is harder to read/maintain.

 			if c.expectedErrorMsg != "" {
 				if err == nil {
 					t.Errorf("expected error but got nil")
 					return
 				}
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
+				// Use prefix match for error messages that may include dynamic suffixes
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+				}
 				return
 			}

(You already import strings in this file.)


232-239: Don’t ignore evt.SetData(...) errors in tests
Even if it’s unlikely, ignoring these errors makes debugging much harder when something changes upstream.

-				_ = evt.SetData(cloudevents.ApplicationJSON, map[string]interface{}{
+				if err := evt.SetData(cloudevents.ApplicationJSON, map[string]interface{}{
 					"conditions": []interface{}{
 						map[string]interface{}{
 							"type":   "Applied",
 							"status": "True",
 						},
 					},
-				})
+				}); err != nil {
+					t.Fatalf("failed to set event data: %v", err)
+				}

Apply similarly at the other SetData call sites (Lines 289-291, 457-464).

Also applies to: 289-291, 457-464

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between fd380f0 and 9384e6d.

📒 Files selected for processing (1)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request

@skeeey skeeey force-pushed the res_id branch 2 times, most recently from 86764b3 to dfdfec0 Compare December 19, 2025 06:38

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
test/setup/env_setup.sh (1)

111-119: Image loading inconsistent with container tool logic and duplicates existing loads.

Lines 111-112 duplicate the maestro image loading already performed in lines 99-100 (Docker) or 103-108 (Podman), but load from a different registry (quay.io/redhat-user-workloads) without version tags. Additionally, all these kind load docker-image commands are placed outside the container tool conditional block (lines 98-109), which means they will fail or behave unexpectedly if the user is using Podman.

🔎 Proposed fix

Remove the duplicate maestro image loads (lines 111-112) and move the additional image loads (lines 113-117) inside the appropriate conditional blocks:

 if [ "$container_tool" = "docker" ]; then
     kind load docker-image ${external_image_registry}/maestro/maestro:$image_tag --name maestro
     kind load docker-image ${external_image_registry}/maestro/maestro-e2e:$image_tag --name maestro
+    kind load docker-image quay.io/nginx/nginx-unprivileged --name=maestro
+    kind load docker-image docker.io/istio/proxyv2:1.26.0 --name=maestro
+    kind load docker-image docker.io/istio/pilot:1.26.0 --name=maestro
+    kind load docker-image quay.io/maestro/postgres:17.2 --name=maestro
+    kind load docker-image quay.io/maestro/eclipse-mosquitto:2.0.18 --name=maestro
 else
     # related issue: https://github.com/kubernetes-sigs/kind/issues/2038
     podman save ${external_image_registry}/maestro/maestro:$image_tag -o /tmp/maestro.tar
     kind load image-archive /tmp/maestro.tar --name maestro
     rm /tmp/maestro.tar
     podman save ${external_image_registry}/maestro/maestro-e2e:$image_tag -o /tmp/maestro-e2e.tar
     kind load image-archive /tmp/maestro-e2e.tar --name maestro
     rm /tmp/maestro-e2e.tar
+    # Load additional images for Podman
+    for img in "quay.io/nginx/nginx-unprivileged" "docker.io/istio/proxyv2:1.26.0" "docker.io/istio/pilot:1.26.0" "quay.io/maestro/postgres:17.2" "quay.io/maestro/eclipse-mosquitto:2.0.18"; do
+        podman save "$img" -o "/tmp/$(echo $img | tr '/:' '_').tar"
+        kind load image-archive "/tmp/$(echo $img | tr '/:' '_').tar" --name maestro
+        rm "/tmp/$(echo $img | tr '/:' '_').tar"
+    done
 fi
-
-kind load docker-image quay.io/redhat-user-workloads/maestro-rhtap-tenant/maestro/maestro-e2e --name=maestro
-kind load docker-image quay.io/redhat-user-workloads/maestro-rhtap-tenant/maestro/maestro --name=maestro
-kind load docker-image quay.io/nginx/nginx-unprivileged --name=maestro
-kind load docker-image docker.io/istio/proxyv2:1.26.0 --name=maestro
-kind load docker-image docker.io/istio/pilot:1.26.0 --name=maestro
-kind load docker-image quay.io/maestro/postgres:17.2 --name=maestro
-kind load docker-image quay.io/maestro/eclipse-mosquitto:2.0.18 --name=maestro
test/e2e/pkg/sourceclient_test.go (2)

639-639: Typo: "fist" should be "first".

-			By("create fist work client", func() {
+			By("create first work client", func() {

670-678: Improve cleanup to handle both works and tolerate failures.

The AfterEach only deletes the work from secondSourceWorkClient. If the test fails before the first work is deleted (line 760), it will leak. Additionally, the cleanup should tolerate NotFound errors to handle partial test failures gracefully.

🔎 Proposed fix
 		AfterEach(func() {
+			// Try to cleanup both works, ignoring NotFound errors
+			err := firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+			if err != nil && !errors.IsNotFound(err) {
+				// Log but don't fail cleanup
+				GinkgoWriter.Printf("Warning: failed to cleanup first work: %v\n", err)
+			}
+
-			err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+			err = secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
-			Expect(err).ShouldNot(HaveOccurred())
+			if err != nil && !errors.IsNotFound(err) {
+				Expect(err).ShouldNot(HaveOccurred())
+			}
 
 			err = agentTestOpts.kubeClientSet.AppsV1().Deployments("default").Delete(ctx, deployName, metav1.DeleteOptions{})
 			Expect(err).To(Succeed())
 
 			watchCancel()
 		})
🧹 Nitpick comments (2)
pkg/client/cloudevents/codec_test.go (1)

307-313: Consider simplifying error message validation.

The current prefix-matching logic with length checks and slicing is harder to follow than necessary.

🔎 Simpler alternative
-			// Use Contains check for error messages since some include dynamic IDs
-			if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-				if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
-			} else {
-				t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-			}
+			// Use prefix check for error messages since some include dynamic IDs
+			if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+				t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+			}
test/e2e/pkg/sourceclient_test.go (1)

922-949: Consider more robust JSON parsing for status feedback.

The function works correctly but uses string matching on serialized JSON (line 942). While functional, this could be more robust by unmarshaling the feedback into a typed structure.

🔎 Optional improvement
 func AssertReplicas(watchedWorks []*workv1.ManifestWork, name string, replicas int32) error {
 	var latestWatchedWork *workv1.ManifestWork
 	for i := len(watchedWorks) - 1; i >= 0; i-- {
 		if watchedWorks[i].Name == name {
 			latestWatchedWork = watchedWorks[i]
 			break
 		}
 	}
 
 	if latestWatchedWork == nil {
 		return fmt.Errorf("the work %s not watched", name)
 	}
 
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
+			// Parse feedback values more robustly
+			for _, feedback := range manifest.StatusFeedbacks.Values {
+				if feedback.Name == "readyReplicas" {
+					var readyReplicas int32
+					if err := json.Unmarshal([]byte(*feedback.Value.JsonRaw), &readyReplicas); err != nil {
+						return fmt.Errorf("failed to parse readyReplicas: %w", err)
+					}
+					if readyReplicas == replicas {
+						return nil
+					}
+				}
+			}
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
-			}
 		}
 	}
 
 	return fmt.Errorf("the expected replicas %d is not found from feedback", replicas)
 }

Note: This assumes the feedback structure has a Values field with Name and Value.JsonRaw. Adjust based on the actual API structure.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9384e6d and 86764b3.

📒 Files selected for processing (8)
  • Makefile (1 hunks)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
  • test/setup/env_setup.sh (1 hunks)
  • test/upgrade/test.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/suite_test.go
  • cmd/maestro/server/grpc_broker.go
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/pkg/sourceclient_test.go (1)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
🔇 Additional comments (12)
pkg/client/cloudevents/codec_test.go (7)

19-30: LGTM!

The constructor test properly validates codec creation and sourceID assignment.


32-40: LGTM!

The data type validation is correct.


82-103: LGTM!

The validation thoroughly checks all CloudEvents extensions including source, type, resourceID, version, clusterName, and deletion timestamp handling.


164-176: LGTM!

The deletion timestamp validation correctly checks that the extension is set and that event ID and time are populated.


220-295: LGTM!

The decode test cases comprehensively cover valid scenarios and error conditions including invalid type, unsupported data type, missing extensions, and source ID mismatches.


343-417: LGTM!

The metadata reset tests properly validate both scenarios (with and without metadata) and include correct error handling and type assertions as addressed in previous reviews.


419-489: LGTM!

The roundtrip test properly validates that encoding and decoding preserve key resource fields (ID, Version, ConsumerName) and that status data is populated correctly.

test/upgrade/test.sh (1)

23-42: Clarify the intent of commented-out upgrade test steps.

Multiple test steps are commented out (initial deployment, workserver run, e2e tests, agent deployment), which makes the upgrade test workflow incomplete. The PR is marked with /hold, which might be related.

Could you clarify:

  1. Is this intentional for phased rollout, or should these steps be re-enabled?
  2. Are there specific blockers preventing these steps from running?
  3. Should these commented lines be removed if they're permanently disabled, or documented with TODO/FIXME if temporarily disabled?

This will help ensure the upgrade test workflow is clear for future maintainers.

Makefile (1)

504-506: LGTM - Clean decoupling of upgrade-test.

Making upgrade-test a standalone target is a reasonable change that allows it to be invoked independently. Preserving the old dependencies as a comment is good practice for maintaining context.

test/e2e/pkg/sourceclient_test.go (3)

5-31: LGTM - Import additions support new test functionality.

The new imports for JSON handling, slice utilities, Deployment types, runtime extensions, pointer helpers, and work client interface are all appropriately used in the new test code.


680-796: LGTM - Well-structured test with comprehensive verification.

The test properly validates that two source clients with different source IDs can independently manage the same workload:

  • Creates two separate works (verified by different UIDs)
  • Confirms two independent AppliedManifestWorks on the agent
  • Verifies status feedback propagates to both watchers
  • Validates deletion of one work doesn't affect the other or the underlying deployment

The use of work.Name throughout is correct given the source client's internal composite ID handling.


995-1048: LGTM - Well-configured read-only work helper.

The NewReadonlyWork helper is well-implemented:

  • Properly uses runtime.RawExtension with the Object field for the Deployment
  • Configures JSONPath feedback rules to capture the entire resource state
  • Correctly sets UpdateStrategyTypeReadOnly to monitor without modifying the deployment
  • Includes appropriate labels for test identification

This enables the test to verify status feedback from existing resources without interfering with their state.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
test/e2e/pkg/sourceclient_test.go (2)

639-639: Fix typo: "fist" should be "first".

The comment has a typo that was previously flagged but not corrected.

🔎 Proposed fix
-			By("create fist work client", func() {
+			By("create first work client", func() {

670-678: Improve cleanup robustness.

The AfterEach only deletes via secondSourceWorkClient, but the test creates two works (one per client). If the test fails before line 760 (where the first work is deleted), the first work won't be cleaned up.

Additionally, watchCancel() is called after the delete operations. If deletes hang or are slow, the watch continues consuming resources.

🔎 Proposed improvements
 		AfterEach(func() {
+			// Cancel watches first to stop consuming resources
+			watchCancel()
+
+			// Clean up both works, ignoring NotFound errors
+			_ = firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
 			err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
 			Expect(err).ShouldNot(HaveOccurred())
 
 			err = agentTestOpts.kubeClientSet.AppsV1().Deployments("default").Delete(ctx, deployName, metav1.DeleteOptions{})
 			Expect(err).To(Succeed())
-
-			watchCancel()
 		})
🧹 Nitpick comments (1)
test/e2e/pkg/sourceclient_test.go (1)

922-949: Refactor to use proper JSON unmarshaling instead of string matching.

The function uses string matching on marshaled JSON (line 942) to find readyReplicas, which is fragile and can break with formatting changes or different JSON encoders.

🔎 Proposed refactor
 func AssertReplicas(watchedWorks []*workv1.ManifestWork, name string, replicas int32) error {
 	var latestWatchedWork *workv1.ManifestWork
 	for i := len(watchedWorks) - 1; i >= 0; i-- {
 		if watchedWorks[i].Name == name {
 			latestWatchedWork = watchedWorks[i]
 			break
 		}
 	}
 
 	if latestWatchedWork == nil {
 		return fmt.Errorf("the work %s not watched", name)
 	}
 
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
-			}
+			for _, feedback := range manifest.StatusFeedbacks.Values {
+				if feedback.Name == "resource" {
+					var resource map[string]interface{}
+					if err := json.Unmarshal([]byte(feedback.Value.Raw), &resource); err != nil {
+						return fmt.Errorf("failed to unmarshal resource feedback: %v", err)
+					}
+					if status, ok := resource["status"].(map[string]interface{}); ok {
+						if readyReplicas, ok := status["readyReplicas"].(float64); ok && int32(readyReplicas) == replicas {
+							return nil
+						}
+					}
+				}
+			}
 		}
 	}
 
 	return fmt.Errorf("the expected replicas %d is not found from feedback", replicas)
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 86764b3 and dfdfec0.

📒 Files selected for processing (5)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/suite_test.go
  • cmd/maestro/server/grpc_broker.go
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
🔇 Additional comments (3)
pkg/client/cloudevents/codec_test.go (1)

1-489: LGTM! Comprehensive test coverage.

The test file provides thorough coverage of codec operations including creation, encoding, decoding, metadata reset, and roundtrip scenarios. All past review comments have been properly addressed.

test/e2e/pkg/sourceclient_test.go (2)

3-35: LGTM! Appropriate imports for the new test.

The new imports support JSON handling, slice utilities, Kubernetes apps/v1 types, runtime extensions, pointer helpers, and work client interfaces, all of which are used in the new dual-source monitoring test.


995-1048: LGTM! Well-structured read-only work constructor.

The function correctly creates a ManifestWork with read-only update strategy and feedback rules configured to monitor deployment status. This aligns well with the test objective of having two independent sources monitor the same workload without reconciliation conflicts.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
test/e2e/pkg/sourceclient_test.go (2)

639-639: Typo: "fist" should be "first".

The log message still contains "fist" instead of "first".

🔎 Proposed fix
-			By("create fist work client", func() {
+			By("create first work client", func() {

670-678: AfterEach may leave the first work orphaned if the test fails early.

The cleanup only deletes from secondSourceWorkClient. If the test fails before the "delete the first work" step (line 760), the first work will remain. Consider cleaning up both works in AfterEach with errors.IsNotFound checks to handle cases where the work was already deleted.

🔎 Proposed fix
 		AfterEach(func() {
+			// Clean up first work (ignore NotFound if already deleted during test)
+			_ = firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+
 			err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
-			Expect(err).ShouldNot(HaveOccurred())
+			if err != nil && !errors.IsNotFound(err) {
+				Expect(err).ShouldNot(HaveOccurred())
+			}
 
 			err = agentTestOpts.kubeClientSet.AppsV1().Deployments("default").Delete(ctx, deployName, metav1.DeleteOptions{})
 			Expect(err).To(Succeed())
🧹 Nitpick comments (3)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Consider using stdlib slices package instead of faker's slice utility.

The github.com/bxcodec/faker/v3/support/slice import is used only for slice.Contains. Since Go 1.21, the standard library provides slices.Contains which is more appropriate for this use case and avoids pulling in a test data generation library for a simple utility function.

🔎 Proposed fix
-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

And update usage at lines 721 and 725:

-	if !slice.Contains(appliedWorkNames, firstWorkUID) {
+	if !slices.Contains(appliedWorkNames, firstWorkUID) {
pkg/client/cloudevents/codec.go (1)

120-135: Consider documenting the mutation side effect.

The function mutates res.Payload in place, which modifies the caller's resource. While this is likely intentional for the encoding path, it would be helpful to document this behavior in a comment to prevent unexpected side effects if the resource is reused elsewhere.

🔎 Proposed documentation improvement
+// resetPayloadMetadataNameWithResID sets the payload metadata name to the resource ID.
+// This function mutates the resource's Payload in place.
 func resetPayloadMetadataNameWithResID(res *api.Resource) error {
pkg/client/cloudevents/codec_test.go (1)

309-316: Simplify error message prefix checking.

The current error checking logic is more complex than needed. Consider using strings.HasPrefix for cleaner code.

🔎 Proposed simplification
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
 				}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between dfdfec0 and 34204d9.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: upgrade
🔇 Additional comments (10)
test/e2e/pkg/suite_test.go (1)

16-25: LGTM - Import reorganization.

The import changes correctly add the grpc and klog/v2 packages with proper grouping. Both imports are used in the existing code (grpcConn variable and logging functions).

test/e2e/pkg/sourceclient_test.go (1)

995-1048: LGTM - Well-structured helper for read-only ManifestWork.

The NewReadonlyWork helper correctly constructs a read-only ManifestWork with proper feedback rules for monitoring deployment status. The test label enables filtering in the test assertions.

cmd/maestro/server/grpc_broker.go (3)

26-33: LGTM - Clean refactoring to codec-based approach.

The addition of the source constant and cloudevents import provides a clean, consistent approach for CloudEvent encoding/decoding. Using a constant ensures the source identifier is consistent across all usages.


294-297: LGTM - Simplified decode using codec.

The refactoring cleanly delegates CloudEvent decoding to the codec, which handles all extension extraction and validation internally.


300-308: LGTM - Encode refactoring with proper event type configuration.

The encode function properly configures the event type and delegates to the codec, which will now ensure the resource ID is used as the payload metadata name via resetPayloadMetadataNameWithResID.

pkg/client/cloudevents/codec.go (1)

35-42: LGTM - Proper integration of metadata name reset.

The call to resetPayloadMetadataNameWithResID is correctly placed before the CloudEvent conversion and handles errors appropriately. The comment clearly explains the purpose of ensuring unique payload metadata names on the agent side.

pkg/client/cloudevents/codec_test.go (4)

19-40: LGTM - Basic codec construction tests.

The tests properly validate codec initialization and event data type configuration.


42-209: LGTM - Comprehensive encode tests.

The test cases cover the key encoding scenarios: without metadata, with metadata (verifying the name reset to resource ID), and with deletion timestamp. The validation functions properly check extensions and event properties.


345-419: LGTM - Good test coverage for metadata reset helper.

The tests properly validate both paths: no-op when metadata is absent, and name reset when metadata exists. Error handling was properly addressed from previous review feedback.


421-491: LGTM - Valuable roundtrip integration test.

The test validates the complete encode-decode flow, simulating a realistic status update scenario. This ensures the codec maintains field integrity across the transformation.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Consider using a lighter dependency for slice utilities.

The faker library is primarily designed for generating fake data in tests. Using it solely for slice.Contains (lines 721, 725) introduces a heavyweight dependency for a simple utility function.

🔎 Recommended alternatives

Option 1 (if Go 1.21+): Use the standard slices package:

-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

Then replace slice.Contains(appliedWorkNames, firstWorkUID) with slices.Contains(appliedWorkNames, firstWorkUID).

Option 2: Implement a simple helper:

func contains(slice []string, item string) bool {
	for _, s := range slice {
		if s == item {
			return true
		}
	}
	return false
}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 34204d9 and f8ea7fb.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/maestro.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • test/e2e/pkg/suite_test.go
  • pkg/client/cloudevents/codec_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/api/resource_types.go (1)
  • Resource (14-27)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
🔇 Additional comments (9)
test/e2e/pkg/sourceclient_test.go (3)

922-949: Note: Fragile string matching for replica checks.

A previous review flagged the string-based JSON matching at line 942 as potentially producing false positives (e.g., replicas=1 matching readyReplicas\":10). This concern remains valid.

Consider properly unmarshaling the JSON or using more precise matching patterns to avoid false positives in replica verification.


590-797: LGTM: Well-designed multi-source test scenario.

The test correctly validates that two independent source clients with different source IDs can monitor the same workload. The design was clarified in previous reviews—each client's operations are automatically scoped by their respective sourceID, enabling independent lifecycle management.


995-1048: LGTM: Read-only ManifestWork helper is well-structured.

The NewReadonlyWork helper correctly constructs a ManifestWork with read-only update strategy and feedback configuration for deployment monitoring.

docs/resources/cloudevents.status.maestro.json (1)

144-144: LGTM: Documentation updated consistently with PR objective.

The metadata name field now reflects the resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), aligning with the PR's goal of using resource ID as the work metadata name. This change is consistent with updates in other documentation files.

docs/resources/cloudevents.spec.maestro.json (1)

77-77: LGTM: Documentation consistently updated.

The metadata name field correctly reflects the new resource ID-based naming approach, consistent with the PR objective and other documentation updates.

docs/resources/appliedmanifestwork.json (1)

14-14: LGTM: AppliedManifestWork documentation updated consistently.

Both the metadata name (line 14) and spec.manifestWorkName (line 21) correctly reference the new resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), demonstrating the consistent naming approach introduced in this PR.

Also applies to: 21-21

cmd/maestro/server/grpc_broker.go (3)

26-26: LGTM: Good addition of codec support.

The new import and source constant enable the codec-based refactoring below. The constant is appropriately scoped at the package level for use within this file.

Also applies to: 32-32


295-296: LGTM: Cleaner decoding with codec.

Refactoring to use codec.Decode(evt) simplifies the status decoding logic and improves maintainability by delegating CloudEvent parsing to the codec layer.


307-308: LGTM: Cleaner encoding with codec.

Refactoring to use codec.Encode(source, eventType, resource) improves maintainability by delegating CloudEvent construction to the codec layer, which now handles metadata name synchronization as per the PR objective.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
test/e2e/pkg/sourceclient_test.go (2)

639-639: Typo: "fist" should be "first".

The typo persists in the current code.

🔎 Proposed fix
-			By("create fist work client", func() {
+			By("create first work client", func() {

935-946: Fragile JSON string matching can cause false positives.

The current check strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) can match unintended values. For example, checking for replicas=1 would also match readyReplicas":10, readyReplicas":100, etc.

🔎 Proposed fix using proper JSON unmarshaling
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
+			for _, value := range manifest.StatusFeedbacks.Values {
+				if value.Name == "readyReplicas" && value.Value.Integer != nil {
+					if *value.Value.Integer == int64(replicas) {
+						return nil
+					}
+				}
 			}
 		}
 	}

Alternatively, if using raw JSON parsing, ensure the match is exact:

-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
+			// Match readyReplicas with exact value (followed by comma, brace, or whitespace)
+			pattern := fmt.Sprintf(`"readyReplicas":%d[,}\s]`, replicas)
+			matched, _ := regexp.MatchString(pattern, string(feedbackJson))
+			if matched {
🧹 Nitpick comments (2)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Unusual dependency: faker's slice package used only for Contains.

The faker library is primarily designed for generating fake test data, not for general utility functions. Using it just for slice.Contains (at lines 721, 725) introduces an unusual dependency.

Consider using slices.Contains from the standard library (slices package in Go 1.21+) or a simple loop:

🔎 Suggested alternatives

Option 1: Use standard library (Go 1.21+)

-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

Then at usage sites:

-				if !slice.Contains(appliedWorkNames, firstWorkUID) {
+				if !slices.Contains(appliedWorkNames, firstWorkUID) {

Option 2: Use a simple helper function

func contains(slice []string, item string) bool {
    for _, s := range slice {
        if s == item {
            return true
        }
    }
    return false
}
pkg/client/cloudevents/codec.go (1)

120-135: Implementation is sound with one technically redundant but harmless assignment.

The helper function correctly handles the no-op case, validates types, and sets the metadata name to the resource ID.

The assignment at line 133 is technically redundant: maps in Go are pointers to runtime.hmap structures, so modifying metaObj["name"] already updates the underlying map in res.Payload. The reassignment has no additional effect and can be safely removed.

Defensively verify that res.ID is never empty when this function is called, though in normal usage this is unlikely since Resources are database entities with IDs always populated.
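The map-reference behavior can be demonstrated standalone (the payload shape below is illustrative, modeled on a JSON-decoded manifest rather than the actual Resource type):

```go
package main

import "fmt"

func main() {
	payload := map[string]interface{}{
		"metadata": map[string]interface{}{"name": "original-name"},
	}

	// The type assertion yields the same underlying map, so a write
	// through metaObj is visible via payload without any reassignment.
	metaObj := payload["metadata"].(map[string]interface{})
	metaObj["name"] = "res-id-1234"

	fmt.Println(payload["metadata"].(map[string]interface{})["name"]) // res-id-1234
}
```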

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between f8ea7fb and 4ab0153.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • docs/maestro.md
  • test/e2e/pkg/suite_test.go
  • docs/resources/cloudevents.status.maestro.json
  • cmd/maestro/server/grpc_broker.go
  • docs/resources/appliedmanifestwork.json
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (24-28)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (1)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
pkg/client/cloudevents/codec.go (2)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (11-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
🔇 Additional comments (5)
docs/resources/cloudevents.spec.maestro.json (1)

77-77: LGTM! Documentation correctly reflects the new behavior.

The metadata.name field now matches the resourceid field (line 79), which correctly demonstrates the PR objective of using the resource ID as the work metadata name. The consistency across metadata.name, metadata.uid, and resourceid is appropriate.

test/e2e/pkg/sourceclient_test.go (2)

680-796: LGTM!

The test logic correctly validates that two independent source clients with different source IDs can operate on resources using the same work name, as each client's operations are internally scoped by their source ID. The verification of AppliedManifestWorks, replica status syncing, and independent deletion behavior is well-structured.


995-1048: LGTM!

The NewReadonlyWork helper is well-structured. It correctly:

  • Uses RawExtension.Object for type-safe manifest definition
  • Configures feedback rules to capture the full resource status via "@" JSONPath
  • Sets UpdateStrategyTypeReadOnly to monitor without modifying the deployment
pkg/client/cloudevents/codec.go (1)

35-39: LGTM! Clean integration of metadata name reset.

The pre-encode step to ensure payload metadata name matches the resource ID is well-implemented with proper error handling. This enforces uniqueness on the agent side as intended.

pkg/client/cloudevents/codec_test.go (1)

1-491: Excellent test coverage!

The test suite is comprehensive and well-structured:

  • ✅ Constructor and data type validation
  • ✅ Encoding scenarios (with/without metadata, deletion cases)
  • ✅ Metadata name reset verification (lines 138-144, 388-403)
  • ✅ Decoding with extensive error path coverage
  • ✅ Direct unit test of the resetPayloadMetadataNameWithResID helper
  • ✅ Full encode/decode roundtrip integration test

All previously identified issues from past reviews have been addressed (error handling, type assertions, validation logic). The tests provide strong confidence in the codec implementation.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/e2e/pkg/sourceclient_test.go (1)

922-949: LGTM with a note on string matching.

The AssertReplicas helper correctly searches for the latest watched work and validates the status feedback. The string matching approach on line 942 has been previously flagged as potentially fragile (could match readyReplicas":10 when checking for replicas=1), but the implementation works for the current test scenarios.

🧹 Nitpick comments (2)
test/e2e/pkg/sourceclient_test.go (1)

10-10: Consider using standard library or k8s utilities instead of faker/slice.

The slice.Contains function from the faker library (used on lines 721, 725) could be replaced with sets.Set from k8s.io/apimachinery/pkg/util/sets which is already imported and used elsewhere in the file, or a simple loop. This would reduce external dependencies.

🔎 Proposed refactor using sets.Set

Remove the import:

-	"github.com/bxcodec/faker/v3/support/slice"

Then refactor the checks on lines 721 and 725:

-					if !slice.Contains(appliedWorkNames, firstWorkUID) {
+					if !sets.New(appliedWorkNames...).Has(firstWorkUID) {
 						return fmt.Errorf("the first work %s is not found in %v", firstWorkUID, appliedWorkNames)
 					}

-					if !slice.Contains(appliedWorkNames, secondWorkUID) {
+					if !sets.New(appliedWorkNames...).Has(secondWorkUID) {
 						return fmt.Errorf("the second work %s is not found in %v", secondWorkUID, appliedWorkNames)
 					}
cmd/maestro/server/grpc_broker.go (1)

294-297: Consider using a package-level codec instance.

The function creates a new codec instance on every call. If the codec is stateless and lightweight, this is acceptable. However, if codec initialization has any overhead, consider creating a package-level codec instance to improve performance:

var codec = cloudevents.NewCodec(source)

Then reuse it in both decodeResourceStatus and encodeResourceSpec.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab0153 and b03b4c1.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • cmd/maestro/server/grpc_broker.go (2 hunks)
  • docs/maestro.md (1 hunks)
  • docs/resources/appliedmanifestwork.json (1 hunks)
  • docs/resources/cloudevents.spec.maestro.json (1 hunks)
  • docs/resources/cloudevents.status.maestro.json (1 hunks)
  • pkg/client/cloudevents/codec.go (3 hunks)
  • pkg/client/cloudevents/codec_test.go (1 hunks)
  • test/e2e/pkg/sourceclient_test.go (4 hunks)
  • test/e2e/pkg/suite_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • test/e2e/pkg/suite_test.go
  • pkg/client/cloudevents/codec.go
  • docs/maestro.md
  • docs/resources/appliedmanifestwork.json
  • docs/resources/cloudevents.spec.maestro.json
  • pkg/client/cloudevents/codec_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-64)
test/performance/pkg/util/util.go (1)
  • Eventually (30-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
  • GitHub Check: e2e-broadcast-subscription
🔇 Additional comments (5)
docs/resources/cloudevents.status.maestro.json (2)

144-144: LGTM! Metadata name correctly aligned with resource ID.

The metadata.name field now matches the resourceid (line 146) and metadata.uid, which correctly demonstrates the PR objective of using the resource ID as the work metadata name.


148-149: LGTM! New tracking fields added.

The addition of sequenceid and statushash fields enhances the CloudEvent status object with sequence tracking and status integrity verification capabilities.

test/e2e/pkg/sourceclient_test.go (1)

995-1048: LGTM! Well-structured ReadOnly work configuration.

The NewReadonlyWork helper correctly constructs a ManifestWork with:

  • Proper TypeMeta and ObjectMeta for the Deployment
  • ReadOnly update strategy to prevent modifications
  • Feedback rules configured to capture the full resource status via JSONPath "@"
  • Appropriate labels for test filtering

This configuration aligns well with the test objective of monitoring deployment status changes without modifying the deployment through the manifestwork.

cmd/maestro/server/grpc_broker.go (2)

32-32: LGTM: Good use of constant for source identifier.

Centralizing the source string as a package-level constant improves maintainability and consistency.


300-309: Resource ID metadata mapping is properly implemented; revise source parameter assessment.

The codec's Encode method correctly handles the PR objective. Lines 35-39 call resetPayloadMetadataNameWithResID(), which properly sets payload.metadata.name to the resource ID (line 132 in codec.go).

However, the source parameter usage is not redundant: The sourceID passed to NewCodec() is stored for use in the Decode() method (line 104 validates against codec.sourceID), while the source parameter in Encode() sets the actual event source (line 48). These serve different purposes and both are necessary.

The optional performance consideration about creating a new codec instance on each call remains valid but is low-priority given the codec's lightweight nature (single string field).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
pkg/client/cloudevents/codec_test.go (3)

46-53: Consider adding error test cases for Encode.

The test structure includes expectedErrorMsg field but no test cases exercise error paths. Consider adding cases for nil resource, nil payload, or invalid data to improve error path coverage.


302-315: Simplify error message comparison logic.

The current prefix-matching logic is verbose and harder to maintain. Consider using strings.HasPrefix directly.

Suggested simplification
-			if c.expectedErrorMsg != "" {
-				if err == nil {
-					t.Errorf("expected error but got nil")
-					return
-				}
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
-				return
-			}
+			if c.expectedErrorMsg != "" {
+				if err == nil {
+					t.Errorf("expected error but got nil")
+					return
+				}
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+				}
+				return
+			}

344-418: Good coverage of resetPayloadMetadataNameWithResID helper.

The tests verify both the no-op case (no metadata) and the reset case (metadata with original name). Consider adding a test case for when metadata exists but is not a map[string]interface{} to verify error handling.

test/e2e/pkg/sourceclient_test.go (2)

762-800: Good multi-client isolation verification, minor cleanup suggestion.

The test correctly verifies that deleting one work doesn't affect the other and that the deployment is not recreated.

However, the Eventually loop at line 763 wraps the delete call, which will succeed immediately. The retry is only needed for the subsequent verifications. Consider restructuring for clarity:

Consider separating delete from verification
 By("delete the first work", func() {
+    err := firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{})
+    Expect(err).ShouldNot(HaveOccurred())
+
     Eventually(func() error {
-        // delete one, the other should be not be changed
-        if err := firstSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Delete(ctx, work.Name, metav1.DeleteOptions{}); err != nil {
-            return err
-        }
-
         if _, err := secondSourceWorkClient.ManifestWorks(agentTestOpts.consumerName).Get(ctx, work.Name, metav1.GetOptions{}); err != nil {
             return err
         }
         // ... rest of verification
     }, 1*time.Minute, 1*time.Second).ShouldNot(HaveOccurred())
 })

10-10: Replace faker's slice utility with standard library slices.Contains().

The codebase already uses slices.Contains from the standard library elsewhere (test/factories.go). Replace the github.com/bxcodec/faker/v3/support/slice import with the stdlib slices package at lines 726 and 730 for consistency.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9fc6e03 and 96d52e5.

📒 Files selected for processing (8)
  • cmd/maestro/server/grpc_broker.go
  • docs/maestro.md
  • docs/resources/appliedmanifestwork.json
  • docs/resources/cloudevents.spec.maestro.json
  • docs/resources/cloudevents.status.maestro.json
  • pkg/client/cloudevents/codec.go
  • pkg/client/cloudevents/codec_test.go
  • test/e2e/pkg/sourceclient_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/client/cloudevents/codec.go
  • docs/resources/cloudevents.spec.maestro.json
  • docs/maestro.md
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (23-27)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (23-27)
pkg/api/resource_types.go (1)
  • Resource (14-27)
test/e2e/pkg/sourceclient_test.go (1)
test/upgrade/pkg/upgrade_test.go (1)
  • AssertReplicas (317-332)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: upgrade
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
  • GitHub Check: verify
🔇 Additional comments (10)
docs/resources/cloudevents.status.maestro.json (1)

144-149: LGTM - Documentation correctly reflects resource ID as metadata name.

The metadata name now matches the resourceid field (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), which aligns with the PR objective. The new sequenceid and statushash fields are correctly added as CloudEvent extensions.

docs/resources/appliedmanifestwork.json (1)

14-21: LGTM - AppliedManifestWork naming updated consistently.

The manifestWorkName now uses the resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), and the metadata.name correctly follows the pattern {agentID}-{manifestWorkName}. This is consistent with the CloudEvents documentation changes.

cmd/maestro/server/grpc_broker.go (3)

26-33: Good refactor - Centralized source constant and codec import.

Using a package-level constant for the source identifier and importing the codec package improves maintainability and consistency.


294-297: Clean refactor to codec-based decoding.

Delegating to the codec simplifies the function and centralizes CloudEvent handling logic.


299-308: LGTM - Codec-based encoding with proper metadata name reset.

The refactor correctly uses the codec for encoding, which internally resets the payload metadata name to the resource ID before constructing the CloudEvent.

pkg/client/cloudevents/codec_test.go (2)

18-29: LGTM - Basic codec constructor test.

Properly verifies codec creation and sourceID assignment.


420-490: LGTM - Comprehensive roundtrip test.

The test effectively validates that resource identity fields (ID, Version, ConsumerName) are preserved through the encode/decode cycle, simulating the real-world flow of spec → status update.

test/e2e/pkg/sourceclient_test.go (3)

5-10: New imports for extended e2e test coverage.

The added imports support JSON marshaling for status feedback parsing, slice utilities for containment checks, and runtime for RawExtension.


927-954: AssertReplicas helper follows existing pattern.

The implementation mirrors the existing AssertReplicas in test/upgrade/pkg/upgrade_test.go but extends it to handle a slice of watched works by finding the latest matching work. The JSON string matching approach, while fragile, is consistent with the existing codebase pattern.


1000-1053: Well-designed NewReadonlyWork helper.

The helper creates a ManifestWork with ReadOnly update strategy, which is appropriate for monitoring an existing deployment without modifying it. The feedback rule with path @ captures the full resource status.

Copy link

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @test/e2e/pkg/sourceclient_test.go:
- Line 637: Fix the typo in the test description: change the string literal used
in the Ginkgo By() call from "create fist work client" to "create first work
client" (update the By(...) invocation in the test function in
sourceclient_test.go).
🧹 Nitpick comments (4)
test/e2e/pkg/sourceclient_test.go (2)

10-10: Consider using a standard library alternative for slice utilities.

The github.com/bxcodec/faker/v3/support/slice package is imported only for slice.Contains. Using slices.Contains from Go's standard library (Go 1.21+) or a simple helper function would reduce dependencies on a faker library's internal utilities.

♻️ Proposed refactor using standard library
-	"github.com/bxcodec/faker/v3/support/slice"
+	"slices"

Then replace slice.Contains(appliedWorkNames, firstWorkUID) with slices.Contains(appliedWorkNames, firstWorkUID).


943-956: Consider structured JSON parsing for more robust replica assertions.

The current approach uses string matching on marshaled JSON (readyReplicas\":%d), which could be fragile if the JSON structure changes. Consider unmarshaling to a structured type or using JSONPath extraction for reliability.

♻️ Alternative using structured parsing
// Instead of string matching, parse the feedback values
for _, fb := range manifest.StatusFeedbacks.Values {
    if fb.Name == "readyReplicas" && fb.Value.Integer != nil {
        if *fb.Value.Integer == int64(replicas) {
            return nil
        }
    }
}
pkg/client/cloudevents/codec_test.go (2)

307-315: Simplify error message comparison logic.

The current error comparison logic (length check followed by prefix extraction) is convoluted and could be replaced with strings.HasPrefix for clarity.

♻️ Simplified error comparison
-			// Use Contains check for error messages since some include dynamic IDs
-			if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-				if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-				}
-			} else {
-				t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
-			}
+			if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+				t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
+			}

344-418: Consider adding test case for invalid metadata type.

The resetPayloadMetadataNameWithResID function has an error path when metadata is not map[string]interface{}. Adding a test case for this scenario would improve coverage:

➕ Additional test case
{
    name: "resource with invalid metadata type",
    resource: &api.Resource{
        Meta: api.Meta{
            ID: resourceID,
        },
        Payload: datatypes.JSONMap{
            cetypes.ExtensionWorkMeta: "invalid-string-type",
        },
    },
    expectError: true,
},

You'll need to modify the test structure to handle expected errors.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 96d52e5 and 792aee5.

📒 Files selected for processing (8)
  • cmd/maestro/server/grpc_broker.go
  • docs/maestro.md
  • docs/resources/appliedmanifestwork.json
  • docs/resources/cloudevents.spec.maestro.json
  • docs/resources/cloudevents.status.maestro.json
  • pkg/client/cloudevents/codec.go
  • pkg/client/cloudevents/codec_test.go
  • test/e2e/pkg/sourceclient_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/maestro.md
  • cmd/maestro/server/grpc_broker.go
  • docs/resources/cloudevents.status.maestro.json
  • docs/resources/cloudevents.spec.maestro.json
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/client/cloudevents/codec.go (2)
pkg/db/migrations/202311151850_add_resources.go (1)
  • Resource (10-24)
pkg/api/resource_types.go (1)
  • Resource (14-27)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (23-27)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
🔇 Additional comments (9)
docs/resources/appliedmanifestwork.json (1)

14-21: LGTM!

The example JSON correctly reflects the new naming convention where the resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb) is used as the manifestWorkName and as the suffix in metadata.name. The metadata.name follows the expected format <agentID>-<resourceID>.

pkg/client/cloudevents/codec.go (2)

119-134: LGTM!

The helper function correctly handles:

  1. Missing metadata (no-op, allowing agent to use resource ID)
  2. Invalid metadata type (returns descriptive error)
  3. Setting/overwriting the name field with the resource ID

The logic aligns with the PR objective of ensuring unique payload metadata names on the agent side.


33-44: No changes needed. The Encode function's mutation of res.Payload is intentional and part of the documented contract: it ensures the payload metadata name uses the resource ID for uniqueness on the agent side. The tests explicitly validate this behavior, and the callers in grpc_broker.go fetch resources solely for encoding without reusing them afterward, so the mutation is safe and acceptable.

test/e2e/pkg/sourceclient_test.go (3)

668-686: LGTM!

The cleanup logic properly handles the case where resources may have already been deleted by one client, ignoring NotFound errors appropriately.


688-804: Well-structured E2E test for multi-source scenario.

The test thoroughly validates:

  1. Two independent ManifestWorks can be created for the same workload
  2. Both watchers receive status updates
  3. Deleting one work doesn't affect the other
  4. The underlying deployment is not recreated

The assertions on deployment UID stability (lines 797-799) are particularly valuable for ensuring read-only semantics are maintained.


1003-1056: LGTM!

The NewReadonlyWork helper is well-constructed:

  • Uses UpdateStrategyTypeReadOnly to prevent mutations
  • Configures JSONPath feedback to capture the entire deployment resource
  • Includes appropriate test labels for filtering
pkg/client/cloudevents/codec_test.go (3)

18-39: LGTM!

Basic unit tests correctly verify the Codec constructor and EventDataType method.


41-208: LGTM!

The encode tests cover key scenarios:

  • Resource without metadata (extension verification)
  • Resource with metadata (name reset verification)
  • Resource with deletion timestamp (deletion event handling)

The table-driven approach with custom validation functions provides good flexibility.


420-490: LGTM!

The roundtrip test effectively validates the encode/decode cycle by:

  1. Encoding a spec resource
  2. Simulating an agent status response by modifying the event type and data
  3. Decoding and verifying field preservation

This provides good confidence in the codec's end-to-end behavior.

Signed-off-by: Wei Liu <liuweixa@redhat.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
cmd/maestro/server/grpc_broker.go (1)

294-297: Consider reusing the codec instance instead of creating one per call.

The codec is instantiated on every decodeResourceStatus call, but since the source is a package-level constant, the codec could be created once at the package level or as part of the GRPCBrokerService struct for better efficiency.

♻️ Suggested refactor
 const source = "maestro"
+var codec = cloudevents.NewCodec(source)

 // decodeResourceStatus translates a CloudEvent into a resource containing the status JSON map.
 func decodeResourceStatus(evt *ce.Event) (*api.Resource, error) {
-	codec := cloudevents.NewCodec(source)
 	return codec.Decode(evt)
 }

 // encodeResourceSpec translates a resource spec JSON map into a CloudEvent.
 func encodeResourceSpec(resource *api.Resource) (*ce.Event, error) {
 	eventType := types.CloudEventsType{
 		CloudEventsDataType: workpayload.ManifestBundleEventDataType,
 		SubResource:         types.SubResourceSpec,
 		Action:              types.EventAction("create_request"),
 	}

-	codec := cloudevents.NewCodec(source)
 	return codec.Encode(source, eventType, resource)
 }
test/e2e/pkg/sourceclient_test.go (1)

943-956: String-based JSON matching is fragile and could produce false positives.

The check strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) could incorrectly match when searching for 1 if readyReplicas is 10, 11, etc.

♻️ Suggested fix using proper JSON unmarshaling
 	for _, manifest := range latestWatchedWork.Status.ResourceStatus.Manifests {
 		if meta.IsStatusConditionTrue(manifest.Conditions, "StatusFeedbackSynced") {
-			feedbackJson, err := json.Marshal(manifest.StatusFeedbacks)
-			if err != nil {
-				return err
-			}
-
-			if strings.Contains(string(feedbackJson), fmt.Sprintf(`readyReplicas\":%d`, replicas)) {
-				return nil
+			for _, feedback := range manifest.StatusFeedbacks.Values {
+				if feedback.Name == "readyReplicas" && feedback.Value.Integer != nil {
+					if *feedback.Value.Integer == int64(replicas) {
+						return nil
+					}
+				}
 			}
 		}
 	}
pkg/client/cloudevents/codec_test.go (1)

302-316: The error message comparison logic is overly complex and potentially incorrect.

The condition at line 308 (len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg)) is redundant since c.expectedErrorMsg is already checked to be non-empty at line 302, and the prefix check should work regardless of length comparison.

♻️ Suggested simplification using strings.HasPrefix
 			if c.expectedErrorMsg != "" {
 				if err == nil {
 					t.Errorf("expected error but got nil")
 					return
 				}
-				// Use Contains check for error messages since some include dynamic IDs
-				if len(c.expectedErrorMsg) > 0 && len(err.Error()) >= len(c.expectedErrorMsg) {
-					if err.Error()[:len(c.expectedErrorMsg)] != c.expectedErrorMsg {
-						t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
-					}
-				} else {
-					t.Errorf("expected error %q but got: %q", c.expectedErrorMsg, err.Error())
+				if !strings.HasPrefix(err.Error(), c.expectedErrorMsg) {
+					t.Errorf("expected error to start with %q but got: %q", c.expectedErrorMsg, err.Error())
 				}
 				return
 			}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 792aee5 and 32f7a51.

📒 Files selected for processing (8)
  • cmd/maestro/server/grpc_broker.go
  • docs/maestro.md
  • docs/resources/appliedmanifestwork.json
  • docs/resources/cloudevents.spec.maestro.json
  • docs/resources/cloudevents.status.maestro.json
  • pkg/client/cloudevents/codec.go
  • pkg/client/cloudevents/codec_test.go
  • test/e2e/pkg/sourceclient_test.go
✅ Files skipped from review due to trivial changes (1)
  • docs/maestro.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/client/cloudevents/codec.go
  • docs/resources/cloudevents.status.maestro.json
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/client/cloudevents/codec_test.go (3)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (23-27)
pkg/errors/errors.go (1)
  • New (102-116)
pkg/api/metadata_types.go (1)
  • Meta (51-56)
test/e2e/pkg/sourceclient_test.go (2)
pkg/client/cloudevents/grpcsource/client.go (1)
  • NewMaestroGRPCSourceWorkClient (19-63)
test/performance/pkg/util/util.go (1)
  • Eventually (31-50)
cmd/maestro/server/grpc_broker.go (2)
pkg/client/cloudevents/codec.go (1)
  • NewCodec (23-27)
pkg/api/resource_types.go (1)
  • Resource (14-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
🔇 Additional comments (13)
docs/resources/appliedmanifestwork.json (1)

14-21: LGTM!

The documentation correctly reflects the new naming convention where spec.manifestWorkName uses the resource ID (55c61e54-a3f6-563d-9fec-b1fe297bdfdb), and metadata.name follows the expected format of hubHash-manifestWorkName.

docs/resources/cloudevents.spec.maestro.json (1)

77-79: LGTM!

The CloudEvent spec documentation correctly demonstrates the alignment where the embedded metadata.name (55c61e54-a3f6-563d-9fec-b1fe297bdfdb) matches the resourceid extension, consistent with the new codec behavior that resets payload metadata name to the resource ID.

cmd/maestro/server/grpc_broker.go (1)

300-308: LGTM!

The encodeResourceSpec function correctly constructs the CloudEvents type and delegates encoding to the codec. The event type setup with ManifestBundleEventDataType, SubResourceSpec, and create_request action is appropriate for the spec encoding use case.

test/e2e/pkg/sourceclient_test.go (5)

5-28: LGTM!

The new imports are appropriate for the added test functionality: encoding/json for deployment unmarshaling, appsv1 for deployment types, runtime for RawExtension, ptr for pointer utilities, and workv1client for the typed work client interface.


606-666: LGTM!

The test setup is well-structured:

  • Creates unique source IDs to ensure test isolation
  • Properly configures gRPC auth rules before creating source clients
  • Creates the deployment on the agent side first (with 0 replicas) to enable ReadOnly monitoring
  • Both source clients are set up with independent watchers

668-686: LGTM!

The cleanup logic correctly handles the case where works may have already been deleted during the test by ignoring NotFound errors, and properly cleans up all resources including the deployment and watch context.


688-804: LGTM!

This test thoroughly validates the dual-source client scenario:

  1. Creates works from two independent sources targeting the same workload
  2. Verifies both AppliedManifestWorks are created with distinct UIDs
  3. Validates status synchronization to both watchers after replica update
  4. Confirms deleting one work doesn't affect the other or recreate the deployment

The test properly exercises the new codec-based encode/decode flow with real-world scenarios.


1003-1056: LGTM!

The NewReadonlyWork helper correctly creates a ManifestWork with:

  • ReadOnly update strategy for monitoring existing resources without modification
  • Feedback rules to capture the entire resource status via @ path
  • Appropriate label (maestro.e2e.test.name=monitor) for test filtering
pkg/client/cloudevents/codec_test.go (5)

1-16: LGTM!

Good imports and package structure for the test file. The imports cover all necessary dependencies for CloudEvents, UUID generation, and the internal API types.


18-29: LGTM!

Basic constructor test validates the codec is properly initialized with the provided source ID.


41-208: LGTM!

Comprehensive encode test coverage with three well-chosen cases:

  1. Basic encoding without metadata - validates core extensions
  2. Encoding with existing metadata - validates the name reset to resource ID
  3. Encoding with deletion timestamp - validates deletion handling

The table-driven approach is clean and maintainable.


344-418: LGTM!

Good unit test coverage for the resetPayloadMetadataNameWithResID helper function, validating both cases: resources without metadata remain unchanged, and resources with metadata have their name field reset to the resource ID.


420-490: LGTM!

Excellent roundtrip test that validates the encode/decode cycle by:

  1. Encoding a resource as a spec event
  2. Modifying it to simulate a status update response
  3. Decoding and verifying key fields are preserved

This provides confidence in the codec's end-to-end correctness.

@skeeey
Contributor Author

skeeey commented Jan 13, 2026

/unhold

@skeeey
Contributor Author

skeeey commented Jan 13, 2026

/cc @qiujian16 @clyang82 @machi1990

Contributor

@clyang82 clyang82 left a comment


LGTM

@clyang82 clyang82 merged commit d6b01ad into openshift-online:main Jan 14, 2026
12 checks passed
@skeeey skeeey deleted the res_id branch January 14, 2026 09:27
