Skip to content

add e2e test for pubsub.#454

Merged
clyang82 merged 3 commits intoopenshift-online:mainfrom
morvencao:br_add_pubsub_e2e
Jan 14, 2026
Merged

add e2e test for pubsub.#454
clyang82 merged 3 commits intoopenshift-online:mainfrom
morvencao:br_add_pubsub_e2e

Conversation

@morvencao
Copy link
Contributor

@coderabbitai
Copy link

coderabbitai bot commented Dec 25, 2025

Walkthrough

Adds Pub/Sub support across CI, tests, deployment scripts, and Helm charts: new e2e CI job for pubsub, test skip for missing CA secrets, Pub/Sub client and emulator Helm templates/values/init jobs, and test deploy script updates to inject Pub/Sub configuration and conditional rollout handling.

Changes

Cohort / File(s) Summary
CI/CD Workflow
.github/workflows/e2e.yml
Adds e2e-pubsub job mirroring existing e2e steps but sets MESSAGE_DRIVER_TYPE=pubsub, SERVER_REPLICAS=2, ENABLE_MAESTRO_TLS=true, and uses Docker in the Test E2E step.
E2E Tests (cert rotation)
test/e2e/pkg/cert_rotation_test.go
Adds skip logic: BeforeAll checks for maestro-mqtt-ca and maestro-grpc-broker-ca secrets and calls Skip if both missing; AfterAll bypasses restoration/cleanup when skipped.
E2E Tests (status resync)
test/e2e/pkg/status_resync_test.go
Removes an AfterAll sequence that created/deleted a manifest work and asserted not-found (removes those connectivity verification steps).
Test Deploy Scripts
test/setup/deploy_agent.sh, test/setup/deploy_server.sh
Injects Pub/Sub client config into Helm values (projectID, endpoint, disableTLS, topics, subscriptions) and adds pubsubEmulator settings; conditional rollout logic: use kubectl rollout restart + kubectl rollout status when Pub/Sub mode is active.
Maestro Agent Helm
charts/maestro-agent/templates/pubsub-init-hook.yaml, charts/maestro-agent/templates/secret.yaml, charts/maestro-agent/values.yaml
Adds conditional init Job to create Pub/Sub subscriptions for a consumer; secret template optionally emits endpoint and always emits disableTLS; values.yaml adds endpoint and disableTLS keys.
Maestro Server Helm
charts/maestro-server/templates/pubsub-emulator.yaml, charts/maestro-server/templates/pubsub-init-hook.yaml, charts/maestro-server/templates/pubsub.yaml, charts/maestro-server/values.yaml
Adds conditional Pub/Sub emulator Service/Deployment and init Job to create topics/subscriptions; server config conditionally includes endpoint and always includes disableTLS; values.yaml adds emulator and Pub/Sub client fields.

Sequence Diagram(s)

sequenceDiagram
  actor Operator
  participant Helm
  participant Kubernetes
  participant PubSubEmulator
  participant InitJob
  participant Maestro

  Operator->>Helm: helm install/upgrade (values include pubsub & emulator)
  Helm->>Kubernetes: create Service, Deployment, Secrets, Hook Job
  Kubernetes->>PubSubEmulator: start emulator Deployment (if enabled)
  Kubernetes->>InitJob: run pubsub-init-hook Job
  InitJob->>PubSubEmulator: create topics & subscriptions via client
  InitJob-->>Kubernetes: Job completes
  Kubernetes->>Maestro: deploy Maestro components with pubsub config (endpoint/disableTLS)
  Maestro->>PubSubEmulator: connect, publish & subscribe to topics
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title 'add e2e test for pubsub' directly matches the main objective of adding end-to-end tests for Pub/Sub message broker functionality across the codebase.
Description check ✅ Passed The description mentions adding an end-to-end test for Pub/Sub and references a dependency, which aligns with the changes that add pubsub e2e test infrastructure and configuration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 5336f1b and b0f9a73.

📒 Files selected for processing (1)
  • test/e2e/pkg/status_resync_test.go
💤 Files with no reviewable changes (1)
  • test/e2e/pkg/status_resync_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-broadcast-subscription

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
go.mod (1)

209-210: Establish a migration plan for the fork dependency.

Using a personal fork (morvencao/ocm-sdk-go) as a replace directive creates a temporal dependency that requires clear documentation and a removal plan. While the fork is currently accessible and maintained, this pattern is not sustainable long-term—it ties the build to an external personal account and PR status.

PR #182 ("support test mode for pubsub driver") is currently open (updated Dec 25). Ensure:

  1. Once upstream PR #182 is merged, immediately remove this replace directive and update the require statement on line 60 to reference the merged version.
  2. Document this temporary workaround in the code or PR description with the expected timeline for removal (tied to PR #182's merge).
  3. Add a tracking issue or comment linking this dependency to PR #182 so the team knows when the replacement can be removed.
🧹 Nitpick comments (2)
test/e2e/pkg/cert_rotation_test.go (1)

80-83: Clarify the assertion failure message.

The message "no CA secrets found; certificate rotation did not run" is misleading at this point in the code. If execution reaches line 83, it means BeforeAll did not skip, so at least one CA secret exists. A false rotated value would indicate rotation failed for a different reason (e.g., invalid secret data).

🔎 Suggested improvement
-Expect(rotated).To(BeTrue(), "no CA secrets found; certificate rotation did not run")
+Expect(rotated).To(BeTrue(), "expected at least one certificate to be rotated")
test/setup/env_setup.sh (1)

193-222: LGTM with a minor observation about the sleep.

The Pub/Sub emulator setup follows the established pattern for other message brokers. The sleep 5 on line 205 is a pragmatic approach for waiting for emulator readiness, though in slower CI environments this might occasionally be insufficient.

If flakiness is observed in CI, consider replacing the fixed sleep with a retry loop that checks the emulator's health, for example by attempting a simple Pub/Sub API call until it succeeds.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 0adb891 and 6a0704e.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • go.mod
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: upgrade
  • GitHub Check: e2e
🔇 Additional comments (23)
test/e2e/pkg/cert_rotation_test.go (2)

32-32: LGTM: Skip flag declaration.

The skip flag is appropriately used to track whether certificate rotation tests should be skipped when CA secrets are absent.


34-52: LGTM: Conditional test execution based on CA secret presence.

The logic correctly handles three scenarios:

  • Both CA secrets missing → skip (Pub/Sub emulator scenario)
  • At least one CA secret present → continue (MQTT/gRPC scenario)
  • Unexpected errors → fail

This aligns well with the PR objective of supporting Pub/Sub as an alternative message driver.

.gitignore (1)

50-50: LGTM!

The new ignore rule for secrets/pubsub.config follows the existing pattern for other secret configuration files in this directory.

README.md (3)

45-79: LGTM!

The documentation clearly explains both MQTT and Pub/Sub setup paths, maintains consistency with existing style, and properly notes the Python package dependency for Pub/Sub initialization.


114-122: LGTM!

The Pub/Sub runtime instructions are clear and follow the same pattern as the MQTT instructions.


361-366: LGTM!

The KinD cluster instructions for Pub/Sub are concise and align with the MESSAGE_DRIVER_TYPE environment variable used throughout the codebase.

Makefile (2)

65-70: LGTM!

The Pub/Sub configuration variables follow the same pattern as the MQTT configuration variables, with sensible defaults for local development.


446-461: Pub/Sub lifecycle targets look good overall.

The targets follow a similar lifecycle pattern to the MQTT targets. One minor observation: the pubsub/init target relies on python3 and the google-cloud-pubsub package being available in the local environment, which is documented in the README. The emulator image (gcr.io/google.com/cloudsdktool/google-cloud-cli:emulators) is accessible and available.

hack/init-pubsub-emulator.py (4)

1-11: LGTM!

The script is well-documented with clear purpose and environment variable usage.


19-66: LGTM!

The server topics and subscriptions initialization logic is well-structured with proper error handling. The broad Exception catches (flagged by static analysis) are appropriate here since this is an initialization script that needs to handle various potential API errors from the Pub/Sub client library, and the code properly logs the error and returns False to signal failure.


69-117: LGTM!

The agent subscription initialization follows the same pattern as the server initialization with consistent error handling.


120-147: LGTM!

The main function properly handles environment variables with sensible defaults and exits with appropriate codes on failure. The emulator_host variable is correctly used for logging purposes (the library reads PUBSUB_EMULATOR_HOST environment variable directly).

templates/service-template.yml (1)

219-222: LGTM!

The dynamic secret name maestro-${MESSAGE_DRIVER_TYPE} enables the template to work with different message drivers. Making the secret optional: true is appropriate since not all message driver configurations require a secret (e.g., gRPC may use different authentication mechanisms).

templates/service-tls-template.yml (1)

275-278: LGTM!

The change is consistent with the non-TLS service template, enabling dynamic message driver configuration.

templates/README.md (1)

33-56: Documentation references are accurate; both GCP templates exist.

The referenced GCP templates (service-template-gcp.yml and agent-template-gcp.yml) are present in the repository. The Pub/Sub emulator documentation is comprehensive and correct.

.github/workflows/e2e.yml (1)

119-145: LGTM! Pub/Sub e2e job follows existing patterns.

The new e2e-pubsub job is well-structured and consistent with the existing e2e jobs. The environment variables (MESSAGE_DRIVER_TYPE=pubsub, SERVER_REPLICAS=2, ENABLE_MAESTRO_TLS=true) are appropriate for testing Pub/Sub message driver integration.

templates/agent-template.yml (2)

72-85: LGTM! Pub/Sub parameters properly defined.

The new parameters (PUBSUB_HOST, PUBSUB_PORT, PUBSUB_PROJECT_ID) follow the existing pattern for message driver configuration and have sensible defaults for the emulator environment.


367-381: Clarify that this template is for development/testing with the Pub/Sub emulator, not production.

The insecure: true flag is appropriate for this template since it's designed to work with the Pub/Sub emulator (which runs on localhost:8085 without TLS). However, the template documentation should make clear that agent-template.yml is for local development and e2e testing. For production GCP deployments, use agent-template-gcp.yml instead, which does not include the insecure flag and is intended for actual GCP Pub/Sub endpoints.

test/setup/deploy_agent.sh (2)

49-51: LGTM! Pub/Sub configuration variables properly initialized.

The variables are appropriately scoped and use consistent naming with the template parameters.


55-78: [rewritten review comment]
[classification tag]

templates/agent-tls-template.yml (2)

72-85: LGTM! Pub/Sub parameters consistent with agent-template.yml.

The parameters are correctly defined and match the non-TLS variant, maintaining consistency across agent templates.


383-397: The template is explicitly designed for Pub/Sub emulator testing and development, not production deployments. Per the templates README, production GCP deployments use the separate agent-template-gcp.yml template. The insecure: true flag is appropriate for this template's intended scope. No changes needed.

Likely an incorrect or invalid review comment.

templates/pubsub-template.yml (1)

102-116: The Pub/Sub topic and subscription naming is already consistent across all templates and initialization scripts. No inconsistencies were found.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
hack/init-pubsub-emulator.py (1)

35-37: Consider more specific exception handling.

The static analysis tool flags catching bare Exception as a code smell. For production code, catching specific exception types (e.g., exceptions.GoogleAPICallError) would be preferable. However, for an initialization script that logs errors and exits, the current approach is pragmatic and acceptable.

If you want to be more specific, consider:

🔎 More specific exception handling
         except exceptions.AlreadyExists:
             print(f"  - Topic already exists: {topic_name}")
-        except Exception as e:
+        except exceptions.GoogleAPICallError as e:
             print(f"  ✗ Error creating topic {topic_name}: {e}", file=sys.stderr)
             return False

This pattern would apply to lines 62, 113, and 145 as well. However, the current implementation is acceptable for an init script.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 6a0704e and 8bbc798.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • go.mod
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (9)
  • Makefile
  • README.md
  • test/setup/deploy_agent.sh
  • go.mod
  • templates/agent-tls-template.yml
  • templates/README.md
  • templates/agent-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/service-tls-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-pubsub
🔇 Additional comments (9)
.gitignore (1)

50-50: ✓ Looks good!

The new ignore rule for secrets/pubsub.config follows the existing pattern for secret files and correctly prevents Pub/Sub configuration from being accidentally committed to the repository.

test/e2e/pkg/cert_rotation_test.go (2)

123-125: LGTM! Correctly addresses the past review comment.

The implementation now uses an early return in AfterAll when the tests are skipped, which is the idiomatic pattern for conditional cleanup in Ginkgo lifecycle hooks. The Skip() call in BeforeAll (line 43) is appropriate for skipping test specs.


34-52: Well-designed skip mechanism for Pub/Sub compatibility.

The conditional skip logic properly handles the case where certificate rotation tests are not applicable (e.g., when using Pub/Sub emulator). The implementation:

  • Checks for both MQTT and gRPC CA secrets
  • Skips only when both are absent
  • Maintains proper error handling for unexpected failures
hack/init-pubsub-emulator.py (1)

19-147: Well-structured initialization script.

The script is clearly organized with:

  • Separate functions for server and agent initialization
  • Proper idempotency via AlreadyExists exception handling
  • Clear success/failure reporting with exit codes
  • Environment variable configuration with sensible defaults
test/setup/env_setup.sh (1)

193-222: LGTM! Pub/Sub setup follows established patterns.

The Pub/Sub emulator setup is well-integrated and follows the same pattern as the existing MQTT and gRPC broker setup:

  1. Deploy the emulator
  2. Wait for availability
  3. Initialize topics/subscriptions via a templated job
  4. Wait for job completion
  5. Clean up the job

The implementation is consistent with the rest of the script.

.github/workflows/e2e.yml (1)

119-145: LGTM! E2E workflow job for Pub/Sub is well-configured.

The new e2e-pubsub job is structured consistently with the existing e2e-grpc-broker job and properly configures:

  • MESSAGE_DRIVER_TYPE: pubsub to enable Pub/Sub testing
  • SERVER_REPLICAS: 2 for multi-instance testing
  • ENABLE_MAESTRO_TLS: true for secure communication
  • container_tool: docker for the CI environment
templates/pubsub-template.yml (1)

1-116: LGTM! Well-designed Pub/Sub emulator template.

The template properly defines the Pub/Sub emulator infrastructure:

  • Service exposing port 8085
  • Deployment running the emulator with appropriate configuration
  • Secret containing server-side configuration (topics to publish, subscriptions to consume)

The use of emptyDir for storage (lines 98-99) is appropriate for an emulator in testing scenarios. Based on past review discussions, the :emulators image tag (line 22) is intentionally chosen for stability, and the topic configuration aligns with the broader Pub/Sub integration across the repository.

templates/service-template.yml (1)

219-222: The dynamic secret reference and optional flag are correctly implemented.

The maestro-${MESSAGE_DRIVER_TYPE} secret configuration with optional: true is appropriate. The maestro-pubsub secret is properly created by pubsub-template.yml (lines 103-120), and similarly, maestro-mqtt is created by mqtt-template.yml. The optional flag prevents deployment failure when a particular message driver is not deployed.

Note: The MESSAGE_DRIVER_TYPE parameter description in service-template.yml should be updated from "mqtt or grpc" to include "pubsub" now that Pub/Sub is a supported driver type.

templates/pubsub-init-job-template.yml (1)

52-110: google-cloud-pubsub 2.34.0 is stable and suitable for this use case.

Version 2.34.0 was released December 16, 2025, and is tested for Python 3.7–3.14. It is an actively maintained, stable release. The operations used in this script (creating topics and subscriptions with filters) are fully supported by the Pub/Sub emulator. While the emulator has known limitations with UpdateTopic/UpdateSnapshot RPC and some advanced features, they do not affect the basic topic and subscription creation logic here.

The subscription filter attributes.ce-originalsource="maestro" and the overall topic/subscription initialization approach are correct.

@morvencao
Copy link
Contributor Author

/assign @clyang82

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 8bbc798 and 5b0276f.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • go.mod
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (8)
  • templates/service-tls-template.yml
  • templates/README.md
  • templates/pubsub-template.yml
  • .gitignore
  • templates/pubsub-init-job-template.yml
  • Makefile
  • README.md
  • templates/agent-tls-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-with-istio
  • GitHub Check: upgrade
  • GitHub Check: e2e
🔇 Additional comments (8)
test/setup/env_setup.sh (1)

193-222: LGTM! Pub/Sub initialization flow is well-structured.

The Pub/Sub setup follows a clear pattern: deploy emulator → wait for readiness → initialize topics/subscriptions → cleanup. The hardcoded 5-second sleep at Line 205 is acceptable for test setup, though a readiness probe would be more robust in production.

test/e2e/pkg/cert_rotation_test.go (2)

32-51: LGTM! Skip logic correctly handles Pub/Sub scenario.

The certificate rotation tests are appropriately skipped when neither MQTT nor gRPC CA secrets are present, which is the expected state for Pub/Sub emulator testing. The error handling distinguishes between NotFound (expected for Pub/Sub) and actual errors.


123-125: Correctly uses early return in AfterAll.

The skip guard properly uses an early return instead of Skip(), which aligns with Ginkgo best practices for cleanup hooks. This addresses the concern from the previous review.

.github/workflows/e2e.yml (1)

119-145: LGTM! CI job for Pub/Sub is properly configured.

The new e2e-pubsub job follows the established pattern of other e2e jobs and appropriately sets MESSAGE_DRIVER_TYPE to pubsub. Testing with 2 replicas and TLS enabled provides good coverage.

templates/agent-template.yml (2)

72-85: LGTM! Pub/Sub parameters are well-defined.

The new Pub/Sub configuration parameters have appropriate defaults for emulator testing and are properly marked as required.


367-381: LGTM! Agent Pub/Sub configuration is correct.

The Secret properly configures Pub/Sub topics and subscriptions with consumer-specific subscription names. The insecure: true flag is appropriate for emulator usage in testing.

test/setup/deploy_agent.sh (1)

49-78: LGTM! Agent Pub/Sub initialization is well-implemented.

The agent subscription initialization flow properly creates consumer-specific subscriptions before deploying the agent. The job lifecycle management (create → wait → cleanup) follows best practices.

templates/pubsub-agent-init-job-template.yml (1)

58-120: LGTM! Agent subscription initialization logic is robust.

The inline Python script properly creates consumer-specific subscriptions with correct filter syntax (ce-clustername). The use of google-cloud-pubsub 2.34.0 (as addressed in the previous review) ensures up-to-date dependencies. Error handling gracefully manages AlreadyExists while propagating unexpected errors.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Fix all issues with AI Agents 🤖
In @test/setup/env_setup.sh:
- Around line 208-215: The test setup uses the OpenShift CLI via the oc process
invocation (seen around the pubsub init block and in deploy_agent.sh for agent
subscriptions) but env_setup.sh never installs or verifies oc like it does for
kind, step, and istioctl; update env_setup.sh to either install the OpenShift
CLI (oc) into the CI image or add a preflight check that verifies oc is on PATH
and exits with a clear error if missing, and mirror the same check/installation
where deploy_agent.sh relies on oc so Pub/Sub initialization won't fail when
enabled.
🧹 Nitpick comments (4)
hack/init-pubsub-emulator.py (1)

35-37: Consider catching more specific exceptions.

The broad Exception catch works for this CLI script, but catching google.api_core.exceptions.GoogleAPIError would be more precise and avoid masking unexpected programming errors.

🔎 Proposed fix
-        except Exception as e:
+        except exceptions.GoogleAPIError as e:
             print(f"  ✗ Error creating topic {topic_name}: {e}", file=sys.stderr)
             return False

Apply the same pattern at lines 62-64 and 113-115.

test/setup/env_setup.sh (1)

203-206: Consider replacing sleep 5 with a readiness probe or retry loop.

The fixed sleep is fragile and may cause flaky tests in slower environments. Consider using a retry loop to verify the emulator is responsive before proceeding.

🔎 Proposed improvement
   # Initialize topics and subscriptions in the emulator
-  # Wait a bit for the emulator to be fully ready
-  sleep 5
+  # Wait for the emulator to be fully ready
+  for i in {1..10}; do
+    if curl -s "http://${pubsub_host}:${pubsub_port}" >/dev/null 2>&1; then
+      break
+    fi
+    sleep 1
+  done
templates/pubsub-init-job-template.yml (2)

36-48: Consider adding Job failure constraints for robustness.

The Job spec doesn't define backoffLimit (defaults to 6) or activeDeadlineSeconds. Adding explicit constraints improves observability and prevents the Job from hanging indefinitely if the Pub/Sub emulator is unreachable or slow to respond.

🔎 Suggested Job constraints
     spec:
+      backoffLimit: 3
+      activeDeadlineSeconds: 300
       template:
         spec:
           restartPolicy: Never

This limits retries to 3 attempts and enforces a 5-minute timeout for the entire initialization process.


41-48: Add resource limits for production deployments.

The container doesn't specify resource requests or limits. For production environments, defining these prevents resource contention and ensures predictable scheduling.

🔎 Suggested resource constraints
           containers:
           - name: init
             image: registry.access.redhat.com/ubi9/python-311
             imagePullPolicy: IfNotPresent
+            resources:
+              requests:
+                memory: "128Mi"
+                cpu: "100m"
+              limits:
+                memory: "256Mi"
+                cpu: "500m"
             env:
             - name: PUBSUB_EMULATOR_HOST
               value: "${PUBSUB_HOST}:${PUBSUB_PORT}"

These values are conservative for a short-lived init Job that performs pip installation and lightweight API calls.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 5b0276f and 7e3be17.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • go.mod
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (9)
  • templates/service-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • .github/workflows/e2e.yml
  • README.md
  • Makefile
  • templates/pubsub-agent-init-job-template.yml
  • go.mod
  • .gitignore
  • templates/pubsub-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-pubsub
  • GitHub Check: upgrade
  • GitHub Check: e2e
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
🔇 Additional comments (11)
templates/README.md (1)

33-55: LGTM!

The documentation clearly describes the Pub/Sub emulator templates, the topics and subscriptions created by the init jobs, and references the GCP-specific templates for production deployments.

hack/init-pubsub-emulator.py (3)

1-17: LGTM!

Clear module docstring with environment variable documentation and appropriate imports.


69-117: LGTM!

The agent subscription initialization logic is correct, properly creating filtered and unfiltered subscriptions for the specified consumer.


120-147: LGTM!

The main function correctly orchestrates initialization with sensible defaults. The PUBSUB_EMULATOR_HOST environment variable is used directly by the Google Cloud Pub/Sub client library, so reading it for logging purposes is appropriate.

templates/service-tls-template.yml (1)

275-278: LGTM!

The dynamic secret name maestro-${MESSAGE_DRIVER_TYPE} correctly enables support for multiple message driver types while maintaining the optional flag for backward compatibility.

templates/agent-template.yml (2)

72-86: LGTM!

The Pub/Sub parameters are well-defined with sensible defaults that align with the emulator configuration in other scripts.


367-381: LGTM!

The Pub/Sub secret configuration is correctly structured with proper GCP Pub/Sub resource paths and appropriate settings for the emulator (TLS disabled).

test/setup/deploy_agent.sh (2)

49-51: LGTM!

The Pub/Sub environment variables are consistent with the values defined in env_setup.sh.


55-78: LGTM!

The Pub/Sub agent initialization block correctly creates consumer-specific subscriptions before deploying the agent. The workflow mirrors the server-side initialization pattern.

templates/agent-tls-template.yml (2)

72-86: LGTM!

The Pub/Sub parameters are consistent with agent-template.yml, ensuring both TLS and non-TLS deployments use the same configuration.


383-397: LGTM!

The Pub/Sub secret configuration is consistent with agent-template.yml, maintaining parity between the TLS and non-TLS agent templates.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Fix all issues with AI Agents 🤖
In @test/setup/env_setup.sh:
- Around line 227-231: After waiting on job/pubsub-init, query its status with
kubectl get job pubsub-init in the ${namespace} and check the JSON fields
.status.succeeded and .status.failed (or .status.conditions) to ensure the init
job actually succeeded; if .status.succeeded is not >0 or .status.failed is >0,
log an error and exit non-zero without running the cleanup delete, otherwise
proceed to delete the job; update the script around the kubectl wait/delete
lines to perform this check and gate the kubectl delete on a successful job
status.
- Line 22: The oc_version variable in env_setup.sh is set to an outdated value;
update the oc_version assignment (oc_version="4.14.0") to the current stable
OpenShift CLI version (4.20.0 or 4.20) so scripts use the newer release; locate
the oc_version declaration in the file and change its value accordingly and run
any associated CI/setup scripts to verify compatibility.
🧹 Nitpick comments (2)
test/setup/env_setup.sh (1)

214-215: Consider using a readiness check instead of a fixed sleep.

The 5-second sleep may not be sufficient in all environments. Consider polling the emulator's health endpoint or using kubectl wait with a readiness condition.

🔎 Alternative approach using kubectl wait or health check
-  # Initialize topics and subscriptions in the emulator
-  # Wait a bit for the emulator to be fully ready
-  sleep 5
+  # Wait for emulator to be fully ready by checking its health
+  echo "Waiting for Pub/Sub emulator to be ready..."
+  until kubectl -n ${namespace} exec deploy/maestro-pubsub -- curl -sf http://localhost:8085/v1/projects/${pubsub_project_id}/topics >/dev/null 2>&1; do
+    echo "Emulator not ready yet, waiting..."
+    sleep 2
+  done
+  echo "Emulator is ready"
templates/pubsub-template.yml (1)

91-93: Optional: Consider removing redundant environment variable.

The PUBSUB_PROJECT_ID environment variable duplicates the project ID already passed via the --project flag in the command. Unless it's consumed by the emulator or debugging scripts, it can be removed.

🔎 Cleanup diff
            ports:
            - containerPort: 8085
              name: pubsub
-            env:
-            - name: PUBSUB_PROJECT_ID
-              value: ${PUBSUB_PROJECT_ID}
            volumeMounts:
            - name: pubsub-persistent-storage
              mountPath: /data
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 7e3be17 and 42a9caf.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • go.mod
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (6)
  • templates/agent-template.yml
  • templates/README.md
  • .github/workflows/e2e.yml
  • templates/pubsub-init-job-template.yml
  • go.mod
  • templates/pubsub-agent-init-job-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-pubsub
  • GitHub Check: upgrade
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e
  • GitHub Check: e2e-with-istio
🔇 Additional comments (25)
test/e2e/pkg/cert_rotation_test.go (2)

32-51: LGTM! Proper skip logic for Pub/Sub environments.

The implementation correctly skips certificate rotation tests when neither MQTT nor gRPC CA secrets exist (Pub/Sub doesn't use client certificates). The logic properly:

  • Checks both CA secrets independently
  • Only skips when both are missing
  • Preserves error handling for non-NotFound errors
  • Uses Skip() in BeforeAll (the correct lifecycle hook for skipping specs)

123-125: LGTM! Correct cleanup skip pattern.

The early return when skip is true properly prevents restoration logic from running when the test was skipped. This is the idiomatic pattern for conditional cleanup in Ginkgo lifecycle hooks, and addresses the past review comment correctly.

README.md (3)

45-79: LGTM! Clear Pub/Sub setup documentation.

The documentation effectively presents the dual-path setup (MQTT vs Pub/Sub) with clear step labels (3a/3b). The note about the google-cloud-pubsub Python package prerequisite is helpful for users.


114-121: LGTM! Clear Pub/Sub run instructions.

The documentation clearly shows how to run Maestro with Pub/Sub, including the required flags and config file reference.


360-365: LGTM! Clear test environment Pub/Sub support.

The documentation effectively shows how to enable Pub/Sub in the KinD test environment using the MESSAGE_DRIVER_TYPE environment variable.

.gitignore (1)

50-50: LGTM! Appropriate ignore rule for Pub/Sub configuration.

The addition of secrets/pubsub.config aligns with the existing pattern for ignoring secret files and prevents accidental commits of Pub/Sub emulator configuration.

templates/service-template.yml (1)

219-222: LGTM! Correct dynamic secret reference pattern.

The change to use maestro-${MESSAGE_DRIVER_TYPE} enables multi-broker support (MQTT, gRPC, Pub/Sub). The optional: true flag is the correct approach, as clarified in past review discussions: it allows gRPC deployments (where no secret is created) while mqtt/pubsub deployments will fail fast during initialization if the required secret is missing.

templates/service-tls-template.yml (1)

275-278: LGTM! Consistent dynamic secret reference pattern.

This change mirrors the approach in service-template.yml, using maestro-${MESSAGE_DRIVER_TYPE} to enable multi-broker support. The pattern is correct and consistent across both templates.

test/setup/env_setup.sh (1)

55-62: LGTM! OpenShift CLI installation follows established pattern.

The installation block correctly checks for oc, downloads from the official mirror, extracts, installs to /usr/local/bin, and cleans up properly. This addresses the previous concern about oc availability.

templates/pubsub-template.yml (2)

80-87: LGTM! Emulator command is correctly configured.

The emulator start command properly binds to 0.0.0.0:8085 to accept connections from other pods and passes the project ID via the --project flag.


107-116: LGTM! Secret structure aligns with agent configuration.

The config.yaml properly defines the project ID, endpoint, and topic/subscription mappings that match the agent secret structure introduced in templates/agent-tls-template.yml. The disableTLS: true setting is appropriate for the emulator.

test/setup/deploy_agent.sh (1)

49-51: LGTM! Pub/Sub environment variables are consistent.

The exported variables match the values used in test/setup/env_setup.sh and align with the template defaults, ensuring consistency across server and agent setup.

templates/agent-tls-template.yml (2)

72-85: LGTM! Pub/Sub parameters follow established conventions.

The parameter definitions are consistent with the MQTT parameters above and use appropriate default values that match the emulator configuration.


383-397: LGTM! Agent Pub/Sub secret structure is correct.

The secret properly configures consumer-specific subscriptions (sourceevents-${CONSUMER_NAME}, sourcebroadcast-${CONSUMER_NAME}) and shared topics (agentevents, agentbroadcast), enabling proper message routing between server and agent.

hack/init-pubsub-emulator.py (4)

1-11: LGTM! Script header and documentation are clear.

The module docstring properly explains the script's purpose and documents the expected environment variables.


19-66: LGTM! Server initialization logic is robust.

The function correctly creates topics and subscriptions with appropriate filters for CloudEvents attributes. The broad exception handling is appropriate here to catch any Pub/Sub API errors and provide useful feedback.


69-117: LGTM! Agent subscription initialization is correct.

The function properly creates consumer-specific subscriptions with appropriate filtering. The broadcast subscription correctly has no filter to receive all broadcast messages.


120-147: LGTM! Main function orchestrates initialization correctly.

The function properly sequences server initialization before agent initialization, handles environment variables with sensible defaults, and ensures proper exit codes for CI/CD integration.

Makefile (7)

65-69: LGTM!

The Pub/Sub configuration variables follow the established patterns for MQTT broker configuration, with appropriate defaults for local development and testing.


88-89: LGTM!

The documentation correctly reflects the addition of Pub/Sub as a message driver option.


139-145: LGTM!

The help output additions clearly document the new Pub/Sub targets and follow the established format.


331-333: LGTM!

The Pub/Sub parameters are correctly passed to the template generation, following the same pattern as the MQTT configuration.


446-450: LGTM!

The Pub/Sub emulator setup follows the established patterns for database and MQTT broker setup. The port mapping and container configuration are correct.


452-455: LGTM!

The teardown target correctly stops and removes the Pub/Sub emulator container, consistent with the cleanup patterns for other services.


457-460: Python prerequisites are already documented in the README.

The google-cloud-pubsub requirement is documented at README line 77: "Requires google-cloud-pubsub Python package (pip3 install google-cloud-pubsub)". The README also shows the correct workflow sequence (make pubsub/setup followed by make pubsub/init). No action needed; the current documentation is sufficient.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
test/setup/env_setup.sh (1)

203-240: Well-structured Pub/Sub initialization with proper error handling.

The initialization flow properly validates job success before cleanup (lines 231-236), which addresses the previous review concern. The deployment wait, job creation, and cleanup sequence is well-organized.

However, the arbitrary 5-second sleep (line 215) before initialization is fragile. Consider replacing it with an active readiness check.

🔎 Proposed fix to replace sleep with readiness check
-  # Initialize topics and subscriptions in the emulator
-  # Wait a bit for the emulator to be fully ready
-  sleep 5
+  # Wait for emulator to be ready to accept requests
+  echo "Waiting for Pub/Sub emulator to be ready..."
+  for i in {1..30}; do
+    if kubectl -n ${namespace} exec deploy/maestro-pubsub -- curl -s http://localhost:8085 > /dev/null 2>&1; then
+      echo "Pub/Sub emulator is ready"
+      break
+    fi
+    if [ $i -eq 30 ]; then
+      echo "ERROR: Pub/Sub emulator failed to become ready" >&2
+      exit 1
+    fi
+    sleep 1
+  done
hack/init-pubsub-emulator.py (1)

19-66: Consider more specific exception handling.

The function properly propagates errors by returning False and logging to stderr, which is then checked in main(). However, catching bare Exception (lines 35-37, 62-64) is overly broad. Consider catching specific exceptions like google.api_core.exceptions.GoogleAPIError or letting unexpected exceptions propagate naturally.

🔎 Proposed refactor for more specific exception handling
 def init_server_topics_and_subscriptions(project_id: str):
     """Initialize topics and subscriptions for the Maestro server."""
     publisher = pubsub_v1.PublisherClient()
     subscriber = pubsub_v1.SubscriberClient()
 
     # Topics to create
     topics = ['sourceevents', 'sourcebroadcast', 'agentevents', 'agentbroadcast']
 
     print("Creating topics...")
     for topic_name in topics:
         topic_path = publisher.topic_path(project_id, topic_name)
         try:
             publisher.create_topic(request={"name": topic_path})
             print(f"  ✓ Created topic: {topic_name}")
         except exceptions.AlreadyExists:
             print(f"  - Topic already exists: {topic_name}")
-        except Exception as e:
+        except exceptions.GoogleAPIError as e:
             print(f"  ✗ Error creating topic {topic_name}: {e}", file=sys.stderr)
             return False

Apply similar changes to subscription creation and agent initialization.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 42a9caf and a7fbefa.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • go.mod
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (7)
  • templates/agent-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • templates/pubsub-agent-init-job-template.yml
  • test/setup/deploy_agent.sh
  • templates/README.md
  • README.md
  • .gitignore
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: upgrade
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
🔇 Additional comments (8)
go.mod (1)

60-60: LGTM! Clean dependency resolution after upstream merge.

The SDK dependency has been correctly updated to the official upstream version following the merge of PR #182. This removes the temporary fork dependency and uses the proper release channel.

templates/agent-tls-template.yml (2)

72-85: LGTM! Pub/Sub parameters follow consistent patterns.

The new Pub/Sub parameters are well-defined with appropriate defaults for the emulator environment. The required: true flag with sensible defaults ensures the template can be used out-of-the-box for development while allowing production overrides.


383-397: LGTM! Pub/Sub secret configuration aligns with emulator usage.

The disableTLS: true setting is appropriate here since the default endpoint (maestro-pubsub.maestro:8085) points to the Pub/Sub emulator, which doesn't support TLS. The topic and subscription naming patterns are consistent with the MQTT configuration structure, and the per-consumer subscription naming (sourceevents-${CONSUMER_NAME}) provides proper isolation.

.github/workflows/e2e.yml (1)

119-145: LGTM! Pub/Sub e2e job follows established patterns.

The new e2e-pubsub job is well-structured and mirrors the configuration of existing e2e jobs. The environment variables (2 replicas, TLS enabled, Pub/Sub driver) provide appropriate coverage for testing the Pub/Sub message driver in CI alongside MQTT and gRPC variants.

test/setup/env_setup.sh (1)

55-62: LGTM! OpenShift CLI installation properly implemented.

The installation follows the same pattern as other tools (kind, step), downloads from the official mirror, and includes proper cleanup. This addresses the previous concern about oc availability for Pub/Sub initialization.

templates/pubsub-template.yml (1)

1-116: LGTM! Well-structured Pub/Sub emulator template.

The template properly defines all necessary resources (Service, Deployment, Secret) with appropriate configuration for the Google Cloud Pub/Sub emulator. The use of emptyDir for storage (lines 98-99) is appropriate for an emulator in testing scenarios where persistence across pod restarts is not required.

The image tag :emulators (line 22) has been confirmed as a deliberate choice for stability based on previous discussion.

templates/pubsub-init-job-template.yml (1)

52-112: LGTM! Pub/Sub initialization Job properly configured with error handling.

The inline Python script correctly:

  • Creates all required topics (sourceevents, sourcebroadcast, agentevents, agentbroadcast)
  • Creates subscriptions with appropriate filters
  • Propagates exceptions on failure (lines 82, 109), ensuring the Job fails when initialization fails
  • Uses the updated google-cloud-pubsub version (2.34.0)

The subscription filter attributes.ce-originalsource="maestro" (line 86) has been confirmed as correct based on previous discussion—it matches events published by the agent via the SDK.

Makefile (1)

65-69: LGTM! Pub/Sub support well-integrated following existing patterns.

The Makefile changes consistently integrate Pub/Sub support:

  • Configuration variables (lines 65-69) follow the same naming convention as MQTT variables
  • Help documentation (lines 143-145) clearly describes the new targets
  • Template parameter passing (lines 331-333) mirrors the MQTT parameter pattern
  • Lifecycle targets (lines 446-460) match the structure of existing mqtt/setup, mqtt/teardown targets

The pubsub/init target (lines 457-460) properly invokes the Python initialization script with the correct environment variables.

Also applies to: 143-145, 331-333, 446-460

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
test/setup/env_setup.sh (1)

203-240: LGTM! Pub/Sub initialization flow is well-structured with proper error handling.

The implementation correctly:

  • Deploys the Pub/Sub emulator and waits for availability
  • Uses oc process to create the initialization job from the template
  • Verifies job success by checking .status.succeeded before cleanup
  • Provides clear error messages and logs on failure
  • Cleans up the initialization job after successful completion

This addresses the past review concerns about job status verification.

💡 Optional: Replace hardcoded sleep with readiness probe

Line 215 uses sleep 5 to wait for the emulator to be fully ready. Consider replacing this with a proper readiness check:

- # Wait a bit for the emulator to be fully ready
- sleep 5
+ # Wait for the emulator to be fully ready
+ kubectl -n ${namespace} wait --for=condition=ready pod -l app=maestro-pubsub --timeout=60s

This would be more reliable and potentially faster than a fixed sleep duration.

test/setup/deploy_agent.sh (1)

61-69: Use exported variables instead of hardcoded values for consistency.

Lines 65-67 hardcode the Pub/Sub configuration values, but lines 49-51 already export these as variables. If the exported variables are modified, the oc process command won't pick up those changes.

♻️ Refactor to use exported variables
   # Create agent-specific subscriptions using the template
   oc process \
     --filename="${PWD}/templates/pubsub-agent-init-job-template.yml" \
     --local="true" \
     --param="NAMESPACE=${agent_namespace}" \
-    --param="PUBSUB_HOST=maestro-pubsub.${namespace}" \
-    --param="PUBSUB_PORT=8085" \
-    --param="PUBSUB_PROJECT_ID=maestro-test" \
+    --param="PUBSUB_HOST=${pubsub_host}" \
+    --param="PUBSUB_PORT=${pubsub_port}" \
+    --param="PUBSUB_PROJECT_ID=${pubsub_project_id}" \
     --param="CONSUMER_NAME=${consumer_name}" \
   | kubectl apply -f -
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between a7fbefa and 3e7f4ac.

📒 Files selected for processing (16)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (6)
  • templates/README.md
  • README.md
  • templates/agent-tls-template.yml
  • templates/pubsub-template.yml
  • .github/workflows/e2e.yml
  • templates/agent-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: e2e-pubsub
  • GitHub Check: upgrade
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
🔇 Additional comments (11)
.gitignore (1)

50-50: LGTM! Pub/Sub configuration exclusion is appropriate.

The addition of secrets/pubsub.config follows the established pattern for secret files and aligns with the Pub/Sub integration introduced in this PR.

templates/service-template.yml (1)

219-222: LGTM! Dynamic secret naming correctly supports multiple message drivers.

The change from maestro-mqtt to maestro-${MESSAGE_DRIVER_TYPE} enables support for MQTT, gRPC, and Pub/Sub drivers. The optional: true flag is correctly retained to support the gRPC case where no secret is created (as confirmed in the past review discussion).

templates/service-tls-template.yml (1)

275-278: LGTM! Consistent dynamic secret handling across TLS template.

The dynamic secret naming change aligns perfectly with service-template.yml, ensuring consistent message driver support across both TLS and non-TLS deployments.

test/e2e/pkg/cert_rotation_test.go (2)

32-52: LGTM! Skip logic correctly handles certificate-less Pub/Sub deployments.

The implementation properly:

  • Checks for both MQTT and gRPC CA secrets, skipping only when neither exists (Pub/Sub case)
  • Distinguishes NotFound errors from actual errors that should fail the test
  • Includes clear comments explaining that Pub/Sub emulator doesn't use client certificates
  • Correctly positions the check in BeforeAll before any certificate operations

123-125: LGTM! Early return pattern correctly implemented.

The AfterAll hook now uses an early return instead of Skip(), which is the correct pattern for conditional cleanup in Ginkgo lifecycle hooks, as confirmed in the past review discussion.

test/setup/env_setup.sh (1)

55-62: LGTM! OpenShift CLI installation ensures oc availability.

The installation block properly handles the missing oc dependency, downloads the correct version, and cleans up temporary files. This addresses the past review concern about oc availability for the oc process commands used in Pub/Sub initialization.

templates/pubsub-agent-init-job-template.yml (1)

1-120: LGTM! Agent initialization template is well-structured.

The template correctly:

  • Uses the latest google-cloud-pubsub version (2.34.0)
  • Creates consumer-specific subscriptions with appropriate filters
  • Handles idempotent initialization with AlreadyExists exception handling
  • Properly propagates errors for non-idempotent failures
hack/init-pubsub-emulator.py (1)

1-147: LGTM! Initialization script follows appropriate error handling patterns.

The script correctly orchestrates Pub/Sub resource initialization with proper error handling. The static analysis warnings about broad Exception catches (lines 35, 62, 113, 145) are false positives in this context:

  • Initialization scripts must handle all potential Google Cloud API exceptions
  • Errors are logged before returning False or raising
  • main() checks return values and exits with status 1 on failure
  • Top-level exception handler (line 145) catches unexpected errors

This pattern is appropriate for initialization jobs where any failure should fail fast.

Makefile (2)

65-69: LGTM! Pub/Sub configuration variables follow existing patterns.

The new Pub/Sub variables are consistent with the existing MQTT broker configuration structure and properly integrated into the template rendering workflow.


446-461: LGTM! Pub/Sub lifecycle targets follow established conventions.

The pubsub/setup, pubsub/teardown, and pubsub/init targets mirror the existing db/ and mqtt/ patterns, providing a consistent interface for managing Pub/Sub emulator lifecycle.

templates/pubsub-init-job-template.yml (1)

1-112: LGTM! Server initialization template is well-structured.

The template correctly:

  • Uses the latest google-cloud-pubsub version (2.34.0)
  • Creates all required topics for both server and agent communication
  • Creates server-side subscriptions with appropriate filters
  • Properly propagates errors via raise statements (lines 82, 109)
  • Handles idempotent initialization with AlreadyExists exception handling

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/setup/deploy_agent.sh (1)

61-69: Use exported variables instead of hardcoded values.

The oc process command hardcodes values on lines 65-67 that duplicate the environment variables exported on lines 49-51. This violates the DRY principle and makes the code harder to maintain—if these values need to change, both locations must be updated.

♻️ Refactor to use exported variables
  # Create agent-specific subscriptions using the template
  oc process \
    --filename="${PWD}/templates/pubsub-agent-init-job-template.yml" \
    --local="true" \
    --param="NAMESPACE=${agent_namespace}" \
-    --param="PUBSUB_HOST=maestro-pubsub.${namespace}" \
-    --param="PUBSUB_PORT=8085" \
-    --param="PUBSUB_PROJECT_ID=maestro-test" \
+    --param="PUBSUB_HOST=${pubsub_host}" \
+    --param="PUBSUB_PORT=${pubsub_port}" \
+    --param="PUBSUB_PROJECT_ID=${pubsub_project_id}" \
    --param="CONSUMER_NAME=${consumer_name}" \
  | kubectl apply -f -
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 3e7f4ac and 355c8c8.

📒 Files selected for processing (16)
  • .github/workflows/e2e.yml
  • .gitignore
  • Makefile
  • README.md
  • hack/init-pubsub-emulator.py
  • templates/README.md
  • templates/agent-template.yml
  • templates/agent-tls-template.yml
  • templates/pubsub-agent-init-job-template.yml
  • templates/pubsub-init-job-template.yml
  • templates/pubsub-template.yml
  • templates/service-template.yml
  • templates/service-tls-template.yml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (9)
  • .github/workflows/e2e.yml
  • templates/pubsub-init-job-template.yml
  • templates/agent-template.yml
  • templates/service-tls-template.yml
  • README.md
  • templates/agent-tls-template.yml
  • templates/pubsub-template.yml
  • Makefile
  • .gitignore
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py

35-35: Do not catch blind exception: Exception

(BLE001)


62-62: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


145-145: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
🔇 Additional comments (9)
templates/service-template.yml (1)

219-222: LGTM! Dynamic secret naming correctly implemented.

The dynamic secret reference (maestro-${MESSAGE_DRIVER_TYPE}) with optional: true is the correct approach to support all broker types (mqtt, pubsub, grpc). For grpc deployments where no secret is created, the optional flag allows the pod to start. For mqtt/pubsub deployments, the secret will be created and validated at application startup.

templates/pubsub-agent-init-job-template.yml (1)

1-120: LGTM! Well-structured Pub/Sub agent initialization template.

The template is well-designed:

  • Required parameters are properly marked
  • The Python script handles both filtered (sourceevents-{consumer_name}) and unfiltered (sourcebroadcast-{consumer_name}) subscriptions
  • Error handling includes AlreadyExists for idempotency and general exceptions with proper error messages
  • The CloudEvents filter syntax (attributes.ce-clustername="{consumer_name}") is correct
hack/init-pubsub-emulator.py (1)

1-147: LGTM! Appropriate error handling for initialization script.

The script is well-structured with clear separation of server and agent initialization logic. The bare Exception catches flagged by static analysis (lines 35, 62, 113, 145) are appropriate here:

  • They provide defensive error handling for an initialization script where any unexpected error should be caught, logged, and cause the script to fail gracefully
  • Each catch block logs the error with context before returning False or exiting
  • The script properly distinguishes AlreadyExists (idempotent, informational) from genuine errors (failure)

This pattern is correct for a setup/initialization script that should never leave the environment in a partially-configured state.

test/e2e/pkg/cert_rotation_test.go (2)

32-51: LGTM! Correct skip mechanism for certificate rotation tests.

The skip logic properly handles scenarios where certificate-based brokers (MQTT/gRPC) are not deployed:

  • Checks for both CA secrets before proceeding
  • Distinguishes NotFound errors (skip case) from actual errors (fail)
  • Sets the skip flag and calls Skip() to mark the test as skipped
  • The comment at lines 36-37 clearly explains why this is needed (Pub/Sub doesn't use client certificates)

123-125: LGTM! Correct cleanup skip pattern.

The early return in AfterAll when skip==true is the correct pattern. This avoids attempting to restore certificates and clean up resources that were never created when the test was skipped.

test/setup/env_setup.sh (2)

55-62: LGTM! OpenShift CLI installation added.

The oc CLI installation logic follows the same pattern as the other tools (kind, step, istioctl) and correctly downloads from the official OpenShift mirror for the specified version.


203-240: LGTM! Well-structured Pub/Sub emulator initialization.

The Pub/Sub setup flow is comprehensive and robust:

  • Deploys the emulator and waits for availability
  • Includes a brief sleep for full readiness
  • Uses oc process to render the initialization template with appropriate parameters
  • Waits for job completion with a reasonable timeout (120s)
  • Validates job success before cleanup (lines 231-236), logging errors and exiting non-zero on failure
  • Cleans up the init job after successful completion

This addresses all previous review concerns and follows best practices for initialization jobs.

templates/README.md (1)

33-56: LGTM! Clear and comprehensive documentation.

The documentation for the Pub/Sub emulator and initialization templates is well-structured and provides helpful details about topics, subscriptions, and filters. The distinction between development/testing templates and production GCP templates is clear.

test/setup/deploy_agent.sh (1)

71-86: Excellent job completion verification and error handling.

The implementation properly waits for job completion, verifies success before cleanup, and provides helpful error messages with logs when failures occur. This addresses the concern from the previous review and follows best practices for Kubernetes job orchestration.

Signed-off-by: Morven Cao <lcao@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @charts/maestro-agent/templates/pubsub-init-hook.yaml:
- Around line 56-60: The function extract_topic_name is dead code and should be
removed; delete its entire definition (the def extract_topic_name(...) block)
from the script and run a quick search for any references to extract_topic_name
to confirm none exist and that removing it introduces no usage break; if there
are related unused imports or comments tied specifically to that function,
remove those as well.
🧹 Nitpick comments (4)
charts/maestro-server/templates/pubsub-emulator.yaml (1)

1-62: Consider adding resource limits and health probes for the emulator deployment.

The Pub/Sub emulator deployment is missing resource limits and liveness/readiness probes. While this is intended for development/testing, adding these would improve stability:

  1. Resource limits - prevent the emulator from consuming excessive resources
  2. Health probes - ensure Helm waits for the emulator to be ready before dependent services start

Note: The YAMLlint syntax error on line 1 is a false positive — Helm conditionals ({{- if ... }}) are valid Helm template syntax.

♻️ Suggested improvements
         - name: pubsub-emulator
           image: {{ .Values.pubsubEmulator.image }}
           imagePullPolicy: IfNotPresent
+          resources:
+            requests:
+              cpu: 100m
+              memory: 256Mi
+            limits:
+              cpu: 500m
+              memory: 512Mi
           command:
             - gcloud
             - beta
             - emulators
             - pubsub
             - start
             - --host-port=0.0.0.0:8085
             - --project={{ .Values.pubsubEmulator.projectID }}
           ports:
             - containerPort: 8085
               name: pubsub
               protocol: TCP
+          readinessProbe:
+            tcpSocket:
+              port: 8085
+            initialDelaySeconds: 5
+            periodSeconds: 10
+          livenessProbe:
+            tcpSocket:
+              port: 8085
+            initialDelaySeconds: 10
+            periodSeconds: 30
charts/maestro-agent/templates/pubsub-init-hook.yaml (1)

37-39: Runtime pip install adds latency and external network dependency.

Installing google-cloud-pubsub at runtime during each hook execution introduces latency and depends on external network availability. For a more robust solution, consider using a pre-built image with the library already installed or caching the pip packages.

This is acceptable for development/testing but may be fragile if PyPI is unreachable.

charts/maestro-server/templates/pubsub-init-hook.yaml (1)

34-93: Consider pinning the Python image digest for reproducibility.

The script logic is well-structured with proper idempotency handling via AlreadyExists exceptions. One minor note: runtime pip install works for test environments but could be fragile if PyPI has issues. Consider building a custom init image with dependencies pre-installed if this becomes flaky.

test/e2e/pkg/cert_rotation_test.go (1)

360-362: Consider using a random serial number for certificates.

Using time.Now().Unix() for the serial number could produce collisions if multiple certificates are generated in the same second. While unlikely to cause issues in this test context, best practice is to use a cryptographically random serial:

♻️ Suggested improvement
-		SerialNumber: big.NewInt(time.Now().Unix()),
+		SerialNumber: func() *big.Int {
+			serialNumber, _ := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 128))
+			return serialNumber
+		}(),
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 355c8c8 and 5a4a443.

📒 Files selected for processing (11)
  • .github/workflows/e2e.yml
  • charts/maestro-agent/templates/pubsub-init-hook.yaml
  • charts/maestro-agent/templates/secret.yaml
  • charts/maestro-agent/values.yaml
  • charts/maestro-server/templates/pubsub-emulator.yaml
  • charts/maestro-server/templates/pubsub-init-hook.yaml
  • charts/maestro-server/templates/pubsub.yaml
  • charts/maestro-server/values.yaml
  • test/e2e/pkg/cert_rotation_test.go
  • test/setup/deploy_agent.sh
  • test/setup/deploy_server.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/e2e.yml
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/maestro-agent/templates/pubsub-init-hook.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

charts/maestro-server/templates/pubsub-emulator.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

charts/maestro-server/templates/pubsub-init-hook.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: verify
  • GitHub Check: e2e
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: e2e-with-istio
  • GitHub Check: upgrade
🔇 Additional comments (13)
charts/maestro-server/values.yaml (2)

88-89: LGTM! New Pub/Sub configuration fields are well-structured.

The endpoint and disableTLS fields follow the existing configuration patterns and have sensible defaults (empty endpoint, TLS enabled by default).


99-106: LGTM! Pub/Sub emulator configuration is appropriately structured.

The emulator is disabled by default (safe for production), and the configuration includes all necessary fields for deployment. Consider adding inline comments documenting that this is for development/testing only, similar to the PostgreSQL and Mosquitto sections.

charts/maestro-agent/templates/pubsub-init-hook.yaml (1)

79-101: LGTM! Subscription creation with proper error handling.

The subscription creation logic correctly handles the AlreadyExists exception gracefully while re-raising unexpected errors. The filter expression for sourceevents subscription properly routes messages based on ce-clustername attribute.

charts/maestro-agent/values.yaml (1)

61-62: LGTM! Pub/Sub configuration fields align with server-side changes.

The new endpoint and disableTLS fields mirror the server-side configuration in charts/maestro-server/values.yaml, ensuring consistency across the deployment.

charts/maestro-agent/templates/secret.yaml (1)

65-68: LGTM! Conditional endpoint and explicit disableTLS handling.

The conditional inclusion of endpoint (only when set) and unconditional emission of disableTLS follows the same pattern as the server-side pubsub.yaml template, ensuring consistency across both deployments.

charts/maestro-server/templates/pubsub.yaml (1)

14-17: LGTM! Endpoint and TLS configuration additions are consistent.

The conditional endpoint and unconditional disableTLS fields mirror the agent-side configuration in charts/maestro-agent/templates/secret.yaml, ensuring consistent Pub/Sub configuration across server and agent deployments.

test/setup/deploy_agent.sh (1)

113-127: LGTM! Pub/Sub test configuration is well-aligned with emulator setup.

The configuration correctly:

  • Uses maestro-test project ID matching the emulator default in values.yaml
  • Points to the emulator service endpoint on port 8085
  • Disables TLS (appropriate for the emulator which doesn't support TLS)
  • Uses consumer-specific subscription names (sourceevents-${consumer_name}, sourcebroadcast-${consumer_name}) that align with the init hook's subscription creation logic
charts/maestro-server/templates/pubsub-init-hook.yaml (1)

1-12: LGTM on Helm hook configuration.

The Helm hook annotations are correctly configured. The YAMLlint error on line 1 is a false positive—{{- if ... }} is valid Helm templating syntax that YAMLlint doesn't understand.

test/e2e/pkg/cert_rotation_test.go (2)

36-52: Well-designed skip logic for Pub/Sub compatibility.

The conditional skip logic correctly identifies when certificate rotation tests are inapplicable (Pub/Sub mode). The error handling properly distinguishes NotFound from unexpected errors, ensuring tests fail fast on real issues while gracefully skipping for Pub/Sub.


123-127: LGTM on AfterAll guard.

The early return when skip is true correctly prevents cleanup attempts for resources that were never initialized.

test/setup/deploy_server.sh (3)

119-128: LGTM on Pub/Sub message broker configuration.

The Pub/Sub configuration is well-structured with appropriate values for the test environment. The topic and subscription paths follow the correct format (projects/{project}/topics/{topic}).


203-210: Configuration consistency verified.

The emulator service configuration (maestro-pubsub:8085) correctly matches the endpoint specified in the messageBroker.pubsub section. Good alignment.


221-222: Clarify the need for forced rollout restart.

The helm upgrade --install ... --wait on line 218 should already wait for the deployment to be ready. The explicit rollout restart followed by rollout status adds ~30+ seconds to deployment time and forces pod recreation.

Is this working around a specific timing issue (e.g., Pub/Sub init hook completing after maestro pods start)? If so, consider adding a comment explaining the rationale. If the init hook is the concern, the hook's negative weight (-5) should cause it to run before other post-install hooks but after the deployment is created.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
charts/maestro-agent/templates/pubsub-init-hook.yaml (2)

20-32: Consider adding resource limits and backoffLimit.

The container lacks resource requests/limits, and the Job doesn't specify backoffLimit. While defaults may suffice for e2e testing, explicit configuration improves reliability and prevents resource contention.

♻️ Suggested improvement
     spec:
       restartPolicy: Never
+      backoffLimit: 3
       containers:
         - name: init
           image: registry.access.redhat.com/ubi9/python-311
           imagePullPolicy: IfNotPresent
+          resources:
+            requests:
+              cpu: 100m
+              memory: 128Mi
+            limits:
+              cpu: 200m
+              memory: 256Mi
           env:

36-42: Runtime pip install introduces network dependency.

Installing google-cloud-pubsub at Job runtime requires network access to PyPI on every run. This is pragmatic for e2e testing, but be aware the Job will fail if PyPI is unreachable. A pre-built image with dependencies baked in would be more robust for production use cases.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 5a4a443 and 74eb155.

📒 Files selected for processing (1)
  • charts/maestro-agent/templates/pubsub-init-hook.yaml
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/maestro-agent/templates/pubsub-init-hook.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Red Hat Konflux / maestro-on-pull-request
  • GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
  • GitHub Check: e2e-broadcast-subscription
  • GitHub Check: upgrade
  • GitHub Check: e2e-grpc-broker
  • GitHub Check: e2e-pubsub
  • GitHub Check: e2e-with-istio
  • GitHub Check: e2e
🔇 Additional comments (2)
charts/maestro-agent/templates/pubsub-init-hook.yaml (2)

1-13: LGTM - Helm hook configuration is appropriate.

The static analysis error from YAMLlint is a false positive—YAMLlint doesn't understand Helm template syntax ({{-). The hook annotations correctly configure this Job to run before install/upgrade with appropriate cleanup policies.


50-92: LGTM - Subscription creation logic is correct and idempotent.

The Python script properly handles the AlreadyExists exception for idempotent behavior during upgrades. The filter syntax for sourceevents subscription is correct. Note that this assumes topics (sourceevents, sourcebroadcast) are already created by the server-side init hook.

--timeout 5m

kubectl wait deploy/maestro -n $namespace --for condition=Available=True --timeout=300s
kubectl rollout restart deploy/maestro -n ${namespace}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to restart maestro?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when deploying the maestro server via a helm chart, the pubsub-init job is executed as a post-hook. this means maestro server may start before the pubsub-init job completes, causing cloudevents subscriptions to not be started in time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. let us restart only for pubsub.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

@clyang82 clyang82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morvencao morvencao force-pushed the br_add_pubsub_e2e branch 2 times, most recently from aa2cf85 to 2b7d2b3 Compare January 14, 2026 02:44
Signed-off-by: Morven Cao <lcao@redhat.com>
Signed-off-by: Morven Cao <lcao@redhat.com>
@clyang82 clyang82 merged commit 0f527cd into openshift-online:main Jan 14, 2026
13 checks passed
@morvencao morvencao deleted the br_add_pubsub_e2e branch January 14, 2026 06:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants