add e2e test for pubsub.#454
Conversation
WalkthroughAdds Pub/Sub support across CI, tests, deployment scripts, and Helm charts: new e2e CI job for pubsub, test skip for missing CA secrets, Pub/Sub client and emulator Helm templates/values/init jobs, and test deploy script updates to inject Pub/Sub configuration and conditional rollout handling. Changes
Sequence Diagram(s)sequenceDiagram
actor Operator
participant Helm
participant Kubernetes
participant PubSubEmulator
participant InitJob
participant Maestro
Operator->>Helm: helm install/upgrade (values include pubsub & emulator)
Helm->>Kubernetes: create Service, Deployment, Secrets, Hook Job
Kubernetes->>PubSubEmulator: start emulator Deployment (if enabled)
Kubernetes->>InitJob: run pubsub-init-hook Job
InitJob->>PubSubEmulator: create topics & subscriptions via client
InitJob-->>Kubernetes: Job completes
Kubernetes->>Maestro: deploy Maestro components with pubsub config (endpoint/disableTLS)
Maestro->>PubSubEmulator: connect, publish & subscribe to topics
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro Cache: Disabled due to data retention organization setting Knowledge base: Disabled due to 📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
✏️ Tip: You can disable this entire section by setting Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
go.mod (1)
209-210: Establish a migration plan for the fork dependency.Using a personal fork (morvencao/ocm-sdk-go) as a replace directive creates a temporal dependency that requires clear documentation and a removal plan. While the fork is currently accessible and maintained, this pattern is not sustainable long-term—it ties the build to an external personal account and PR status.
PR #182 ("support test mode for pubsub driver") is currently open (updated Dec 25). Ensure:
- Once upstream PR #182 is merged, immediately remove this replace directive and update the require statement on line 60 to reference the merged version.
- Document this temporary workaround in the code or PR description with the expected timeline for removal (tied to PR #182's merge).
- Add a tracking issue or comment linking this dependency to PR #182 so the team knows when the replacement can be removed.
🧹 Nitpick comments (2)
test/e2e/pkg/cert_rotation_test.go (1)
80-83: Clarify the assertion failure message.The message "no CA secrets found; certificate rotation did not run" is misleading at this point in the code. If execution reaches line 83, it means
BeforeAlldid not skip, so at least one CA secret exists. A falserotatedvalue would indicate rotation failed for a different reason (e.g., invalid secret data).🔎 Suggested improvement
-Expect(rotated).To(BeTrue(), "no CA secrets found; certificate rotation did not run") +Expect(rotated).To(BeTrue(), "expected at least one certificate to be rotated")test/setup/env_setup.sh (1)
193-222: LGTM with a minor observation about the sleep.The Pub/Sub emulator setup follows the established pattern for other message brokers. The
sleep 5on line 205 is a pragmatic approach for waiting for emulator readiness, though in slower CI environments this might occasionally be insufficient.If flakiness is observed in CI, consider replacing the fixed sleep with a retry loop that checks the emulator's health, for example by attempting a simple Pub/Sub API call until it succeeds.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (17)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdgo.modhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: e2e-with-istio
- GitHub Check: e2e-pubsub
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-grpc-broker
- GitHub Check: upgrade
- GitHub Check: e2e
🔇 Additional comments (23)
test/e2e/pkg/cert_rotation_test.go (2)
32-32: LGTM: Skip flag declaration.The skip flag is appropriately used to track whether certificate rotation tests should be skipped when CA secrets are absent.
34-52: LGTM: Conditional test execution based on CA secret presence.The logic correctly handles three scenarios:
- Both CA secrets missing → skip (Pub/Sub emulator scenario)
- At least one CA secret present → continue (MQTT/gRPC scenario)
- Unexpected errors → fail
This aligns well with the PR objective of supporting Pub/Sub as an alternative message driver.
.gitignore (1)
50-50: LGTM!The new ignore rule for
secrets/pubsub.configfollows the existing pattern for other secret configuration files in this directory.README.md (3)
45-79: LGTM!The documentation clearly explains both MQTT and Pub/Sub setup paths, maintains consistency with existing style, and properly notes the Python package dependency for Pub/Sub initialization.
114-122: LGTM!The Pub/Sub runtime instructions are clear and follow the same pattern as the MQTT instructions.
361-366: LGTM!The KinD cluster instructions for Pub/Sub are concise and align with the MESSAGE_DRIVER_TYPE environment variable used throughout the codebase.
Makefile (2)
65-70: LGTM!The Pub/Sub configuration variables follow the same pattern as the MQTT configuration variables, with sensible defaults for local development.
446-461: Pub/Sub lifecycle targets look good overall.The targets follow a similar lifecycle pattern to the MQTT targets. One minor observation: the
pubsub/inittarget relies onpython3and thegoogle-cloud-pubsubpackage being available in the local environment, which is documented in the README. The emulator image (gcr.io/google.com/cloudsdktool/google-cloud-cli:emulators) is accessible and available.hack/init-pubsub-emulator.py (4)
1-11: LGTM!The script is well-documented with clear purpose and environment variable usage.
19-66: LGTM!The server topics and subscriptions initialization logic is well-structured with proper error handling. The broad
Exceptioncatches (flagged by static analysis) are appropriate here since this is an initialization script that needs to handle various potential API errors from the Pub/Sub client library, and the code properly logs the error and returnsFalseto signal failure.
69-117: LGTM!The agent subscription initialization follows the same pattern as the server initialization with consistent error handling.
120-147: LGTM!The main function properly handles environment variables with sensible defaults and exits with appropriate codes on failure. The
emulator_hostvariable is correctly used for logging purposes (the library readsPUBSUB_EMULATOR_HOSTenvironment variable directly).templates/service-template.yml (1)
219-222: LGTM!The dynamic secret name
maestro-${MESSAGE_DRIVER_TYPE}enables the template to work with different message drivers. Making the secretoptional: trueis appropriate since not all message driver configurations require a secret (e.g., gRPC may use different authentication mechanisms).templates/service-tls-template.yml (1)
275-278: LGTM!The change is consistent with the non-TLS service template, enabling dynamic message driver configuration.
templates/README.md (1)
33-56: Documentation references are accurate; both GCP templates exist.The referenced GCP templates (
service-template-gcp.ymlandagent-template-gcp.yml) are present in the repository. The Pub/Sub emulator documentation is comprehensive and correct..github/workflows/e2e.yml (1)
119-145: LGTM! Pub/Sub e2e job follows existing patterns.The new
e2e-pubsubjob is well-structured and consistent with the existing e2e jobs. The environment variables (MESSAGE_DRIVER_TYPE=pubsub, SERVER_REPLICAS=2, ENABLE_MAESTRO_TLS=true) are appropriate for testing Pub/Sub message driver integration.templates/agent-template.yml (2)
72-85: LGTM! Pub/Sub parameters properly defined.The new parameters (PUBSUB_HOST, PUBSUB_PORT, PUBSUB_PROJECT_ID) follow the existing pattern for message driver configuration and have sensible defaults for the emulator environment.
367-381: Clarify that this template is for development/testing with the Pub/Sub emulator, not production.The
insecure: trueflag is appropriate for this template since it's designed to work with the Pub/Sub emulator (which runs on localhost:8085 without TLS). However, the template documentation should make clear thatagent-template.ymlis for local development and e2e testing. For production GCP deployments, useagent-template-gcp.ymlinstead, which does not include theinsecureflag and is intended for actual GCP Pub/Sub endpoints.test/setup/deploy_agent.sh (2)
49-51: LGTM! Pub/Sub configuration variables properly initialized.The variables are appropriately scoped and use consistent naming with the template parameters.
55-78: [rewritten review comment]
[classification tag]templates/agent-tls-template.yml (2)
72-85: LGTM! Pub/Sub parameters consistent with agent-template.yml.The parameters are correctly defined and match the non-TLS variant, maintaining consistency across agent templates.
383-397: The template is explicitly designed for Pub/Sub emulator testing and development, not production deployments. Per the templates README, production GCP deployments use the separateagent-template-gcp.ymltemplate. Theinsecure: trueflag is appropriate for this template's intended scope. No changes needed.Likely an incorrect or invalid review comment.
templates/pubsub-template.yml (1)
102-116: The Pub/Sub topic and subscription naming is already consistent across all templates and initialization scripts. No inconsistencies were found.
6a0704e to
8bbc798
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
hack/init-pubsub-emulator.py (1)
35-37: Consider more specific exception handling.The static analysis tool flags catching bare
Exceptionas a code smell. For production code, catching specific exception types (e.g.,exceptions.GoogleAPICallError) would be preferable. However, for an initialization script that logs errors and exits, the current approach is pragmatic and acceptable.If you want to be more specific, consider:
🔎 More specific exception handling
except exceptions.AlreadyExists: print(f" - Topic already exists: {topic_name}") - except Exception as e: + except exceptions.GoogleAPICallError as e: print(f" ✗ Error creating topic {topic_name}: {e}", file=sys.stderr) return FalseThis pattern would apply to lines 62, 113, and 145 as well. However, the current implementation is acceptable for an init script.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (17)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdgo.modhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (9)
- Makefile
- README.md
- test/setup/deploy_agent.sh
- go.mod
- templates/agent-tls-template.yml
- templates/README.md
- templates/agent-template.yml
- templates/pubsub-agent-init-job-template.yml
- templates/service-tls-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: e2e
- GitHub Check: upgrade
- GitHub Check: e2e-grpc-broker
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-with-istio
- GitHub Check: e2e-pubsub
🔇 Additional comments (9)
.gitignore (1)
50-50: ✓ Looks good!The new ignore rule for
secrets/pubsub.configfollows the existing pattern for secret files and correctly prevents Pub/Sub configuration from being accidentally committed to the repository.test/e2e/pkg/cert_rotation_test.go (2)
123-125: LGTM! Correctly addresses the past review comment.The implementation now uses an early return in
AfterAllwhen the tests are skipped, which is the idiomatic pattern for conditional cleanup in Ginkgo lifecycle hooks. TheSkip()call inBeforeAll(line 43) is appropriate for skipping test specs.
34-52: Well-designed skip mechanism for Pub/Sub compatibility.The conditional skip logic properly handles the case where certificate rotation tests are not applicable (e.g., when using Pub/Sub emulator). The implementation:
- Checks for both MQTT and gRPC CA secrets
- Skips only when both are absent
- Maintains proper error handling for unexpected failures
hack/init-pubsub-emulator.py (1)
19-147: Well-structured initialization script.The script is clearly organized with:
- Separate functions for server and agent initialization
- Proper idempotency via
AlreadyExistsexception handling- Clear success/failure reporting with exit codes
- Environment variable configuration with sensible defaults
test/setup/env_setup.sh (1)
193-222: LGTM! Pub/Sub setup follows established patterns.The Pub/Sub emulator setup is well-integrated and follows the same pattern as the existing MQTT and gRPC broker setup:
- Deploy the emulator
- Wait for availability
- Initialize topics/subscriptions via a templated job
- Wait for job completion
- Clean up the job
The implementation is consistent with the rest of the script.
.github/workflows/e2e.yml (1)
119-145: LGTM! E2E workflow job for Pub/Sub is well-configured.The new
e2e-pubsubjob is structured consistently with the existinge2e-grpc-brokerjob and properly configures:
MESSAGE_DRIVER_TYPE: pubsubto enable Pub/Sub testingSERVER_REPLICAS: 2for multi-instance testingENABLE_MAESTRO_TLS: truefor secure communicationcontainer_tool: dockerfor the CI environmenttemplates/pubsub-template.yml (1)
1-116: LGTM! Well-designed Pub/Sub emulator template.The template properly defines the Pub/Sub emulator infrastructure:
- Service exposing port 8085
- Deployment running the emulator with appropriate configuration
- Secret containing server-side configuration (topics to publish, subscriptions to consume)
The use of
emptyDirfor storage (lines 98-99) is appropriate for an emulator in testing scenarios. Based on past review discussions, the:emulatorsimage tag (line 22) is intentionally chosen for stability, and the topic configuration aligns with the broader Pub/Sub integration across the repository.templates/service-template.yml (1)
219-222: The dynamic secret reference and optional flag are correctly implemented.The
maestro-${MESSAGE_DRIVER_TYPE}secret configuration withoptional: trueis appropriate. Themaestro-pubsubsecret is properly created by pubsub-template.yml (lines 103-120), and similarly,maestro-mqttis created by mqtt-template.yml. The optional flag prevents deployment failure when a particular message driver is not deployed.Note: The MESSAGE_DRIVER_TYPE parameter description in service-template.yml should be updated from "mqtt or grpc" to include "pubsub" now that Pub/Sub is a supported driver type.
templates/pubsub-init-job-template.yml (1)
52-110: google-cloud-pubsub 2.34.0 is stable and suitable for this use case.Version 2.34.0 was released December 16, 2025, and is tested for Python 3.7–3.14. It is an actively maintained, stable release. The operations used in this script (creating topics and subscriptions with filters) are fully supported by the Pub/Sub emulator. While the emulator has known limitations with UpdateTopic/UpdateSnapshot RPC and some advanced features, they do not affect the basic topic and subscription creation logic here.
The subscription filter
attributes.ce-originalsource="maestro"and the overall topic/subscription initialization approach are correct.
8bbc798 to
5b0276f
Compare
|
/assign @clyang82 |
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (17)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdgo.modhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (8)
- templates/service-tls-template.yml
- templates/README.md
- templates/pubsub-template.yml
- .gitignore
- templates/pubsub-init-job-template.yml
- Makefile
- README.md
- templates/agent-tls-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: e2e-pubsub
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-grpc-broker
- GitHub Check: e2e-with-istio
- GitHub Check: upgrade
- GitHub Check: e2e
🔇 Additional comments (8)
test/setup/env_setup.sh (1)
193-222: LGTM! Pub/Sub initialization flow is well-structured.The Pub/Sub setup follows a clear pattern: deploy emulator → wait for readiness → initialize topics/subscriptions → cleanup. The hardcoded 5-second sleep at Line 205 is acceptable for test setup, though a readiness probe would be more robust in production.
test/e2e/pkg/cert_rotation_test.go (2)
32-51: LGTM! Skip logic correctly handles Pub/Sub scenario.The certificate rotation tests are appropriately skipped when neither MQTT nor gRPC CA secrets are present, which is the expected state for Pub/Sub emulator testing. The error handling distinguishes between NotFound (expected for Pub/Sub) and actual errors.
123-125: Correctly uses early return in AfterAll.The skip guard properly uses an early return instead of
Skip(), which aligns with Ginkgo best practices for cleanup hooks. This addresses the concern from the previous review..github/workflows/e2e.yml (1)
119-145: LGTM! CI job for Pub/Sub is properly configured.The new e2e-pubsub job follows the established pattern of other e2e jobs and appropriately sets MESSAGE_DRIVER_TYPE to pubsub. Testing with 2 replicas and TLS enabled provides good coverage.
templates/agent-template.yml (2)
72-85: LGTM! Pub/Sub parameters are well-defined.The new Pub/Sub configuration parameters have appropriate defaults for emulator testing and are properly marked as required.
367-381: LGTM! Agent Pub/Sub configuration is correct.The Secret properly configures Pub/Sub topics and subscriptions with consumer-specific subscription names. The
insecure: trueflag is appropriate for emulator usage in testing.test/setup/deploy_agent.sh (1)
49-78: LGTM! Agent Pub/Sub initialization is well-implemented.The agent subscription initialization flow properly creates consumer-specific subscriptions before deploying the agent. The job lifecycle management (create → wait → cleanup) follows best practices.
templates/pubsub-agent-init-job-template.yml (1)
58-120: LGTM! Agent subscription initialization logic is robust.The inline Python script properly creates consumer-specific subscriptions with correct filter syntax (ce-clustername). The use of google-cloud-pubsub 2.34.0 (as addressed in the previous review) ensures up-to-date dependencies. Error handling gracefully manages AlreadyExists while propagating unexpected errors.
5b0276f to
7e3be17
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Fix all issues with AI Agents 🤖
In @test/setup/env_setup.sh:
- Around line 208-215: The test setup uses the OpenShift CLI via the oc process
invocation (seen around the pubsub init block and in deploy_agent.sh for agent
subscriptions) but env_setup.sh never installs or verifies oc like it does for
kind, step, and istioctl; update env_setup.sh to either install the OpenShift
CLI (oc) into the CI image or add a preflight check that verifies oc is on PATH
and exits with a clear error if missing, and mirror the same check/installation
where deploy_agent.sh relies on oc so Pub/Sub initialization won't fail when
enabled.
🧹 Nitpick comments (4)
hack/init-pubsub-emulator.py (1)
35-37: Consider catching more specific exceptions.The broad
Exceptioncatch works for this CLI script, but catchinggoogle.api_core.exceptions.GoogleAPIErrorwould be more precise and avoid masking unexpected programming errors.🔎 Proposed fix
- except Exception as e: + except exceptions.GoogleAPIError as e: print(f" ✗ Error creating topic {topic_name}: {e}", file=sys.stderr) return FalseApply the same pattern at lines 62-64 and 113-115.
test/setup/env_setup.sh (1)
203-206: Consider replacingsleep 5with a readiness probe or retry loop.The fixed sleep is fragile and may cause flaky tests in slower environments. Consider using a retry loop to verify the emulator is responsive before proceeding.
🔎 Proposed improvement
# Initialize topics and subscriptions in the emulator - # Wait a bit for the emulator to be fully ready - sleep 5 + # Wait for the emulator to be fully ready + for i in {1..10}; do + if curl -s "http://${pubsub_host}:${pubsub_port}" >/dev/null 2>&1; then + break + fi + sleep 1 + donetemplates/pubsub-init-job-template.yml (2)
36-48: Consider adding Job failure constraints for robustness.The Job spec doesn't define
backoffLimit(defaults to 6) oractiveDeadlineSeconds. Adding explicit constraints improves observability and prevents the Job from hanging indefinitely if the Pub/Sub emulator is unreachable or slow to respond.🔎 Suggested Job constraints
spec: + backoffLimit: 3 + activeDeadlineSeconds: 300 template: spec: restartPolicy: NeverThis limits retries to 3 attempts and enforces a 5-minute timeout for the entire initialization process.
41-48: Add resource limits for production deployments.The container doesn't specify resource requests or limits. For production environments, defining these prevents resource contention and ensures predictable scheduling.
🔎 Suggested resource constraints
containers: - name: init image: registry.access.redhat.com/ubi9/python-311 imagePullPolicy: IfNotPresent + resources: + requests: + memory: "128Mi" + cpu: "100m" + limits: + memory: "256Mi" + cpu: "500m" env: - name: PUBSUB_EMULATOR_HOST value: "${PUBSUB_HOST}:${PUBSUB_PORT}"These values are conservative for a short-lived init Job that performs pip installation and lightweight API calls.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (17)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdgo.modhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (9)
- templates/service-template.yml
- test/e2e/pkg/cert_rotation_test.go
- .github/workflows/e2e.yml
- README.md
- Makefile
- templates/pubsub-agent-init-job-template.yml
- go.mod
- .gitignore
- templates/pubsub-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-pubsub
- GitHub Check: upgrade
- GitHub Check: e2e
- GitHub Check: e2e-with-istio
- GitHub Check: e2e-grpc-broker
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
🔇 Additional comments (11)
templates/README.md (1)
33-55: LGTM!The documentation clearly describes the Pub/Sub emulator templates, the topics and subscriptions created by the init jobs, and references the GCP-specific templates for production deployments.
hack/init-pubsub-emulator.py (3)
1-17: LGTM!Clear module docstring with environment variable documentation and appropriate imports.
69-117: LGTM!The agent subscription initialization logic is correct, properly creating filtered and unfiltered subscriptions for the specified consumer.
120-147: LGTM!The main function correctly orchestrates initialization with sensible defaults. The
PUBSUB_EMULATOR_HOSTenvironment variable is used directly by the Google Cloud Pub/Sub client library, so reading it for logging purposes is appropriate.templates/service-tls-template.yml (1)
275-278: LGTM!The dynamic secret name
maestro-${MESSAGE_DRIVER_TYPE}correctly enables support for multiple message driver types while maintaining the optional flag for backward compatibility.templates/agent-template.yml (2)
72-86: LGTM!The Pub/Sub parameters are well-defined with sensible defaults that align with the emulator configuration in other scripts.
367-381: LGTM!The Pub/Sub secret configuration is correctly structured with proper GCP Pub/Sub resource paths and appropriate settings for the emulator (TLS disabled).
test/setup/deploy_agent.sh (2)
49-51: LGTM!The Pub/Sub environment variables are consistent with the values defined in
env_setup.sh.
55-78: LGTM!The Pub/Sub agent initialization block correctly creates consumer-specific subscriptions before deploying the agent. The workflow mirrors the server-side initialization pattern.
templates/agent-tls-template.yml (2)
72-86: LGTM!The Pub/Sub parameters are consistent with
agent-template.yml, ensuring both TLS and non-TLS deployments use the same configuration.
383-397: LGTM!The Pub/Sub secret configuration is consistent with
agent-template.yml, maintaining parity between the TLS and non-TLS agent templates.
7e3be17 to
42a9caf
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
Fix all issues with AI Agents 🤖
In @test/setup/env_setup.sh:
- Around line 227-231: After waiting on job/pubsub-init, query its status with
kubectl get job pubsub-init in the ${namespace} and check the JSON fields
.status.succeeded and .status.failed (or .status.conditions) to ensure the init
job actually succeeded; if .status.succeeded is not >0 or .status.failed is >0,
log an error and exit non-zero without running the cleanup delete, otherwise
proceed to delete the job; update the script around the kubectl wait/delete
lines to perform this check and gate the kubectl delete on a successful job
status.
- Line 22: The oc_version variable in env_setup.sh is set to an outdated value;
update the oc_version assignment (oc_version="4.14.0") to the current stable
OpenShift CLI version (4.20.0 or 4.20) so scripts use the newer release; locate
the oc_version declaration in the file and change its value accordingly and run
any associated CI/setup scripts to verify compatibility.
🧹 Nitpick comments (2)
test/setup/env_setup.sh (1)
214-215: Consider using a readiness check instead of a fixed sleep.The 5-second sleep may not be sufficient in all environments. Consider polling the emulator's health endpoint or using
kubectl waitwith a readiness condition.🔎 Alternative approach using kubectl wait or health check
- # Initialize topics and subscriptions in the emulator - # Wait a bit for the emulator to be fully ready - sleep 5 + # Wait for emulator to be fully ready by checking its health + echo "Waiting for Pub/Sub emulator to be ready..." + until kubectl -n ${namespace} exec deploy/maestro-pubsub -- curl -sf http://localhost:8085/v1/projects/${pubsub_project_id}/topics >/dev/null 2>&1; do + echo "Emulator not ready yet, waiting..." + sleep 2 + done + echo "Emulator is ready"templates/pubsub-template.yml (1)
91-93: Optional: Consider removing redundant environment variable.The
PUBSUB_PROJECT_IDenvironment variable duplicates the project ID already passed via the--projectflag in the command. Unless it's consumed by the emulator or debugging scripts, it can be removed.🔎 Cleanup diff
ports: - containerPort: 8085 name: pubsub - env: - - name: PUBSUB_PROJECT_ID - value: ${PUBSUB_PROJECT_ID} volumeMounts: - name: pubsub-persistent-storage mountPath: /data
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (17)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdgo.modhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (6)
- templates/agent-template.yml
- templates/README.md
- .github/workflows/e2e.yml
- templates/pubsub-init-job-template.yml
- go.mod
- templates/pubsub-agent-init-job-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: e2e-pubsub
- GitHub Check: upgrade
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-grpc-broker
- GitHub Check: e2e
- GitHub Check: e2e-with-istio
🔇 Additional comments (25)
test/e2e/pkg/cert_rotation_test.go (2)
32-51: LGTM! Proper skip logic for Pub/Sub environments.The implementation correctly skips certificate rotation tests when neither MQTT nor gRPC CA secrets exist (Pub/Sub doesn't use client certificates). The logic properly:
- Checks both CA secrets independently
- Only skips when both are missing
- Preserves error handling for non-NotFound errors
- Uses Skip() in BeforeAll (the correct lifecycle hook for skipping specs)
123-125: LGTM! Correct cleanup skip pattern.The early return when
skipis true properly prevents restoration logic from running when the test was skipped. This is the idiomatic pattern for conditional cleanup in Ginkgo lifecycle hooks, and addresses the past review comment correctly.README.md (3)
45-79: LGTM! Clear Pub/Sub setup documentation.The documentation effectively presents the dual-path setup (MQTT vs Pub/Sub) with clear step labels (3a/3b). The note about the google-cloud-pubsub Python package prerequisite is helpful for users.
114-121: LGTM! Clear Pub/Sub run instructions.The documentation clearly shows how to run Maestro with Pub/Sub, including the required flags and config file reference.
360-365: LGTM! Clear test environment Pub/Sub support.The documentation effectively shows how to enable Pub/Sub in the KinD test environment using the MESSAGE_DRIVER_TYPE environment variable.
.gitignore (1)
50-50: LGTM! Appropriate ignore rule for Pub/Sub configuration.The addition of
secrets/pubsub.configaligns with the existing pattern for ignoring secret files and prevents accidental commits of Pub/Sub emulator configuration.templates/service-template.yml (1)
219-222: LGTM! Correct dynamic secret reference pattern.The change to use
maestro-${MESSAGE_DRIVER_TYPE}enables multi-broker support (MQTT, gRPC, Pub/Sub). Theoptional: trueflag is the correct approach, as clarified in past review discussions: it allows gRPC deployments (where no secret is created) while mqtt/pubsub deployments will fail fast during initialization if the required secret is missing.templates/service-tls-template.yml (1)
275-278: LGTM! Consistent dynamic secret reference pattern.This change mirrors the approach in service-template.yml, using
maestro-${MESSAGE_DRIVER_TYPE}to enable multi-broker support. The pattern is correct and consistent across both templates.test/setup/env_setup.sh (1)
55-62: LGTM! OpenShift CLI installation follows established pattern.The installation block correctly checks for
oc, downloads from the official mirror, extracts, installs to/usr/local/bin, and cleans up properly. This addresses the previous concern aboutocavailability.templates/pubsub-template.yml (2)
80-87: LGTM! Emulator command is correctly configured.The emulator start command properly binds to
0.0.0.0:8085to accept connections from other pods and passes the project ID via the--projectflag.
107-116: LGTM! Secret structure aligns with agent configuration.The config.yaml properly defines the project ID, endpoint, and topic/subscription mappings that match the agent secret structure introduced in
templates/agent-tls-template.yml. ThedisableTLS: truesetting is appropriate for the emulator.test/setup/deploy_agent.sh (1)
49-51: LGTM! Pub/Sub environment variables are consistent.The exported variables match the values used in
test/setup/env_setup.shand align with the template defaults, ensuring consistency across server and agent setup.templates/agent-tls-template.yml (2)
72-85: LGTM! Pub/Sub parameters follow established conventions.The parameter definitions are consistent with the MQTT parameters above and use appropriate default values that match the emulator configuration.
383-397: LGTM! Agent Pub/Sub secret structure is correct.The secret properly configures consumer-specific subscriptions (
sourceevents-${CONSUMER_NAME},sourcebroadcast-${CONSUMER_NAME}) and shared topics (agentevents,agentbroadcast), enabling proper message routing between server and agent.hack/init-pubsub-emulator.py (4)
1-11: LGTM! Script header and documentation are clear.The module docstring properly explains the script's purpose and documents the expected environment variables.
19-66: LGTM! Server initialization logic is robust.The function correctly creates topics and subscriptions with appropriate filters for CloudEvents attributes. The broad exception handling is appropriate here to catch any Pub/Sub API errors and provide useful feedback.
69-117: LGTM! Agent subscription initialization is correct.The function properly creates consumer-specific subscriptions with appropriate filtering. The broadcast subscription correctly has no filter to receive all broadcast messages.
120-147: LGTM! Main function orchestrates initialization correctly.The function properly sequences server initialization before agent initialization, handles environment variables with sensible defaults, and ensures proper exit codes for CI/CD integration.
Makefile (7)
65-69: LGTM!The Pub/Sub configuration variables follow the established patterns for MQTT broker configuration, with appropriate defaults for local development and testing.
88-89: LGTM!The documentation correctly reflects the addition of Pub/Sub as a message driver option.
139-145: LGTM!The help output additions clearly document the new Pub/Sub targets and follow the established format.
331-333: LGTM!The Pub/Sub parameters are correctly passed to the template generation, following the same pattern as the MQTT configuration.
446-450: LGTM!The Pub/Sub emulator setup follows the established patterns for database and MQTT broker setup. The port mapping and container configuration are correct.
452-455: LGTM!The teardown target correctly stops and removes the Pub/Sub emulator container, consistent with the cleanup patterns for other services.
457-460: Python prerequisites are already documented in the README.The
google-cloud-pubsubrequirement is documented at README line 77: "Requires google-cloud-pubsub Python package (pip3 install google-cloud-pubsub)". The README also shows the correct workflow sequence (make pubsub/setupfollowed bymake pubsub/init). No action needed; the current documentation is sufficient.
42a9caf to
a7fbefa
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
test/setup/env_setup.sh (1)
203-240: Well-structured Pub/Sub initialization with proper error handling.The initialization flow properly validates job success before cleanup (lines 231-236), which addresses the previous review concern. The deployment wait, job creation, and cleanup sequence is well-organized.
However, the arbitrary 5-second sleep (line 215) before initialization is fragile. Consider replacing it with an active readiness check.
🔎 Proposed fix to replace sleep with readiness check
- # Initialize topics and subscriptions in the emulator - # Wait a bit for the emulator to be fully ready - sleep 5 + # Wait for emulator to be ready to accept requests + echo "Waiting for Pub/Sub emulator to be ready..." + for i in {1..30}; do + if kubectl -n ${namespace} exec deploy/maestro-pubsub -- curl -s http://localhost:8085 > /dev/null 2>&1; then + echo "Pub/Sub emulator is ready" + break + fi + if [ $i -eq 30 ]; then + echo "ERROR: Pub/Sub emulator failed to become ready" >&2 + exit 1 + fi + sleep 1 + donehack/init-pubsub-emulator.py (1)
19-66: Consider more specific exception handling.The function properly propagates errors by returning
Falseand logging to stderr, which is then checked inmain(). However, catching bareException(lines 35-37, 62-64) is overly broad. Consider catching specific exceptions likegoogle.api_core.exceptions.GoogleAPIErroror letting unexpected exceptions propagate naturally.🔎 Proposed refactor for more specific exception handling
def init_server_topics_and_subscriptions(project_id: str): """Initialize topics and subscriptions for the Maestro server.""" publisher = pubsub_v1.PublisherClient() subscriber = pubsub_v1.SubscriberClient() # Topics to create topics = ['sourceevents', 'sourcebroadcast', 'agentevents', 'agentbroadcast'] print("Creating topics...") for topic_name in topics: topic_path = publisher.topic_path(project_id, topic_name) try: publisher.create_topic(request={"name": topic_path}) print(f" ✓ Created topic: {topic_name}") except exceptions.AlreadyExists: print(f" - Topic already exists: {topic_name}") - except Exception as e: + except exceptions.GoogleAPIError as e: print(f" ✗ Error creating topic {topic_name}: {e}", file=sys.stderr) return FalseApply similar changes to subscription creation and agent initialization.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (17)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdgo.modhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (7)
- templates/agent-template.yml
- test/e2e/pkg/cert_rotation_test.go
- templates/pubsub-agent-init-job-template.yml
- test/setup/deploy_agent.sh
- templates/README.md
- README.md
- .gitignore
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: e2e
- GitHub Check: e2e-pubsub
- GitHub Check: e2e-grpc-broker
- GitHub Check: upgrade
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-with-istio
🔇 Additional comments (8)
go.mod (1)
60-60: LGTM! Clean dependency resolution after upstream merge.The SDK dependency has been correctly updated to the official upstream version following the merge of PR #182. This removes the temporary fork dependency and uses the proper release channel.
templates/agent-tls-template.yml (2)
72-85: LGTM! Pub/Sub parameters follow consistent patterns.The new Pub/Sub parameters are well-defined with appropriate defaults for the emulator environment. The
required: trueflag with sensible defaults ensures the template can be used out-of-the-box for development while allowing production overrides.
383-397: LGTM! Pub/Sub secret configuration aligns with emulator usage.The
disableTLS: truesetting is appropriate here since the default endpoint (maestro-pubsub.maestro:8085) points to the Pub/Sub emulator, which doesn't support TLS. The topic and subscription naming patterns are consistent with the MQTT configuration structure, and the per-consumer subscription naming (sourceevents-${CONSUMER_NAME}) provides proper isolation..github/workflows/e2e.yml (1)
119-145: LGTM! Pub/Sub e2e job follows established patterns.The new
e2e-pubsubjob is well-structured and mirrors the configuration of existing e2e jobs. The environment variables (2 replicas, TLS enabled, Pub/Sub driver) provide appropriate coverage for testing the Pub/Sub message driver in CI alongside MQTT and gRPC variants.test/setup/env_setup.sh (1)
55-62: LGTM! OpenShift CLI installation properly implemented.The installation follows the same pattern as other tools (kind, step), downloads from the official mirror, and includes proper cleanup. This addresses the previous concern about
ocavailability for Pub/Sub initialization.templates/pubsub-template.yml (1)
1-116: LGTM! Well-structured Pub/Sub emulator template.The template properly defines all necessary resources (Service, Deployment, Secret) with appropriate configuration for the Google Cloud Pub/Sub emulator. The use of
emptyDirfor storage (lines 98-99) is appropriate for an emulator in testing scenarios where persistence across pod restarts is not required.The image tag
:emulators(line 22) has been confirmed as a deliberate choice for stability based on previous discussion.templates/pubsub-init-job-template.yml (1)
52-112: LGTM! Pub/Sub initialization Job properly configured with error handling.The inline Python script correctly:
- Creates all required topics (sourceevents, sourcebroadcast, agentevents, agentbroadcast)
- Creates subscriptions with appropriate filters
- Propagates exceptions on failure (lines 82, 109), ensuring the Job fails when initialization fails
- Uses the updated google-cloud-pubsub version (2.34.0)
The subscription filter
attributes.ce-originalsource="maestro"(line 86) has been confirmed as correct based on previous discussion—it matches events published by the agent via the SDK.Makefile (1)
65-69: LGTM! Pub/Sub support well-integrated following existing patterns.The Makefile changes consistently integrate Pub/Sub support:
- Configuration variables (lines 65-69) follow the same naming convention as MQTT variables
- Help documentation (lines 143-145) clearly describes the new targets
- Template parameter passing (lines 331-333) mirrors the MQTT parameter pattern
- Lifecycle targets (lines 446-460) match the structure of existing
mqtt/setup,mqtt/teardowntargetsThe
pubsub/inittarget (lines 457-460) properly invokes the Python initialization script with the correct environment variables.Also applies to: 143-145, 331-333, 446-460
a7fbefa to
3e7f4ac
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
test/setup/env_setup.sh (1)
203-240: LGTM! Pub/Sub initialization flow is well-structured with proper error handling.The implementation correctly:
- Deploys the Pub/Sub emulator and waits for availability
- Uses
oc processto create the initialization job from the template- Verifies job success by checking
.status.succeededbefore cleanup- Provides clear error messages and logs on failure
- Cleans up the initialization job after successful completion
This addresses the past review concerns about job status verification.
💡 Optional: Replace hardcoded sleep with readiness probe
Line 215 uses
sleep 5to wait for the emulator to be fully ready. Consider replacing this with a proper readiness check:- # Wait a bit for the emulator to be fully ready - sleep 5 + # Wait for the emulator to be fully ready + kubectl -n ${namespace} wait --for=condition=ready pod -l app=maestro-pubsub --timeout=60sThis would be more reliable and potentially faster than a fixed sleep duration.
test/setup/deploy_agent.sh (1)
61-69: Use exported variables instead of hardcoded values for consistency.Lines 65-67 hardcode the Pub/Sub configuration values, but lines 49-51 already export these as variables. If the exported variables are modified, the
oc processcommand won't pick up those changes.♻️ Refactor to use exported variables
# Create agent-specific subscriptions using the template oc process \ --filename="${PWD}/templates/pubsub-agent-init-job-template.yml" \ --local="true" \ --param="NAMESPACE=${agent_namespace}" \ - --param="PUBSUB_HOST=maestro-pubsub.${namespace}" \ - --param="PUBSUB_PORT=8085" \ - --param="PUBSUB_PROJECT_ID=maestro-test" \ + --param="PUBSUB_HOST=${pubsub_host}" \ + --param="PUBSUB_PORT=${pubsub_port}" \ + --param="PUBSUB_PROJECT_ID=${pubsub_project_id}" \ --param="CONSUMER_NAME=${consumer_name}" \ | kubectl apply -f -
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (16)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (6)
- templates/README.md
- README.md
- templates/agent-tls-template.yml
- templates/pubsub-template.yml
- .github/workflows/e2e.yml
- templates/agent-template.yml
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: e2e-pubsub
- GitHub Check: upgrade
- GitHub Check: e2e-with-istio
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e
- GitHub Check: e2e-grpc-broker
🔇 Additional comments (11)
.gitignore (1)
50-50: LGTM! Pub/Sub configuration exclusion is appropriate.The addition of
secrets/pubsub.configfollows the established pattern for secret files and aligns with the Pub/Sub integration introduced in this PR.templates/service-template.yml (1)
219-222: LGTM! Dynamic secret naming correctly supports multiple message drivers.The change from
maestro-mqtttomaestro-${MESSAGE_DRIVER_TYPE}enables support for MQTT, gRPC, and Pub/Sub drivers. Theoptional: trueflag is correctly retained to support the gRPC case where no secret is created (as confirmed in the past review discussion).templates/service-tls-template.yml (1)
275-278: LGTM! Consistent dynamic secret handling across TLS template.The dynamic secret naming change aligns perfectly with
service-template.yml, ensuring consistent message driver support across both TLS and non-TLS deployments.test/e2e/pkg/cert_rotation_test.go (2)
32-52: LGTM! Skip logic correctly handles certificate-less Pub/Sub deployments.The implementation properly:
- Checks for both MQTT and gRPC CA secrets, skipping only when neither exists (Pub/Sub case)
- Distinguishes
NotFounderrors from actual errors that should fail the test- Includes clear comments explaining that Pub/Sub emulator doesn't use client certificates
- Correctly positions the check in
BeforeAllbefore any certificate operations
123-125: LGTM! Early return pattern correctly implemented.The
AfterAllhook now uses an early return instead ofSkip(), which is the correct pattern for conditional cleanup in Ginkgo lifecycle hooks, as confirmed in the past review discussion.test/setup/env_setup.sh (1)
55-62: LGTM! OpenShift CLI installation ensuresocavailability.The installation block properly handles the missing
ocdependency, downloads the correct version, and cleans up temporary files. This addresses the past review concern aboutocavailability for theoc processcommands used in Pub/Sub initialization.templates/pubsub-agent-init-job-template.yml (1)
1-120: LGTM! Agent initialization template is well-structured.The template correctly:
- Uses the latest google-cloud-pubsub version (2.34.0)
- Creates consumer-specific subscriptions with appropriate filters
- Handles idempotent initialization with
AlreadyExistsexception handling- Properly propagates errors for non-idempotent failures
hack/init-pubsub-emulator.py (1)
1-147: LGTM! Initialization script follows appropriate error handling patterns.The script correctly orchestrates Pub/Sub resource initialization with proper error handling. The static analysis warnings about broad
Exceptioncatches (lines 35, 62, 113, 145) are false positives in this context:
- Initialization scripts must handle all potential Google Cloud API exceptions
- Errors are logged before returning
Falseor raisingmain()checks return values and exits with status 1 on failure- Top-level exception handler (line 145) catches unexpected errors
This pattern is appropriate for initialization jobs where any failure should fail fast.
Makefile (2)
65-69: LGTM! Pub/Sub configuration variables follow existing patterns.The new Pub/Sub variables are consistent with the existing MQTT broker configuration structure and properly integrated into the template rendering workflow.
446-461: LGTM! Pub/Sub lifecycle targets follow established conventions.The
pubsub/setup,pubsub/teardown, andpubsub/inittargets mirror the existingdb/andmqtt/patterns, providing a consistent interface for managing Pub/Sub emulator lifecycle.templates/pubsub-init-job-template.yml (1)
1-112: LGTM! Server initialization template is well-structured.The template correctly:
- Uses the latest google-cloud-pubsub version (2.34.0)
- Creates all required topics for both server and agent communication
- Creates server-side subscriptions with appropriate filters
- Properly propagates errors via
raisestatements (lines 82, 109)- Handles idempotent initialization with
AlreadyExistsexception handling
3e7f4ac to
355c8c8
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/setup/deploy_agent.sh (1)
61-69: Use exported variables instead of hardcoded values.The
oc processcommand hardcodes values on lines 65-67 that duplicate the environment variables exported on lines 49-51. This violates the DRY principle and makes the code harder to maintain—if these values need to change, both locations must be updated.♻️ Refactor to use exported variables
# Create agent-specific subscriptions using the template oc process \ --filename="${PWD}/templates/pubsub-agent-init-job-template.yml" \ --local="true" \ --param="NAMESPACE=${agent_namespace}" \ - --param="PUBSUB_HOST=maestro-pubsub.${namespace}" \ - --param="PUBSUB_PORT=8085" \ - --param="PUBSUB_PROJECT_ID=maestro-test" \ + --param="PUBSUB_HOST=${pubsub_host}" \ + --param="PUBSUB_PORT=${pubsub_port}" \ + --param="PUBSUB_PROJECT_ID=${pubsub_project_id}" \ --param="CONSUMER_NAME=${consumer_name}" \ | kubectl apply -f -
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (16)
.github/workflows/e2e.yml.gitignoreMakefileREADME.mdhack/init-pubsub-emulator.pytemplates/README.mdtemplates/agent-template.ymltemplates/agent-tls-template.ymltemplates/pubsub-agent-init-job-template.ymltemplates/pubsub-init-job-template.ymltemplates/pubsub-template.ymltemplates/service-template.ymltemplates/service-tls-template.ymltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/env_setup.sh
🚧 Files skipped from review as they are similar to previous changes (9)
- .github/workflows/e2e.yml
- templates/pubsub-init-job-template.yml
- templates/agent-template.yml
- templates/service-tls-template.yml
- README.md
- templates/agent-tls-template.yml
- templates/pubsub-template.yml
- Makefile
- .gitignore
🧰 Additional context used
🪛 Ruff (0.14.10)
hack/init-pubsub-emulator.py
35-35: Do not catch blind exception: Exception
(BLE001)
62-62: Do not catch blind exception: Exception
(BLE001)
113-113: Do not catch blind exception: Exception
(BLE001)
145-145: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: e2e
- GitHub Check: e2e-pubsub
- GitHub Check: e2e-with-istio
- GitHub Check: e2e-grpc-broker
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: upgrade
🔇 Additional comments (9)
templates/service-template.yml (1)
219-222: LGTM! Dynamic secret naming correctly implemented.The dynamic secret reference (
maestro-${MESSAGE_DRIVER_TYPE}) withoptional: trueis the correct approach to support all broker types (mqtt, pubsub, grpc). For grpc deployments where no secret is created, the optional flag allows the pod to start. For mqtt/pubsub deployments, the secret will be created and validated at application startup.templates/pubsub-agent-init-job-template.yml (1)
1-120: LGTM! Well-structured Pub/Sub agent initialization template.The template is well-designed:
- Required parameters are properly marked
- The Python script handles both filtered (
sourceevents-{consumer_name}) and unfiltered (sourcebroadcast-{consumer_name}) subscriptions- Error handling includes
AlreadyExistsfor idempotency and general exceptions with proper error messages- The CloudEvents filter syntax (
attributes.ce-clustername="{consumer_name}") is correcthack/init-pubsub-emulator.py (1)
1-147: LGTM! Appropriate error handling for initialization script.The script is well-structured with clear separation of server and agent initialization logic. The bare
Exceptioncatches flagged by static analysis (lines 35, 62, 113, 145) are appropriate here:
- They provide defensive error handling for an initialization script where any unexpected error should be caught, logged, and cause the script to fail gracefully
- Each catch block logs the error with context before returning
Falseor exiting- The script properly distinguishes
AlreadyExists(idempotent, informational) from genuine errors (failure)This pattern is correct for a setup/initialization script that should never leave the environment in a partially-configured state.
test/e2e/pkg/cert_rotation_test.go (2)
32-51: LGTM! Correct skip mechanism for certificate rotation tests.The skip logic properly handles scenarios where certificate-based brokers (MQTT/gRPC) are not deployed:
- Checks for both CA secrets before proceeding
- Distinguishes
NotFounderrors (skip case) from actual errors (fail)- Sets the skip flag and calls
Skip()to mark the test as skipped- The comment at lines 36-37 clearly explains why this is needed (Pub/Sub doesn't use client certificates)
123-125: LGTM! Correct cleanup skip pattern.The early return in
AfterAllwhenskip==trueis the correct pattern. This avoids attempting to restore certificates and clean up resources that were never created when the test was skipped.test/setup/env_setup.sh (2)
55-62: LGTM! OpenShift CLI installation added.The
ocCLI installation logic follows the same pattern as the other tools (kind, step, istioctl) and correctly downloads from the official OpenShift mirror for the specified version.
203-240: LGTM! Well-structured Pub/Sub emulator initialization.The Pub/Sub setup flow is comprehensive and robust:
- Deploys the emulator and waits for availability
- Includes a brief sleep for full readiness
- Uses
oc processto render the initialization template with appropriate parameters- Waits for job completion with a reasonable timeout (120s)
- Validates job success before cleanup (lines 231-236), logging errors and exiting non-zero on failure
- Cleans up the init job after successful completion
This addresses all previous review concerns and follows best practices for initialization jobs.
templates/README.md (1)
33-56: LGTM! Clear and comprehensive documentation.The documentation for the Pub/Sub emulator and initialization templates is well-structured and provides helpful details about topics, subscriptions, and filters. The distinction between development/testing templates and production GCP templates is clear.
test/setup/deploy_agent.sh (1)
71-86: Excellent job completion verification and error handling.The implementation properly waits for job completion, verifies success before cleanup, and provides helpful error messages with logs when failures occur. This addresses the concern from the previous review and follows best practices for Kubernetes job orchestration.
Signed-off-by: Morven Cao <lcao@redhat.com>
355c8c8 to
5a4a443
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @charts/maestro-agent/templates/pubsub-init-hook.yaml:
- Around line 56-60: The function extract_topic_name is dead code and should be
removed; delete its entire definition (the def extract_topic_name(...) block)
from the script and run a quick search for any references to extract_topic_name
to confirm none exist and that removing it introduces no usage break; if there
are related unused imports or comments tied specifically to that function,
remove those as well.
🧹 Nitpick comments (4)
charts/maestro-server/templates/pubsub-emulator.yaml (1)
1-62: Consider adding resource limits and health probes for the emulator deployment.The Pub/Sub emulator deployment is missing resource limits and liveness/readiness probes. While this is intended for development/testing, adding these would improve stability:
- Resource limits - prevent the emulator from consuming excessive resources
- Health probes - ensure Helm waits for the emulator to be ready before dependent services start
Note: The YAMLlint syntax error on line 1 is a false positive — Helm conditionals (
{{- if ... }}) are valid Helm template syntax.♻️ Suggested improvements
- name: pubsub-emulator image: {{ .Values.pubsubEmulator.image }} imagePullPolicy: IfNotPresent + resources: + requests: + cpu: 100m + memory: 256Mi + limits: + cpu: 500m + memory: 512Mi command: - gcloud - beta - emulators - pubsub - start - --host-port=0.0.0.0:8085 - --project={{ .Values.pubsubEmulator.projectID }} ports: - containerPort: 8085 name: pubsub protocol: TCP + readinessProbe: + tcpSocket: + port: 8085 + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + tcpSocket: + port: 8085 + initialDelaySeconds: 10 + periodSeconds: 30charts/maestro-agent/templates/pubsub-init-hook.yaml (1)
37-39: Runtime pip install adds latency and external network dependency.Installing
google-cloud-pubsubat runtime during each hook execution introduces latency and depends on external network availability. For a more robust solution, consider using a pre-built image with the library already installed or caching the pip packages.This is acceptable for development/testing but may be fragile if PyPI is unreachable.
charts/maestro-server/templates/pubsub-init-hook.yaml (1)
34-93: Consider pinning the Python image digest for reproducibility.The script logic is well-structured with proper idempotency handling via
AlreadyExistsexceptions. One minor note: runtimepip installworks for test environments but could be fragile if PyPI has issues. Consider building a custom init image with dependencies pre-installed if this becomes flaky.test/e2e/pkg/cert_rotation_test.go (1)
360-362: Consider using a random serial number for certificates.Using
time.Now().Unix()for the serial number could produce collisions if multiple certificates are generated in the same second. While unlikely to cause issues in this test context, best practice is to use a cryptographically random serial:♻️ Suggested improvement
- SerialNumber: big.NewInt(time.Now().Unix()), + SerialNumber: func() *big.Int { + serialNumber, _ := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 128)) + return serialNumber + }(),
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (11)
.github/workflows/e2e.ymlcharts/maestro-agent/templates/pubsub-init-hook.yamlcharts/maestro-agent/templates/secret.yamlcharts/maestro-agent/values.yamlcharts/maestro-server/templates/pubsub-emulator.yamlcharts/maestro-server/templates/pubsub-init-hook.yamlcharts/maestro-server/templates/pubsub.yamlcharts/maestro-server/values.yamltest/e2e/pkg/cert_rotation_test.gotest/setup/deploy_agent.shtest/setup/deploy_server.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/e2e.yml
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/maestro-agent/templates/pubsub-init-hook.yaml
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
charts/maestro-server/templates/pubsub-emulator.yaml
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
charts/maestro-server/templates/pubsub-init-hook.yaml
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: verify
- GitHub Check: e2e
- GitHub Check: e2e-grpc-broker
- GitHub Check: e2e-pubsub
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: e2e-with-istio
- GitHub Check: upgrade
🔇 Additional comments (13)
charts/maestro-server/values.yaml (2)
88-89: LGTM! New Pub/Sub configuration fields are well-structured.The
endpointanddisableTLSfields follow the existing configuration patterns and have sensible defaults (empty endpoint, TLS enabled by default).
99-106: LGTM! Pub/Sub emulator configuration is appropriately structured.The emulator is disabled by default (safe for production), and the configuration includes all necessary fields for deployment. Consider adding inline comments documenting that this is for development/testing only, similar to the PostgreSQL and Mosquitto sections.
charts/maestro-agent/templates/pubsub-init-hook.yaml (1)
79-101: LGTM! Subscription creation with proper error handling.The subscription creation logic correctly handles the
AlreadyExistsexception gracefully while re-raising unexpected errors. The filter expression forsourceeventssubscription properly routes messages based once-clusternameattribute.charts/maestro-agent/values.yaml (1)
61-62: LGTM! Pub/Sub configuration fields align with server-side changes.The new
endpointanddisableTLSfields mirror the server-side configuration incharts/maestro-server/values.yaml, ensuring consistency across the deployment.charts/maestro-agent/templates/secret.yaml (1)
65-68: LGTM! Conditional endpoint and explicit disableTLS handling.The conditional inclusion of
endpoint(only when set) and unconditional emission ofdisableTLSfollows the same pattern as the server-sidepubsub.yamltemplate, ensuring consistency across both deployments.charts/maestro-server/templates/pubsub.yaml (1)
14-17: LGTM! Endpoint and TLS configuration additions are consistent.The conditional
endpointand unconditionaldisableTLSfields mirror the agent-side configuration incharts/maestro-agent/templates/secret.yaml, ensuring consistent Pub/Sub configuration across server and agent deployments.test/setup/deploy_agent.sh (1)
113-127: LGTM! Pub/Sub test configuration is well-aligned with emulator setup.The configuration correctly:
- Uses
maestro-testproject ID matching the emulator default invalues.yaml- Points to the emulator service endpoint on port 8085
- Disables TLS (appropriate for the emulator which doesn't support TLS)
- Uses consumer-specific subscription names (
sourceevents-${consumer_name},sourcebroadcast-${consumer_name}) that align with the init hook's subscription creation logiccharts/maestro-server/templates/pubsub-init-hook.yaml (1)
1-12: LGTM on Helm hook configuration.The Helm hook annotations are correctly configured. The YAMLlint error on line 1 is a false positive—
{{- if ... }}is valid Helm templating syntax that YAMLlint doesn't understand.test/e2e/pkg/cert_rotation_test.go (2)
36-52: Well-designed skip logic for Pub/Sub compatibility.The conditional skip logic correctly identifies when certificate rotation tests are inapplicable (Pub/Sub mode). The error handling properly distinguishes
NotFoundfrom unexpected errors, ensuring tests fail fast on real issues while gracefully skipping for Pub/Sub.
123-127: LGTM on AfterAll guard.The early return when
skipis true correctly prevents cleanup attempts for resources that were never initialized.test/setup/deploy_server.sh (3)
119-128: LGTM on Pub/Sub message broker configuration.The Pub/Sub configuration is well-structured with appropriate values for the test environment. The topic and subscription paths follow the correct format (
projects/{project}/topics/{topic}).
203-210: Configuration consistency verified.The emulator service configuration (
maestro-pubsub:8085) correctly matches the endpoint specified in themessageBroker.pubsubsection. Good alignment.
221-222: Clarify the need for forced rollout restart.The
helm upgrade --install ... --waiton line 218 should already wait for the deployment to be ready. The explicitrollout restartfollowed byrollout statusadds ~30+ seconds to deployment time and forces pod recreation.Is this working around a specific timing issue (e.g., Pub/Sub init hook completing after maestro pods start)? If so, consider adding a comment explaining the rationale. If the init hook is the concern, the hook's negative weight (
-5) should cause it to run before other post-install hooks but after the deployment is created.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
charts/maestro-agent/templates/pubsub-init-hook.yaml (2)
20-32: Consider adding resource limits and backoffLimit.The container lacks resource requests/limits, and the Job doesn't specify
backoffLimit. While defaults may suffice for e2e testing, explicit configuration improves reliability and prevents resource contention.♻️ Suggested improvement
spec: restartPolicy: Never + backoffLimit: 3 containers: - name: init image: registry.access.redhat.com/ubi9/python-311 imagePullPolicy: IfNotPresent + resources: + requests: + cpu: 100m + memory: 128Mi + limits: + cpu: 200m + memory: 256Mi env:
36-42: Runtime pip install introduces network dependency.Installing
google-cloud-pubsubat Job runtime requires network access to PyPI on every run. This is pragmatic for e2e testing, but be aware the Job will fail if PyPI is unreachable. A pre-built image with dependencies baked in would be more robust for production use cases.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
charts/maestro-agent/templates/pubsub-init-hook.yaml
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/maestro-agent/templates/pubsub-init-hook.yaml
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Red Hat Konflux / maestro-on-pull-request
- GitHub Check: Red Hat Konflux / maestro-e2e-on-pull-request
- GitHub Check: e2e-broadcast-subscription
- GitHub Check: upgrade
- GitHub Check: e2e-grpc-broker
- GitHub Check: e2e-pubsub
- GitHub Check: e2e-with-istio
- GitHub Check: e2e
🔇 Additional comments (2)
charts/maestro-agent/templates/pubsub-init-hook.yaml (2)
1-13: LGTM - Helm hook configuration is appropriate.The static analysis error from YAMLlint is a false positive—YAMLlint doesn't understand Helm template syntax (
{{-). The hook annotations correctly configure this Job to run before install/upgrade with appropriate cleanup policies.
50-92: LGTM - Subscription creation logic is correct and idempotent.The Python script properly handles the
AlreadyExistsexception for idempotent behavior during upgrades. The filter syntax forsourceeventssubscription is correct. Note that this assumes topics (sourceevents,sourcebroadcast) are already created by the server-side init hook.
test/setup/deploy_server.sh
Outdated
| --timeout 5m | ||
|
|
||
| kubectl wait deploy/maestro -n $namespace --for condition=Available=True --timeout=300s | ||
| kubectl rollout restart deploy/maestro -n ${namespace} |
There was a problem hiding this comment.
Why do you need to restart maestro?
There was a problem hiding this comment.
when deploying the maestro server via a helm chart, the pubsub-init job is executed as a post-hook. this means maestro server may start before the pubsub-init job completes, causing cloudevents subscriptions to not be started in time.
There was a problem hiding this comment.
ok. let us restart only for pubsub.
aa2cf85 to
2b7d2b3
Compare
Signed-off-by: Morven Cao <lcao@redhat.com>
2b7d2b3 to
5336f1b
Compare
Signed-off-by: Morven Cao <lcao@redhat.com>
depends on: open-cluster-management-io/sdk-go#182