Shadowing in Cloud - single source #1498
✅ Deploy Preview for redpanda-docs-preview ready!
📝 Walkthrough
This pull request updates documentation to support Cloud API integration for Redpanda's disaster recovery shadowing feature. The changes include switching the Antora playbook to build from a feature branch (DOC-1621-Document-Cloud-Feature-Shadowing-Disaster-Recovery-Enterprise) and expanding four shadowing documentation files with cloud-specific workflows. The additions introduce Data Plane API and Control Plane API examples, authentication headers, conditional blocks gated by
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 2
🧹 Nitpick comments (4)
local-antora-playbook.yml (1)
20-20: Add inline documentation and plan reversion for temporary feature branch.
The branch change is intentional for cloud documentation preview/testing, but lacks context for future maintainers. Additionally, feature branches can be deleted, causing build failures after the cloud-docs PR merges.
Recommendation 1: Add a comment above line 20 documenting the temporary nature and reversion plan:
  - url: https://github.com/redpanda-data/cloud-docs
+   # Temporary: Using feature branch for cloud API shadowing docs preview.
+   # Revert to 'main' after cloud-docs PR #462 merges.
-   branches: main
+   branches: 'DOC-1621-Document-Cloud-Feature-Shadowing-Disaster-Recovery-Enterprise'
Recommendation 2: Verify that the branch name `DOC-1621-Document-Cloud-Feature-Shadowing-Disaster-Recovery-Enterprise` exists in the cloud-docs repository and matches the linked PR #462 feature branch. Consider tracking a follow-up issue to revert this change after the cloud-docs PR is merged to prevent build failures.
modules/manage/pages/disaster-recovery/shadowing/failover-runbook.adoc (2)
17-17: Track TODO verification requirement.
Line 17 contains a TODO noting that command output examples need verification in a test environment. This is important for documentation accuracy, especially for failover procedures where users depend on expected output formats. Ensure this is tracked and completed before release.
Would you like help creating a tracking issue or validation script for verifying command outputs in the test environment?
102-105: Ensure consistent API terminology across sections.
The runbook mixes "Control Plane API" and "Cloud API" terminology. Lines 102-114 use "Control Plane API" for listing, but lines 199-202 retrieve a Data Plane URL without explicitly labeling the first curl as "Control Plane API". Similarly, lines 334-336 label as "DELETE" but the tab header says "Cloud API". Standardize terminology throughout: either "Control Plane API" or "Cloud API" consistently.
Also applies to: 107-110, 145-157, 334-336
modules/manage/pages/disaster-recovery/shadowing/setup.adoc (1)
648-648: Consider adding Cloud API reference at documentation conclusion.
Line 648 references the Admin API v2, which is appropriate for non-cloud environments. Consider adding a Cloud API equivalent reference for cloud-enabled users in a conditional block to maintain documentation completeness.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- local-antora-playbook.yml (1 hunks)
- modules/manage/pages/disaster-recovery/shadowing/failover-runbook.adoc (5 hunks)
- modules/manage/pages/disaster-recovery/shadowing/failover.adoc (2 hunks)
- modules/manage/pages/disaster-recovery/shadowing/monitor.adoc (1 hunks)
- modules/manage/pages/disaster-recovery/shadowing/setup.adoc (14 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-25T21:00:26.626Z
Learnt from: micheleRP
Repo: redpanda-data/docs PR: 1334
File: modules/manage/partials/rbac-dp.adoc:93-98
Timestamp: 2025-08-25T21:00:26.626Z
Learning: In cloud documentation (env-cloud), Security is at the top level navigation, so ACL references should use `security:authorization/rbac/acl.adoc`. In self-managed documentation, Security is nested under Manage, so ACL references use `manage:security/authorization/acl.adoc`. The different xref paths in conditional blocks reflect these different navigation structures.
Applied to files:
modules/manage/pages/disaster-recovery/shadowing/setup.adoc
🔇 Additional comments (10)
modules/manage/pages/disaster-recovery/shadowing/monitor.adoc (2)
39-41: Verify API versioning and clarify shadow-link identifier requirements.
The curl examples use `/v1/shadow-links` endpoints, but the PR objectives note that the Cloud API currently references v1beta2 with v1 expected on Dec 12. Verify whether these examples should reference v1beta2 or if the versioning is environment-specific. Additionally, clarify whether `<shadow-link-id>` and `<shadow-link-name>` are interchangeable or distinct identifiers, as the rpk examples use names while the API examples use IDs.
Also applies to: 68-71
78-81: Improved presentation of status command.
Good improvement wrapping the `rpk shadow status` command in a bash code block for consistency with the tab structure and better readability.
modules/manage/pages/disaster-recovery/shadowing/failover.adoc (2)
16-22: Cloud vs non-cloud messaging is clear and well-structured.
The conditional messaging appropriately distinguishes between cloud environments (Cloud UI, Data Plane API) and non-cloud environments (Console, Admin API), providing clear context for users.
80-91: Clarify Data Plane API path escaping and verify request structure.
Line 80 documents the endpoint path with `\{shadow_link_name}` (escaped braces). Verify this is the correct Antora escaping for rendering the unescaped path in documentation. Additionally, confirm the POST request body structure (`"name"` + optional `"shadowTopicName"`) matches the Data Plane API specification.
modules/manage/pages/disaster-recovery/shadowing/setup.adoc (6)
75-76: Excellent alignment of xref paths for cloud vs non-cloud environments.
The conditional xref paths correctly use `security:authorization/acl.adoc` for cloud environments and `manage:security/authorization/acl.adoc` for non-cloud, matching the navigation structure differences noted in the learnings. This pattern is correctly applied throughout.
Also applies to: 80-81, 100-101
243-271: Clarify Cloud API secret reference syntax and verify configuration.
Line 271 uses `${secrets.<sasl-password-secret-name>}` syntax for referencing secrets created in the source cluster. Verify this is the correct Cloud API syntax for secret interpolation and that it matches the Cloud Control Plane API specification. Additionally, ensure the secret creation requirement (line 243) is clearly discoverable in the referenced documentation.
255-300: Verify POST request body structure and API versioning.
The Control Plane API POST request to `/v1/shadow-links` uses snake_case field names with nested structure. Verify this structure matches the current Cloud Control Plane API specification. The PR objectives note that the API is expected to transition from v1beta2 to v1 on Dec 12; confirm whether these examples should reference v1beta2 or if the versioning is handled automatically.
152-226: Comprehensive filter documentation with clear examples.
The expanded filter section (lines 328-467) provides excellent clarity on pattern types, filter processing rules, and common use cases. The examples for topic, consumer group, and ACL filtering are well-structured and would help users configure shadow links effectively.
511-598: Well-structured networking and bootstrap configuration section.
The networking sections (lines 511-598) provide clear guidance on connection requirements, firewall configuration, bootstrap servers, and security settings. The detailed YAML examples with comments make this actionable for users.
272-273: Fix API reference links.
- Line 313 and similar instances: Change `xref:manage:api/cloud-byoc-controlplane-api.adoc#lro` to `xref:redpanda-cloud:manage:api/cloud-byoc-controlplane-api.adoc#lro`. The API doc is in the redpanda-cloud module, not manage.
- Line 315 and similar instances: Complete the incomplete link `link:/api/doc/cloud-controlplane/v1/operation/operation-[Control Plane API reference]` by adding the appropriate operation ID (e.g., `operation-shadowlinkservice_createshadowlink` or the correct endpoint).
⛔ Skipped due to learnings
Learnt from: micheleRP
Repo: redpanda-data/docs PR: 1349
File: modules/manage/pages/cluster-maintenance/manage-throughput.adoc:0-0
Timestamp: 2025-09-03T16:34:58.323Z
Learning: For Redpanda documentation, use absolute URLs (https://docs.redpanda.com/api/...) rather than relative URLs (/api/...) when linking to API documentation. Relative API links break in Netlify previews because Bump only serves from docs.redpanda.com, causing the relative URLs to be appended to the preview URL where Bump doesn't serve content.
Learnt from: micheleRP
Repo: redpanda-data/docs PR: 1334
File: modules/manage/partials/rbac-dp.adoc:93-98
Timestamp: 2025-08-25T21:00:26.626Z
Learning: In cloud documentation (env-cloud), Security is at the top level navigation, so ACL references should use `security:authorization/rbac/acl.adoc`. In self-managed documentation, Security is nested under Manage, so ACL references use `manage:security/authorization/acl.adoc`. The different xref paths in conditional blocks reflect these different navigation structures.
Learnt from: CR
Repo: redpanda-data/docs PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T09:42:15.235Z
Learning: Applies to docs-data/property-overrides.json : Always use full Antora resource IDs with module prefixes in xref links within property descriptions (e.g., `reference:properties/cluster-properties.adoc`, never `./cluster-properties.adoc`)
Learnt from: Feediver1
Repo: redpanda-data/docs PR: 1153
File: modules/reference/pages/properties/topic-properties.adoc:45-50
Timestamp: 2025-07-16T19:33:20.420Z
Learning: In the Redpanda documentation, topic property cross-references like <<max.compaction.lag.ms>> and <<min.compaction.lag.ms>> require corresponding property definition sections with anchors like [[maxcompactionlagms]] and [[mincompactionlagms]] to prevent broken links.
Learnt from: CR
Repo: redpanda-data/docs PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T09:42:15.235Z
Learning: Applies to docs-data/property-overrides.json : Normalize all xref links in property-overrides.json to use full Antora resource IDs after updating
Learnt from: CR
Repo: redpanda-data/docs PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-25T09:42:15.235Z
Learning: Applies to docs-data/property-overrides.json : Prefix self-managed-only links with `self-managed-only:` in related_topics to handle documentation pages that only exist in self-managed deployments
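The learning above about absolute versus relative API URLs can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: `find_relative_api_links` is a hypothetical helper name, and it assumes relative API links are written in the `link:/api/...` form quoted in the review comment.

```shell
# Hypothetical helper: list AsciiDoc lines that link to API docs with a relative URL.
# Assumes relative API links are written as "link:/api/..." (the form quoted in the review).
find_relative_api_links() {
  # $1: directory to scan (defaults to the current directory)
  # -r recurses, -n prints line numbers, --include limits the scan to .adoc files.
  grep -rn --include='*.adoc' 'link:/api/' "${1:-.}"
}
```

Called as `find_relative_api_links modules`, it prints each matching file, line number, and line. `grep` exits non-zero when nothing matches, so the same call can serve as a pre-merge check.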
This is not true. What you can't do is write to a shadow topic.
In here let's also include the Dataplane API endpoints https://github.com/redpanda-data/console/blob/master/proto/redpanda/api/dataplane/v1alpha3/shadowlink.proto#L189-L231
include::manage:disaster-recovery/shadowing/monitor.adoc[tag=rpk-tab-health-checks]
--
Cloud API::
@c-julin It seems that rpk shadow status provides lag info grouped by partition whereas $DATAPLANE_API_URL/v1/shadowlinks/<shadow-link-name>/topic just has a total_lag field and that looks like it's aggregated for all shadow topics. Is that correct? What should the API commands look like if I want to do the equivalent of these rpk commands?
# Check all shadow links are active
rpk shadow list | grep -v "ACTIVE" || echo "All shadow links healthy"
# Monitor lag for critical topics
rpk shadow status <shadow-link-name> | grep -E "LAG|Lag"
Total lag is the total lag per topic. We recently removed the per-partition lag info to mirror the Admin API. The only way to calculate lag per partition is to compute it from the source's HWM minus the shadow's HWM.
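The per-partition calculation described in this reply can be sketched as follows. This is only an illustration, not an rpk or API feature: `partition_lag` is a hypothetical helper, and it assumes the high watermarks have already been collected into two plain-text files with one `<partition> <high-watermark>` pair per line. How those values are collected is left open.

```shell
# Hypothetical helper (not part of rpk or any Redpanda API).
# Assumes each input file holds one "<partition> <high-watermark>" pair per line.
partition_lag() {
  # $1: source cluster high watermarks, $2: shadow cluster high watermarks
  awk '
    FILENAME == ARGV[1] { src[$1] = $2; next }  # first file: remember source HWM per partition
    ($1 in src) {                               # second file: shadow HWM for a known partition
      lag = src[$1] - $2
      if (lag < 0) lag = 0                      # clamp when the two samples were taken at different times
      print $1, lag
    }
  ' "$1" "$2"
}
```

For example, with source watermarks `0 120` and `1 95` and shadow watermarks `0 100` and `1 95`, the helper prints `0 20` and `1 0`.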
* `<destination-redpanda-cluster-id>`: ID of the shadow (destination) cluster.
* `<shadow-link-name>`: Unique name for this shadow link, for example, `production-dr`.
* `<source-broker-1>:<port>`, `<source-broker-2>:<port> ...`: Source cluster brokers to connect to, for example, `prod-kafka-1.example.com:9092`, `prod-kafka-2.example.com:9092`.
* `<sasl-username>`: SASL/SCRAM username, for example, `shadow-replication-user`.
Should this say SASL/SCRAM username from the source cluster...?
micheleRP
left a comment
see my comments, but lgtm!
Data plane /shadowlinks/<shadow-link-name>/topic shows topic and lag status
Co-authored-by: Rogger Vasquez <[email protected]>






Description
Related PR adds Disaster Recovery / Shadowing docs in Cloud: redpanda-data/cloud-docs#462
Docs for UI were merged from this PR #1511
This pull request introduces extensive improvements to the disaster recovery documentation for Redpanda's shadowing feature, focusing on making procedures clearer and providing parallel instructions for both self-hosted (rpk CLI) and cloud environments (Cloud/Data Plane/Control Plane APIs). The changes add tabbed code blocks and environment-based conditionals to all major operational guides, ensuring users can easily follow the correct steps for their deployment type. Additionally, terminology and command references have been updated for accuracy and clarity.
Cloud vs. Self-hosted Operations Documentation:
`failover-runbook.adoc`, `failover.adoc`, `monitor.adoc`) to provide side-by-side instructions for `rpk` CLI and Cloud API/Control Plane API operations, including listing, describing, monitoring, failover, and deletion of shadow links and topics.
Failover and Monitoring Enhancements:
API Reference and Command Accuracy:
`curl` commands and links to official API documentation, ensuring users have correct endpoints and usage patterns.
Terminology and UX Improvements:
Configuration and Branch Updates:
`local-antora-playbook.yml` to point to the correct documentation branch for cloud disaster recovery features.
These changes significantly enhance the usability and clarity of the disaster recovery documentation, making it easier for both cloud and self-hosted users to manage shadowing and respond to cluster failures.
Resolves https://redpandadata.atlassian.net/browse/
Review deadline:
Page previews
Manage > Disaster Recovery >
Configure Shadowing
Monitor Shadowing
Failover
Failover Runbook
Checks