integrated sap hana cdc in the data warehouse guide #3390
base: main
Conversation
Walkthrough
A new real-time CDC integration section was added to the data warehouses guide, introducing SAP HANA change-data-capture with rationale, prerequisites, step-by-step setup, architecture overview, operational guidance, and troubleshooting. The guide also now lists multiple ingestion patterns and points to the CDC section for real-time needs.
Sequence Diagram(s)
sequenceDiagram
participant SAP as SAP HANA
participant Triggers as CDC Triggers
participant CDC_Tables as CDC Tables
participant Moose as Moose Workflow
participant Temporal as Temporal
participant CH as ClickHouse
participant Monitor as Monitoring
SAP->>Triggers: Write change events (INSERT/UPDATE/DELETE)
Triggers->>CDC_Tables: Persist change records
CDC_Tables->>Moose: Poll/stream CDC records
Moose->>Temporal: Enqueue processing tasks
Temporal->>Moose: Execute workflow steps
Moose->>CH: Apply changes (upsert/prune)
Moose->>Monitor: Emit metrics/logs
CH->>Monitor: Expose ingestion metrics
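The flow in the diagram can be sketched end to end in a few lines. The following is a minimal illustration under stated assumptions, not the Moose connector: SQLite stands in for SAP HANA, a plain dict stands in for ClickHouse, and the names `orders`, `orders_cdc`, `create_source_with_triggers`, and `sync_once` are hypothetical. Temporal scheduling and monitoring are left out, and pruning here removes already-applied rows by sequence number, whereas the guide under review mentions a 7-day default.

```python
import sqlite3


def create_source_with_triggers(conn):
    """Create a source table, a change table, and triggers that record changes."""
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);

        -- Change table: one row per INSERT/UPDATE/DELETE on orders.
        CREATE TABLE orders_cdc (
            seq    INTEGER PRIMARY KEY AUTOINCREMENT,
            op     TEXT NOT NULL,
            id     INTEGER NOT NULL,
            status TEXT
        );

        CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
            INSERT INTO orders_cdc (op, id, status)
            VALUES ('INSERT', NEW.id, NEW.status);
        END;

        CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
            INSERT INTO orders_cdc (op, id, status)
            VALUES ('UPDATE', NEW.id, NEW.status);
        END;

        CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
            INSERT INTO orders_cdc (op, id, status)
            VALUES ('DELETE', OLD.id, NULL);
        END;
        """
    )


def sync_once(conn, target, last_seq):
    """Poll change rows newer than last_seq, apply them to target, then prune."""
    rows = conn.execute(
        "SELECT seq, op, id, status FROM orders_cdc WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, op, row_id, status in rows:
        if op == "DELETE":
            target.pop(row_id, None)  # remove the row from the replica
        else:
            target[row_id] = status   # INSERT and UPDATE are both upserts
        last_seq = seq
    # Prune change rows that have been applied so the change table stays small.
    conn.execute("DELETE FROM orders_cdc WHERE seq <= ?", (last_seq,))
    conn.commit()
    return last_seq


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_source_with_triggers(conn)
    conn.execute("INSERT INTO orders VALUES (1, 'new')")
    conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
    replica = {}
    print(sync_once(conn, replica, 0), replica)
```

In a real pipeline `sync_once` would run on an interval (the review context mentions 60 seconds) and the last applied sequence number would be persisted between runs.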
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/framework-docs-v2/content/guides/data-warehouses.mdx (1)
1-5: Add required frontmatter (title/description).
This guide is missing the mandatory frontmatter for `content/guides/*.mdx`. As per coding guidelines, guides must include frontmatter with `title` and `description`.
✅ Suggested fix
+---
+title: "Building Your First Data Warehouse"
+description: "Build a ClickHouse-based analytics warehouse with batch loading and optional SAP HANA CDC."
+---
+
 # Building Your First Data Warehouse
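A check for this rule is easy to sketch. The helper below is a hypothetical illustration, not the repository's actual validation: `missing_frontmatter_fields` is an invented name, and it reads simple `key: value` lines rather than parsing full YAML.

```python
import re

# Frontmatter is the block between two '---' lines at the very top of the file.
FRONTMATTER_RE = re.compile(r"\A---\r?\n(.*?)\r?\n---\r?\n", re.DOTALL)


def missing_frontmatter_fields(mdx_text, required=("title", "description")):
    """Return the required frontmatter keys that are absent or empty."""
    match = FRONTMATTER_RE.match(mdx_text)
    if match is None:
        return list(required)  # no frontmatter block at all
    present = set()
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            present.add(key.strip())
    return [name for name in required if name not in present]
```

Called on the guide before the suggested fix it would report both fields as missing; after the fix it would return an empty list.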
🤖 Fix all issues with AI agents
In `@apps/framework-docs-v2/content/guides/data-warehouses.mdx`:
- Around line 1148-1156: Replace the unhyphenated compound modifiers in the
text: change "60 second interval" to "60-second interval" and "7 day default" to
"7-day default" so compound adjectives are hyphenated correctly; locate the
phrases "60 second interval" and "7 day default" in the Ongoing Sync / Resource
Usage bullet list and update them accordingly.
- Around line 992-1009: The Python code block containing class Ekko and its
__moose_config__ OlapTable declaration needs the `@test` directive so the
snippet is validated; change the fenced code block opening from "```python" to
"```python @test" (the block that starts with "from moose_lib import OlapTable,
Key" and defines class Ekko and __moose_config__) so the documentation test
harness will run this snippet.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
apps/framework-docs-v2/content/guides/data-warehouses.mdx
🧰 Additional context used
📓 Path-based instructions (2)
apps/framework-docs-v2/content/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
apps/framework-docs-v2/content/**/*.mdx:
- Use `{{ include "shared/path.mdx" }}` directives to reuse content fragments, which are processed via `processIncludes()` during build
- Validate code snippets in documentation with the `@test` directive for TypeScript and Python code blocks
- TypeScript code snippets in documentation should be validated for syntax with brace matching; Python snippets should be validated for indentation
Files:
apps/framework-docs-v2/content/guides/data-warehouses.mdx
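The two snippet checks named in these instructions (indentation for Python, brace matching for TypeScript) can be approximated as follows. This is a rough sketch with hypothetical helper names, not the docs test harness itself; in particular the brace matcher is naive and does not skip string literals or comments.

```python
import ast
import textwrap


def python_snippet_error(source):
    """Return a short error string if the snippet fails to parse, else None."""
    try:
        ast.parse(textwrap.dedent(source))
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"{type(exc).__name__}: {exc.msg}"
    return None


def braces_balanced(source):
    """Check that (), [] and {} are properly nested in a TypeScript snippet."""
    closers = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in source:
        if ch in "([{":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack
```

`ast.parse` only checks syntax, so a snippet that imports a package which is not installed still passes this check.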
apps/framework-docs-v2/content/guides/**/*.mdx
📄 CodeRabbit inference engine (apps/framework-docs-v2/CLAUDE.md)
Guide MDX files in `content/guides/` must include frontmatter with title and description fields
Files:
apps/framework-docs-v2/content/guides/data-warehouses.mdx
🪛 LanguageTool
apps/framework-docs-v2/content/guides/data-warehouses.mdx
[grammar] ~1149-~1149: Use a hyphen to join words.
Context: ...Ongoing Sync**: - Sub-minute latency (60 second interval) - Scales to thousands o...
(QB_NEW_EN_HYPHEN)
[grammar] ~1155-~1155: Use a hyphen to join words.
Context: ...size - Pruning keeps CDC tables small (7 day default) - Redis memory: ~1MB per 10...
(QB_NEW_EN_HYPHEN)
The SAP HANA CDC implementation uses database triggers and doesn't require Redis for state tracking.
Updated prerequisites to reflect actual CREATE TABLE and CREATE TRIGGER permissions needed, rather than misleading "CDC permissions" reference.
Co-Authored-By: Claude Sonnet 4.5
Updated documentation to use the new, more intuitive command flags:
- --generate-models (instead of --recreate-moose-models)
- --create-database-triggers (instead of --init-cdc)
- --init-all (new quick start option to run both steps)
Added Quick Start tip showing how to run both model generation and trigger creation in a single command.
Co-Authored-By: Claude Sonnet 4.5
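To make the relationship between the three flags concrete, here is a small `argparse` sketch. It is an assumption-laden illustration only: the program name `cdc-setup`, the function names, and the step order are hypothetical, and the connector's real CLI may be structured differently. Only the flag names come from the commit message.

```python
import argparse


def build_parser():
    # 'cdc-setup' is a placeholder program name, not the real command.
    parser = argparse.ArgumentParser(prog="cdc-setup")
    parser.add_argument("--generate-models", action="store_true",
                        help="generate data models from the source tables")
    parser.add_argument("--create-database-triggers", action="store_true",
                        help="create CDC tables and triggers in the source database")
    parser.add_argument("--init-all", action="store_true",
                        help="run both steps above")
    return parser


def planned_steps(argv):
    """Return the setup steps implied by the flags, in an assumed order."""
    args = build_parser().parse_args(argv)
    steps = []
    if args.generate_models or args.init_all:
        steps.append("generate_models")
    if args.create_database_triggers or args.init_all:
        steps.append("create_database_triggers")
    return steps
```

With this shape, `--init-all` is a shorthand for passing both of the other flags.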
aa2602c to 000e2de
This pull request significantly expands the data ingestion documentation for Moose by introducing comprehensive guidance on real-time Change Data Capture (CDC) from SAP HANA, alongside detailed explanations of CDC concepts, architectural patterns, and practical setup steps. The changes help users understand when and how to use CDC versus traditional batch loading, and provide hands-on instructions for implementing real-time data pipelines.
Major documentation improvements:
New CDC ingestion guide and setup instructions:
CDC concepts and architecture:
Decision guidance and performance:
These changes make the documentation much more actionable for users needing real-time data synchronization and deepen the conceptual understanding of CDC patterns.
Note
Expands the data ingestion docs with a full real-time CDC path from SAP HANA to ClickHouse and adds CDC concepts to the architecture section.
Adds "Option 3: Real-Time CDC from SAP HANA" with prerequisites, installation via the 514 Labs registry, env config, model generation, trigger setup, pipeline run/monitoring, verification, troubleshooting, and performance notes.
Written by Cursor Bugbot for commit 5f2586e.