Add disable_samples column configuration flag #835
- Add is_sampling_disabled_for_column macro to check column config
- Modify query_test_result_rows to skip sampling when disable_samples=true
- Add integration tests for disable_samples functionality
- Test prevents sampling, allows normal sampling, and overrides PII tags

Co-Authored-By: Yosef Arbiv <[email protected]>
Walkthrough
A new integration test module and supporting macro logic were introduced to verify and enforce the "disable_samples" configuration for dbt test columns. The tests check that sample row collection is correctly enabled or disabled based on this setting, including when columns are tagged as PII. Macro logic now conditionally skips sampling queries when the configuration is set.
Sequence Diagram(s)
sequenceDiagram
participant TestRunner as Test Runner
participant DBT as dbt
participant Macro as query_test_result_rows
participant Config as is_sampling_disabled_for_column
TestRunner->>DBT: Run test with disable_samples config
DBT->>Macro: Call query_test_result_rows
Macro->>Config: Check is_sampling_disabled_for_column(flattened_test)
alt Sampling disabled
Config-->>Macro: true
Macro-->>DBT: Skip sample query, log debug
else Sampling enabled
Config-->>Macro: false
Macro->>DBT: Execute sample query
end
DBT-->>TestRunner: Return test results (with/without samples)
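The branch in the diagram above can be sketched in Python as a rough illustration. The real logic lives in a Jinja macro, so the function signature, the two callbacks, and the empty-list return value here are assumptions made for the sketch, not the PR's actual code.

```python
from typing import Callable, Dict, List


def query_test_result_rows(
    flattened_test: Dict,
    is_sampling_disabled: Callable[[Dict], bool],
    run_sample_query: Callable[[], List[Dict]],
) -> List[Dict]:
    """Hypothetical stand-in for the macro's control flow.

    Both callbacks are illustrative: `is_sampling_disabled` plays the role of
    is_sampling_disabled_for_column, `run_sample_query` the sample query.
    """
    if is_sampling_disabled(flattened_test):
        # "Sampling disabled" branch: never issue the sample query.
        return []
    # "Sampling enabled" branch: execute the sample query as usual.
    return run_sample_query()


# Record whether the (fake) sample query was executed.
executed: List[str] = []


def fake_sample_query() -> List[Dict]:
    executed.append("sample query")
    return [{"id": 1}]


rows_when_disabled = query_test_result_rows({}, lambda test: True, fake_sample_query)
rows_when_enabled = query_test_result_rows({}, lambda test: False, fake_sample_query)
```

The point of the sketch is that the check happens before the query is issued, so a disabled column never triggers the sample query at all.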
Add disable_samples column configuration flag
Summary
This PR implements a new disable_samples configuration flag that lets users prevent sample collection for specific columns. When disable_samples: true is set on a column in schema.yml, that column will not be sampled during test failures, regardless of any PII tags or other configuration.
Key Changes:
- Added is_sampling_disabled_for_column() macro to check column-level configuration
- Modified query_test_result_rows() to skip sampling when disable_samples: true is configured
Usage Example:
Review & Testing Checklist for Human
- schema.yml file to ensure the graph node access path parent_model.get('columns', {}).get(test_column_name, {}).get('config', {}) is correct
- disable_samples: true and verify no samples are collected in test_result_rows table
- disable_samples flag
- not_null
Recommended Test Plan:
- disable_samples: true, some without)
Diagram
%%{ init : { "theme" : "default" }}%%
graph TD
    Schema["schema.yml<br/>disable_samples: true"]:::context
    TestMat["macros/edr/materializations/<br/>test/test.sql"]:::major-edit
    QueryRows["query_test_result_rows()"]:::major-edit
    CheckDisabled["is_sampling_disabled_for_column()"]:::major-edit
    GraphNodes["dbt graph.nodes<br/>column config access"]:::context
    IntTests["integration_tests/tests/<br/>test_disable_samples_config.py"]:::major-edit
    TestResults["test_result_rows table<br/>(sample storage)"]:::context

    Schema --> GraphNodes
    TestMat --> QueryRows
    QueryRows --> CheckDisabled
    CheckDisabled --> GraphNodes
    QueryRows --> TestResults
    IntTests --> TestMat

    subgraph Legend
        L1[Major Edit]:::major-edit
        L2[Minor Edit]:::minor-edit
        L3[Context/No Edit]:::context
    end

    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF
Notes
Link to Devin run: https://app.devin.ai/sessions/3838f375f01b48338b9937dbd22776e3
Requested by: Yosef Arbiv (@arbiv)
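The graph node access path named in the review checklist can be exercised in isolation with plain dictionaries. This is a minimal sketch, assuming a parent model node shaped like the hypothetical `parent_model` below; the actual macro is written in Jinja and reads the node from dbt's graph, so the function body and the sample data are illustrative only.

```python
from typing import Any, Dict


def is_sampling_disabled_for_column(
    parent_model: Dict[str, Any], test_column_name: str
) -> bool:
    """Python approximation of the column-config lookup (assumed node shape)."""
    column_config = (
        parent_model.get("columns", {})
        .get(test_column_name, {})
        .get("config", {})
    )
    # A missing column, missing config, or missing flag all mean "not disabled".
    return bool(column_config.get("disable_samples", False))


# Hypothetical node: the column names and tags are made up for illustration.
parent_model = {
    "columns": {
        "email": {"tags": ["pii"], "config": {"disable_samples": True}},
        "user_id": {"config": {}},
        "created_at": {},
    }
}

email_disabled = is_sampling_disabled_for_column(parent_model, "email")
user_id_disabled = is_sampling_disabled_for_column(parent_model, "user_id")
created_at_disabled = is_sampling_disabled_for_column(parent_model, "created_at")
unknown_disabled = is_sampling_disabled_for_column(parent_model, "no_such_column")
```

Chaining `.get(..., {})` keeps the lookup safe when any level of the path is absent, which is why checking the path against a real schema.yml matters: if the flag is stored somewhere other than under the column's config key, the lookup would silently return False.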
Summary by CodeRabbit
New Features
Tests