-
Notifications
You must be signed in to change notification settings - Fork 203
Add support for upstream/downstream model selectors with graph operators #2056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for upstream/downstream model selectors with graph operators #2056
Conversation
- Implement graph operator detection (+model, model+, +model+) in SelectorFilter - Add _has_graph_operators() helper method to detect + symbols in selectors - Update from_cli_params() to support graph operators in models: filters - Use dbt ls command to resolve upstream/downstream dependencies via SelectorFetcher - Add comprehensive unit tests for graph operator functionality - Support both --select and --filters CLI approaches Fixes #2046 Co-Authored-By: Yosef Arbiv <[email protected]>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
👋 @devin-ai-integration[bot] |
WalkthroughSupport for graph-operator syntax in CLI filters was added. Changes
Sequence DiagramsequenceDiagram
participant CLI as CLI handler
participant Schema as FiltersSchema.from_cli_params
participant Selector as SelectorFilter
participant Fetcher as selector fetcher
CLI->>Schema: from_cli_params(filters, excludes, config, tracking)
activate Schema
loop each filter/exclude
alt filter is model selector
Schema->>Selector: _has_graph_operators(selector)?
activate Selector
Selector-->>Schema: true / false
deactivate Selector
alt true (graph operators)
rect rgb(220,240,220)
Schema->>Fetcher: fetch node_names for selector (if fetcher present)
Fetcher-->>Schema: node_names[]
Schema->>Schema: append node_names to FiltersSchema
end
else false (no graph operators)
rect rgb(240,240,220)
Schema->>Schema: append standard FilterSchema entry
end
end
else other filter type
Schema->>Schema: append standard FilterSchema entry
end
end
Schema-->>CLI: FiltersSchema (includes node_names)
deactivate Schema
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
… excludes properly
- Remove unused MagicMock import from test_filters_schema.py (fixes flake8 F401)
- Change selector from 'model:{value}' to '{value}' to include tests in graph resolution
- Only resolve graph operators for FilterType.IS, not IS_NOT (excludes)
- Add test for exclude case with graph operators to ensure proper handling
- Run black formatting on test files
Co-Authored-By: Yosef Arbiv <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
elementary/monitor/data_monitoring/schema.py (1)
191-292: Graph-operator handling ignores filter type for models, breaking excludes semantics
FiltersSchema.from_cli_paramsnow routes allmodels:filters with graph operators through thenode_namespath wheneverconfigis provided:model_value = ( models_match[0] if len(models_match) == 1 else ",".join(models_match) ) if config and SelectorFilter._has_graph_operators(model_value): selector_filter = SelectorFilter( config, tracking, f"model:{model_value}" ) filter_result = selector_filter.get_filter() if filter_result.node_names: node_names.extend(filter_result.node_names) else: models.append(FilterSchema(values=models_match, type=filter_type))However,
node_namesis just aList[str], andapply()always treats it as a positiveFilterType.ISfilter:FilterSchema( values=self.node_names, type=FilterType.IS ).apply_filter_on_values(filter_fields.node_names)This means:
--filters models:customers+works as intended (positive include via dbt graph resolution).- But
--excludes models:customers+(wherefilter_type == FilterType.IS_NOT) is also turned into a positivenode_namesinclude filter, instead of an exclusion, becausefilter_typeis ignored on the graph-operator path.Previously, excludes on
models:would yieldFilterSchema(..., type=FilterType.IS_NOT)and behave as a true exclusion (even if+was just a literal). The new behavior is likely surprising for users combining--excludeswith graph operators.A minimal, backward-compatible fix is to only route positive
models:filters with graph operators throughnode_namesfor now, and keep all other cases (including excludes) on the existingmodelspath:- if models_match: - model_value = ( - models_match[0] - if len(models_match) == 1 - else ",".join(models_match) - ) - if config and SelectorFilter._has_graph_operators(model_value): - selector_filter = SelectorFilter( - config, tracking, f"model:{model_value}" - ) - filter_result = selector_filter.get_filter() - if filter_result.node_names: - node_names.extend(filter_result.node_names) - else: - models.append(FilterSchema(values=models_match, type=filter_type)) + if models_match: + model_value = ( + models_match[0] + if len(models_match) == 1 + else ",".join(models_match) + ) + # For now, resolve graph operators only for positive filters. + # Excludes with "+" fall back to literal model-name filters, + # preserving previous behavior until node-name excludes are designed. + if ( + config + and filter_type == FilterType.IS + and SelectorFilter._has_graph_operators(model_value) + ): + selector_filter = SelectorFilter( + config, tracking, f"model:{model_value}" + ) + filter_result = selector_filter.get_filter() + if filter_result.node_names: + node_names.extend(filter_result.node_names) + else: + models.append(FilterSchema(values=models_match, type=filter_type))You may also want to add tests in
test_filters_schema.pyfor:
--excludes models:customers+with a config: ensures it still behaves as a literal model exclusion.--filters models:customers+vs--filters models:customersto confirm that only the+case hitsnode_nameswhile the plain case stays onmodels.This will lock in the intended contract for includes vs excludes before the new feature ships.
🧹 Nitpick comments (1)
elementary/monitor/cli.py (1)
144-152: Consider mentioning graph-operator support in filters help textThe updated
--selecthelp forCommand.MONITORclearly marks it as deprecated in favor of--filters, but the--filtershelp (lines 271–272) still describes only simplemodels:<models separated by commas>.Consider amending the
--filtershelp string to hint that dbt-style graph operators are supported inmodels:filters, for example:
models:<models or dbt selectors (supports + graph operators)>This will make the new capability discoverable from
--helpwithout changing behavior.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
elementary/monitor/cli.py(2 hunks)elementary/monitor/data_monitoring/schema.py(4 hunks)elementary/monitor/data_monitoring/selector_filter.py(2 hunks)tests/unit/monitor/data_monitoring/test_filters_schema.py(2 hunks)tests/unit/monitor/data_monitoring/test_selector_filter.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
elementary/monitor/data_monitoring/selector_filter.py (3)
elementary/tracking/tracking_interface.py (1)
set_env(29-30)elementary/monitor/api/selector/selector.py (1)
get_selector_results(15-17)elementary/monitor/data_monitoring/schema.py (2)
FiltersSchema(146-362)FilterSchema(62-126)
elementary/monitor/data_monitoring/schema.py (2)
elementary/monitor/cli.py (1)
monitor(298-407)elementary/monitor/data_monitoring/selector_filter.py (3)
SelectorFilter(22-179)_has_graph_operators(165-166)get_filter(161-162)
elementary/monitor/cli.py (1)
elementary/monitor/data_monitoring/schema.py (2)
FiltersSchema(146-362)from_cli_params(192-292)
tests/unit/monitor/data_monitoring/test_filters_schema.py (2)
elementary/monitor/data_monitoring/schema.py (3)
FiltersSchema(146-362)FilterType(33-37)from_cli_params(192-292)tests/mocks/config_mock.py (1)
MockConfig(8-14)
tests/unit/monitor/data_monitoring/test_selector_filter.py (1)
tests/mocks/config_mock.py (1)
MockConfig(8-14)
🪛 Flake8 (7.3.0)
tests/unit/monitor/data_monitoring/test_filters_schema.py
[error] 1-1: 'unittest.mock.MagicMock' imported but unused
(F401)
🪛 Ruff (0.14.5)
tests/unit/monitor/data_monitoring/test_selector_filter.py
271-271: Unused function argument: dbt_runner_with_models_mock
(ARG001)
289-289: Unused function argument: dbt_runner_with_models_mock
(ARG001)
307-307: Unused function argument: dbt_runner_with_models_mock
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test / test
- GitHub Check: code-quality
🔇 Additional comments (4)
tests/unit/monitor/data_monitoring/test_filters_schema.py (1)
319-380: Graph-operator CLI filter tests look correct and cover key pathsThe three new tests validate:
- Downstream (
models:customers+) and upstream (models:+customers) selectors correctly resolve tonode_nameswhenconfigand tracking are provided.- The no-config case falls back to literal model values, storing
"customers+"undermodelsand leavingnode_namesempty.This matches the intended behavior of
FiltersSchema.from_cli_paramsand provides good regression coverage for the new feature.tests/unit/monitor/data_monitoring/test_selector_filter.py (1)
324-329: _has_graph_operators test coverage is clear and sufficientThe
_has_graph_operatorsbehavior is fully specified here for downstream, upstream, both, and non-graph cases. This gives good confidence that selector routing decisions based on+will remain stable.elementary/monitor/cli.py (1)
369-371: Correctly wiring config and tracking into FiltersSchema.from_cli_paramsPassing
configandanonymous_trackingintoFiltersSchema.from_cli_paramsin themonitorcommand is exactly what’s needed for dbt-based resolution of model graph operators. This keeps the CLI surface unchanged while enabling the new behavior under the hood.elementary/monitor/data_monitoring/selector_filter.py (1)
102-120: Graph-operator handling in SelectorFilter aligns with dbt usageThe extended
model_matchbranch and_has_graph_operatorshelper cleanly distinguish between plainmodel:selectors and those using graph operators:
- When a dbt runner is available and
model_valuecontains"+", node names are resolved viaselector_fetcherand stored inFiltersSchema(node_names=..., selector=selector).- Otherwise, the selector falls back to a simple
modelsfilter as before.Given that
_can_use_fetcheralready routes pure dbt-select strings (likemodel:customers+) through the generic “dbt selector” path, this added logic usefully covers more complex selectors where dbt resolution should still apply at the model level.Also applies to: 164-167
| def test_parse_selector_with_graph_operators_downstream( | ||
| dbt_runner_with_models_mock, anonymous_tracking_mock | ||
| ): | ||
| config = MockConfig("mock_project_dir") | ||
|
|
||
| data_monitoring_filter = SelectorFilter( | ||
| tracking=anonymous_tracking_mock, | ||
| config=config, | ||
| selector="model:customers+", | ||
| ) | ||
|
|
||
| assert data_monitoring_filter.get_filter().node_names == [ | ||
| "node_name_1", | ||
| "node_name_2", | ||
| ] | ||
| assert data_monitoring_filter.get_filter().selector == "model:customers+" | ||
|
|
||
|
|
||
| def test_parse_selector_with_graph_operators_upstream( | ||
| dbt_runner_with_models_mock, anonymous_tracking_mock | ||
| ): | ||
| config = MockConfig("mock_project_dir") | ||
|
|
||
| data_monitoring_filter = SelectorFilter( | ||
| tracking=anonymous_tracking_mock, | ||
| config=config, | ||
| selector="model:+customers", | ||
| ) | ||
|
|
||
| assert data_monitoring_filter.get_filter().node_names == [ | ||
| "node_name_1", | ||
| "node_name_2", | ||
| ] | ||
| assert data_monitoring_filter.get_filter().selector == "model:+customers" | ||
|
|
||
|
|
||
| def test_parse_selector_with_graph_operators_both( | ||
| dbt_runner_with_models_mock, anonymous_tracking_mock | ||
| ): | ||
| config = MockConfig("mock_project_dir") | ||
|
|
||
| data_monitoring_filter = SelectorFilter( | ||
| tracking=anonymous_tracking_mock, | ||
| config=config, | ||
| selector="model:+customers+", | ||
| ) | ||
|
|
||
| assert data_monitoring_filter.get_filter().node_names == [ | ||
| "node_name_1", | ||
| "node_name_2", | ||
| ] | ||
| assert data_monitoring_filter.get_filter().selector == "model:+customers+" | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark dbt_runner_with_models_mock as used to satisfy Ruff (ARG001)
In the three graph-operator tests, dbt_runner_with_models_mock is required for fixture injection/patching but never referenced in the body, so Ruff flags it as an unused argument.
One simple pattern is to reference the argument at the top of each test:
-def test_parse_selector_with_graph_operators_downstream(
- dbt_runner_with_models_mock, anonymous_tracking_mock
-):
+def test_parse_selector_with_graph_operators_downstream(
+ dbt_runner_with_models_mock, anonymous_tracking_mock
+):
+ _ = dbt_runner_with_models_mock # ensure fixture is used for linters
@@
-def test_parse_selector_with_graph_operators_upstream(
- dbt_runner_with_models_mock, anonymous_tracking_mock
-):
+def test_parse_selector_with_graph_operators_upstream(
+ dbt_runner_with_models_mock, anonymous_tracking_mock
+):
+ _ = dbt_runner_with_models_mock # ensure fixture is used for linters
@@
-def test_parse_selector_with_graph_operators_both(
- dbt_runner_with_models_mock, anonymous_tracking_mock
-):
+def test_parse_selector_with_graph_operators_both(
+ dbt_runner_with_models_mock, anonymous_tracking_mock
+):
+ _ = dbt_runner_with_models_mock # ensure fixture is used for lintersAlternatively, you could rename the fixture parameter to _dbt_runner_with_models_mock and adjust the fixture definition name accordingly if you prefer that style.
🧰 Tools
🪛 Ruff (0.14.5)
271-271: Unused function argument: dbt_runner_with_models_mock
(ARG001)
289-289: Unused function argument: dbt_runner_with_models_mock
(ARG001)
307-307: Unused function argument: dbt_runner_with_models_mock
(ARG001)
🤖 Prompt for AI Agents
In tests/unit/monitor/data_monitoring/test_selector_filter.py around lines 270
to 323, the fixture parameter dbt_runner_with_models_mock is injected but not
referenced which triggers Ruff ARG001; fix by either referencing it at the top
of each test (e.g., assign it to a throwaway like _ =
dbt_runner_with_models_mock) or rename the parameter to
_dbt_runner_with_models_mock (and if renaming, update the fixture definition
name accordingly) so Ruff no longer treats it as an unused argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/unit/monitor/data_monitoring/test_filters_schema.py (1)
319-415: Consider adding tests for edge cases with multiple models and combined operators.While the current tests cover the main scenarios well, consider adding tests for:
- Multiple models with graph operators:
models:model1+,model2+- Combined upstream/downstream operators:
models:+model1+These edge cases would help ensure the comma-joining logic in schema.py (lines 235-239) and graph operator detection work correctly in all scenarios.
elementary/monitor/data_monitoring/schema.py (1)
198-198: Circular import workaround is acceptable but consider future refactoring.The inline import of
SelectorFilteravoids a circular dependency between schema.py and selector_filter.py. While this is a known code smell, it's a pragmatic solution for now.Consider these future improvements:
- Move
_has_graph_operatorsto a shared utilities module- Use dependency injection to pass a selector resolver function
- Restructure modules to eliminate the circular dependency
Based on learnings
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
elementary/monitor/data_monitoring/schema.py(4 hunks)tests/unit/monitor/data_monitoring/test_filters_schema.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/unit/monitor/data_monitoring/test_filters_schema.py (2)
tests/unit/monitor/data_monitoring/test_selector_filter.py (1)
anonymous_tracking_mock(248-249)tests/mocks/config_mock.py (1)
MockConfig(8-14)
elementary/monitor/data_monitoring/schema.py (2)
elementary/monitor/cli.py (1)
monitor(298-407)elementary/monitor/data_monitoring/selector_filter.py (3)
SelectorFilter(22-179)_has_graph_operators(165-166)get_filter(161-162)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test / test
🔇 Additional comments (9)
tests/unit/monitor/data_monitoring/test_filters_schema.py (5)
1-7: LGTM! Imports are clean and appropriate.The removal of unused
MagicMockand addition ofMockConfigandMockAnonymousTrackingalign with the new test requirements.
319-345: LGTM! Comprehensive test for downstream graph operators.The test correctly verifies that when graph operators are present with config, the
node_namesfield is populated instead ofmodels, which aligns with the expected behavior.
347-377: LGTM! Good coverage for upstream graph operators.This test complements the downstream test and ensures both operator directions work correctly.
379-393: LGTM! Important fallback test.This test ensures graceful degradation when config is unavailable—graph operators are treated as literal model names rather than failing, which is the correct fallback behavior.
395-415: LGTM! Correct handling of excludes with graph operators.The test correctly verifies that graph operators in excludes are not resolved to
node_names. This is the expected behavior since exclude filters should remain as literal patterns for filtering.elementary/monitor/data_monitoring/schema.py (4)
192-197: LGTM! Signature changes maintain backward compatibility.The addition of optional
configandtrackingparameters allows graph operator resolution while maintaining backward compatibility for existing callers. UsingOptional[Any]for types is acceptable here to avoid circular import issues with the Config type.
214-214: LGTM! Initialization is consistent with other filter lists.
287-294: LGTM! Return statement correctly includes node_names.The
node_namesfield is properly passed to theFiltersSchemaconstructor along with other filter types.
235-250: Verify behavior when multiple models with graph operators are specified.The logic at lines 235-239 joins multiple model values with a comma. Consider the following scenarios:
models:model1,model2+→ becomes"model1,model2+"→ passed to dbt selectormodels:model1+,model2+→ becomes"model1+,model2+"→ passed to dbt selectorPlease verify that dbt's selector syntax correctly handles comma-separated selectors with graph operators. Based on dbt documentation, graph operators typically work on individual selectors, and comma-separated lists might not behave as expected.
Run this script to test the behavior:
Additionally, consider adding a unit test that explicitly covers the
models:model1+,model2+scenario to document the expected behavior.Note: Line 243 accesses the private method
SelectorFilter._has_graph_operators(). While this works, consider making this a public utility method or moving it to a shared module to improve encapsulation.
|
Closing due to inactivity for more than 7 days. Configure here. |
Add upstream/downstream model selector support with graph operators
Summary
Implements support for dbt-style graph operators (
+,+model,model+,+model+) in model filters foredr monitor, allowing users to filter alerts by upstream and downstream model dependencies. This addresses the feature request in #2046.Key changes:
_has_graph_operators()helper method to detect+symbols in model selectorsSelectorFilter._parse_selector()to usedbt lsfor resolving graph operators when detectedFiltersSchema.from_cli_params()to support graph operators in--filters models:syntaxHow it works:
+(e.g.,models:customers+,models:+customers,models:+customers+), the system now usesdbt lsto resolve the full set of upstream/downstream modelsnode_namesfield for filteringReview & Testing Checklist for Human
edr monitor --filters models:your_model+on an actual dbt project to verify graph operators resolve correctlyFiltersSchema.from_cli_params()without the new optional parameters still works+characters, multiple model filters with graph operators, and combinations of graph operators with other filtersdbt lsfails or when project_dir is not configured--select model:customers+(deprecated) and--filters models:customers+(new) to ensure both workTest Plan
edr monitor --filters models:your_model+and verify alerts are sent for the model and all downstream modelsedr monitor --filters models:+your_modeland verify alerts are sent for the model and all upstream modelsedr monitor --filters models:+your_model+and verify alerts are sent for upstream, the model itself, and downstream models+still work as beforeNotes
"+" in selectorcheck to detect graph operators, which could theoretically have false positives with unusual model namesSelectorFilterinside thefrom_cli_params()methodSession info:
Summary by CodeRabbit
New Features
Bug Fixes
Tests