use sets for alert filters #1994

ofek1weiss · 2025-08-24T10:03:47Z

Summary by CodeRabbit

New Features
- Improved filtering: more accurate IS/IS NOT and CONTAINS/NOT CONTAINS matching with normalized, multi-value comparisons for better results.
- Selector now exposes singular fields (tag, owner, model) for quicker filtering.
Refactor
- Streamlined, set-based filter flow for better performance and consistency on large datasets.
- Negative filters now pass when no values are provided, yielding more intuitive behavior.

linear · 2025-08-24T10:03:50Z

ELE-4990 use sets for alert filters

coderabbitai · 2025-08-24T10:03:54Z

Caution

Review failed

The pull request is closed.

Walkthrough

Replaces procedural per-value filtering with set-based, normalized matching in elementary/monitor/data_monitoring/schema.py: removes module-level apply_filter, adds NEGATIVE_OPERATORS, cached normalized values, new FilterSchema APIs (get_matching_values, apply_filter_on_values, apply_filter_on_value), updates FiltersSchema.test_ids typing and selector field mappings.

Changes

Cohort / File(s)	Summary of changes
Filtering schema refactor `elementary/monitor/data_monitoring/schema.py`	Removed module-level `apply_filter(...)`. Added `NEGATIVE_OPERATORS`. Reworked `FilterSchema` by adding `normalize_value(s)`, cached `_normalized_values`, `get_matching_normalized_values`, `get_matching_values`, `apply_filter_on_values`, `apply_filter_on_value`; removed `_apply_filter_type`. Updated `FiltersSchema.test_ids` to `List[FilterSchema[str]]` and changed `to_selector_filter_schema` to populate singular `tag`, `owner`, `model` from plural fields.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant FS as FilterSchema
  participant Cache as cached props

  rect rgba(231,243,255,0.6)
    Caller->>FS: apply_filter_on_values(values)
    FS->>FS: normalize_values(values)
    FS->>Cache: access _normalized_values (cached)
    FS->>FS: get_matching_normalized_values(normalized_input)
    alt operator IS / CONTAINS
      FS-->>Caller: return True if any match
    else operator IS_NOT / NOT_CONTAINS
      opt input empty
        Note over FS: negative operators pass empty input
      end
      FS-->>Caller: return True if no matches
    end
  end
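The flow in the diagram can be sketched in plain Python. This is a minimal illustration, not the code in `schema.py`: the free functions, the `get_matches` helper, and the lower-casing `normalize` are assumptions made for the example, and only the operator names and `NEGATIVE_OPERATORS` come from the walkthrough.

```python
from enum import Enum
from typing import Iterable, Set


class FilterType(Enum):
    IS = "is"
    IS_NOT = "is_not"
    CONTAINS = "contains"
    NOT_CONTAINS = "not_contains"


# Operators that pass when nothing matches.
NEGATIVE_OPERATORS = {FilterType.IS_NOT, FilterType.NOT_CONTAINS}


def normalize(value: object) -> str:
    # Assumed normalization, for illustration only: stringify and lower-case.
    return str(value).lower()


def get_matches(
    filter_type: FilterType, filter_values: Iterable[str], inputs: Iterable[str]
) -> Set[str]:
    wanted = {normalize(v) for v in filter_values}
    given = {normalize(v) for v in inputs}
    if filter_type in (FilterType.IS, FilterType.IS_NOT):
        # Exact matching is a set intersection.
        return given & wanted
    # CONTAINS / NOT_CONTAINS: substring matching against each filter value.
    return {g for g in given if any(w in g for w in wanted)}


def apply_filter_on_values(
    filter_type: FilterType, filter_values: Iterable[str], inputs: Iterable[str]
) -> bool:
    matches = get_matches(filter_type, filter_values, inputs)
    if filter_type in NEGATIVE_OPERATORS:
        # Empty input yields no matches, so negative operators pass.
        return not matches
    return bool(matches)
```

With this sketch, `apply_filter_on_values(FilterType.IS_NOT, ["alice"], [])` passes, while the same filter against `["bob", "alice"]` fails because one input matches.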

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

use sets for alert filters #1994 — similar refactor in the same file: removes module-level apply_filter and introduces set-based FilterSchema APIs.

Suggested reviewers

MikaKerman

Poem

I hop through sets, not loops today,
Cached crumbs keep mismatches at bay.
IS, CONTAINS, or NOT — I peek and see,
Empty fields let negations be.
Tags, owners, models neat in a row — a tidy burrow, ready to go. 🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4f8ea18 and 80a38e6.

📒 Files selected for processing (1)

elementary/monitor/data_monitoring/schema.py (3 hunks)

github-actions · 2025-08-24T10:03:57Z

👋 @ofek1weiss
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in this pull request.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)

elementary/monitor/data_monitoring/schema.py (4)

57-60: Bug: normalized_status is always empty due to incorrect membership check

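If Status is a plain Enum, status is a str while list(Status) yields enum members, so the membership test never succeeds and every status is dropped. FiltersSchema.apply then rejects everything whenever a status filter exists. (If Status also subclasses str, members compare equal to their string values and the check passes.)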

Apply this diff to fix:

-    def normalized_status(self) -> List[Status]:
-        return [Status(status) for status in self.statuses if status in list(Status)]
+    def normalized_status(self) -> List[Status]:
+        normalized: List[Status] = []
+        for s in self.statuses:
+            try:
+                normalized.append(Status(s))
+            except ValueError:
+                # ignore invalid values
+                continue
+        return normalized
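Whether the membership check fails depends on how the enum is declared. The sketch below uses two hypothetical stand-in enums (the real `Status` definition is not shown in this thread) to compare the `in list(...)` check with the try/except conversion from the diff.

```python
from enum import Enum
from typing import List


class Status(Enum):  # hypothetical plain Enum stand-in
    FAIL = "fail"
    WARN = "warn"


class StrStatus(str, Enum):  # same values, but with a str mixin
    FAIL = "fail"
    WARN = "warn"


def normalize_statuses(raw: List[str]) -> List[Status]:
    """Convert by value and skip anything that is not a valid status."""
    normalized: List[Status] = []
    for s in raw:
        try:
            normalized.append(Status(s))
        except ValueError:
            continue
    return normalized


raw = ["fail", "bogus", "warn"]
# A str never compares equal to a plain Enum member, so nothing survives.
plain_hits = [s for s in raw if s in list(Status)]
# With the str mixin, members compare equal to their values.
mixin_hits = [s for s in raw if s in list(StrStatus)]
converted = normalize_statuses(raw)
```

Here `plain_hits` is empty, `mixin_hits` keeps the two valid strings, and `converted` holds the two enum members regardless of how the enum is declared.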

48-56: Mutable default lists in Pydantic model fields

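Pydantic copies field defaults for each instance, so a bare [] is not shared between instances the way a mutable default argument would be. Field(default_factory=list) is still the more explicit form, and we already use it elsewhere; please align these.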

Apply this diff:

-class FilterFields(BaseModel):
-    tags: List[str] = []
-    models: List[str] = []
-    owners: List[str] = []
-    statuses: List[str] = []
-    resource_types: List[ResourceType] = []
-    node_names: List[str] = []
-    test_ids: List[str] = []
+class FilterFields(BaseModel):
+    tags: List[str] = Field(default_factory=list)
+    models: List[str] = Field(default_factory=list)
+    owners: List[str] = Field(default_factory=list)
+    statuses: List[str] = Field(default_factory=list)
+    resource_types: List[ResourceType] = Field(default_factory=list)
+    node_names: List[str] = Field(default_factory=list)
+    test_ids: List[str] = Field(default_factory=list)

155-157: Avoid shared default list for statuses

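Field(default=_get_default_statuses_filter()) calls the helper once, when the class body is evaluated at import time. Pydantic copies that default for each instance, but default_factory defers the call to instantiation and makes the per-instance list explicit.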

Apply this diff:

-    statuses: List[StatusFilterSchema] = Field(default=_get_default_statuses_filter())
+    statuses: List[StatusFilterSchema] = Field(default_factory=_get_default_statuses_filter)

344-359: Mutable default for SelectorFilterSchema.statuses

Same default-handling point here; use Field(default_factory=...) so each instance builds its own list explicitly.

Apply this diff:

 class SelectorFilterSchema(BaseModel):
@@
-    statuses: Optional[List[Status]] = [
-        Status.FAIL,
-        Status.ERROR,
-        Status.RUNTIME_ERROR,
-        Status.WARN,
-    ]
+    statuses: Optional[List[Status]] = Field(
+        default_factory=lambda: [
+            Status.FAIL,
+            Status.ERROR,
+            Status.RUNTIME_ERROR,
+            Status.WARN,
+        ]
+    )

🧹 Nitpick comments (8)

elementary/monitor/data_monitoring/schema.py (8)

67-67: Avoid duplicating operator groups

NEGATIVE_OPERATORS duplicates ALL_OPERATORS. Reuse the existing constant to keep semantics in one place.

Apply this diff:

-NEGATIVE_OPERATORS = [FilterType.IS_NOT, FilterType.NOT_CONTAINS]
+NEGATIVE_OPERATORS = ALL_OPERATORS

79-86: cached_property can get stale if FilterSchema.values mutates at runtime

Both caches depend on self.values. If the model is mutated post-init, caches won’t refresh. If these models are meant to be immutable, make that explicit to avoid subtle bugs.

Option A (preferred): freeze the model to protect the caches (v1-style config via our shim):
 class FilterSchema(BaseModel, Generic[ValueT]):
@@
     class Config:
         # Make sure that serializing Enum return values
         use_enum_values = True
+        allow_mutation = False
Option B: if mutation is required, override __setattr__ to invalidate the cached attributes whenever values changes.
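The staleness concern, and the invalidation idea from Option B, can be shown with plain classes. This is a sketch under assumptions: `CachedFilter` and `InvalidatingFilter` are hypothetical stand-ins rather than the project's `FilterSchema`, and how `cached_property` interacts with Pydantic's own attribute handling is not covered here.

```python
from functools import cached_property
from typing import FrozenSet, List


class CachedFilter:
    """Hypothetical stand-in with a cache derived from self.values."""

    def __init__(self, values: List[str]) -> None:
        self.values = values

    @cached_property
    def normalized(self) -> FrozenSet[str]:
        return frozenset(v.lower() for v in self.values)


class InvalidatingFilter(CachedFilter):
    def __setattr__(self, name: str, value: object) -> None:
        if name == "values":
            # cached_property stores its result in the instance __dict__;
            # dropping the entry forces a recompute on the next access.
            self.__dict__.pop("normalized", None)
        super().__setattr__(name, value)


stale = CachedFilter(["A"])
_ = stale.normalized          # populate the cache
stale.values = ["B"]
stale_result = stale.normalized  # still reflects the old values

fresh = InvalidatingFilter(["A"])
_ = fresh.normalized
fresh.values = ["B"]
fresh_result = fresh.normalized  # recomputed from the new values
```

Note that this only catches reassignment of `values`. In-place mutation such as `values.append(...)` bypasses `__setattr__`, which is one reason freezing the model is the simpler option.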

88-117: Double-check NOT semantics and case-sensitivity changes

For IS_NOT/NOT_CONTAINS you return an empty set if any value violates the filter, effectively requiring all input values to satisfy the negative condition. That’s stricter than “any value is acceptable” semantics. Please confirm this is intentional across all call sites.

IS/IS_NOT perform exact equality with the raw objects. For strings, that is now case-sensitive. If previous behavior was case-insensitive, this is a breaking change.

If case-insensitive equality is desired for strings, one approach is to pre-normalize string-only subsets:
# sketch (not a diff): build a mapping of original->lower and compare on the lowered views
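One way to flesh out that sketch, assuming all values are strings: group the input by its lower-cased form, compare on the lowered view, and return the original spellings. The function names are hypothetical and do not appear in `schema.py`.

```python
from typing import Dict, Iterable, List, Set


def lowered_view(values: Iterable[str]) -> Dict[str, List[str]]:
    """Map each lower-cased value to the original spellings that produced it."""
    view: Dict[str, List[str]] = {}
    for v in values:
        view.setdefault(v.lower(), []).append(v)
    return view


def matching_case_insensitive(
    filter_values: Iterable[str], inputs: Iterable[str]
) -> Set[str]:
    wanted = {v.lower() for v in filter_values}
    view = lowered_view(inputs)
    # Compare on the lowered keys, but hand back the caller's original strings.
    return {
        original
        for lowered, originals in view.items()
        if lowered in wanted
        for original in originals
    }
```

For example, `matching_case_insensitive(["Alice"], ["ALICE", "bob", "alice"])` returns both spellings of "alice" and leaves "bob" out. `str.casefold()` may be a better choice than `lower()` if non-ASCII values are expected.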
119-126: Empty input handling looks correct; broaden type for flexibility

The “negative operator + empty input => True” rule matches the spec in the PR summary. Minor nit: accept Iterable[ValueT] for symmetry with get_matching_values.

Apply this diff:

-    def apply_filter_on_values(self, values: List[ValueT]) -> bool:
+    def apply_filter_on_values(self, values: Iterable[ValueT]) -> bool:

152-158: Specify generics for string-based filters for clarity

test_ids is typed as List[FilterSchema[str]]. Consider aligning tags/owners/models similarly for consistent type hints.

Apply this diff:

-    tags: List[FilterSchema] = Field(default_factory=list)
-    owners: List[FilterSchema] = Field(default_factory=list)
-    models: List[FilterSchema] = Field(default_factory=list)
+    tags: List[FilterSchema[str]] = Field(default_factory=list)
+    owners: List[FilterSchema[str]] = Field(default_factory=list)
+    models: List[FilterSchema[str]] = Field(default_factory=list)

274-279: Trim CLI filter tokens and drop empties

A filter like "tags:a, b" will include " b" with a leading space. Strip tokens and ignore empty entries.

Apply this diff:
     def _match_filter_regex(filter_string: str, regex: Pattern) -> List[str]:
         match = regex.search(filter_string)
         if match:
-            return match.group(1).split(",")
+            return [t.strip() for t in match.group(1).split(",") if t.strip()]
         return []
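A standalone version of the suggested change makes the effect easy to check. The `TAGS_REGEX` pattern below is an assumption for the example; the actual regexes used by the CLI parser are not shown in this thread.

```python
import re
from typing import List

# Hypothetical pattern for illustration; the real one may differ.
TAGS_REGEX = re.compile(r"tags:(.*)")


def match_filter_regex(filter_string: str, regex: "re.Pattern") -> List[str]:
    match = regex.search(filter_string)
    if match:
        # Strip whitespace around each token and drop empty tokens.
        return [t.strip() for t in match.group(1).split(",") if t.strip()]
    return []
```

With this version, `match_filter_regex("tags:a, b", TAGS_REGEX)` yields `["a", "b"]` instead of `["a", " b"]`, and stray commas no longer produce empty entries.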
286-301: Selector singulars should prefer positive filters

When both positive and negative filters exist, picking the first value regardless of type can misrepresent the selection (e.g., tag="foo" when the active filter is is_not foo). Prefer the first IS filter; fall back only if none exist.

Apply this diff:
-        tags = self.tags[0].values[0] if self.tags else None
-        owners = self.owners[0].values[0] if self.owners else None
-        models = self.models[0].values[0] if self.models else None
+        tags = next((f.values[0] for f in self.tags if f.type == FilterType.IS and f.values), None)
+        owners = next((f.values[0] for f in self.owners if f.type == FilterType.IS and f.values), None)
+        models = next((f.values[0] for f in self.models if f.type == FilterType.IS and f.values), None)
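The `next(...)` expression in the diff can be tried in isolation. `SimpleFilter` and the two-member `FilterType` below are hypothetical stand-ins for the project's classes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FilterType(Enum):
    IS = "is"
    IS_NOT = "is_not"


@dataclass
class SimpleFilter:  # hypothetical stand-in for FilterSchema
    type: FilterType
    values: List[str] = field(default_factory=list)


def first_positive_value(filters: List[SimpleFilter]) -> Optional[str]:
    # First value of the first non-empty IS filter, or None if there is none.
    return next(
        (f.values[0] for f in filters if f.type == FilterType.IS and f.values),
        None,
    )
```

As written, the expression returns `None` when only negative filters are present; it does not fall back to a negative filter's value. If a fallback is wanted, it would need a second lookup.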
305-341: End-to-end behavior verification recommended

Given the set-based rewrite, please add/adjust tests to pin the following:

Negative filters with mixed values: e.g., owners is_not ["alice"] against ["bob","alice"] must fail.

Empty input with negative filters returns True.

Case-sensitivity expectations for IS/IS_NOT on strings.

Status normalization once fixed.

I can draft unit tests targeting FiltersSchema.apply and FilterSchema.get_matching_values for these scenarios if helpful.
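As a rough shape for such tests, here is a table-driven `unittest` case. It exercises a hypothetical stand-in function, `is_not_passes`, rather than the real `FiltersSchema.apply`, so the expected values pin the behaviour described above, not the project's actual output.

```python
import unittest
from typing import Iterable


def is_not_passes(filter_values: Iterable[str], inputs: Iterable[str]) -> bool:
    """Hypothetical IS_NOT stand-in: pass only if no input equals a filter value."""
    return not (set(inputs) & set(filter_values))


class NegativeFilterCases(unittest.TestCase):
    def test_is_not(self) -> None:
        cases = [
            # (filter values, input values, expected result)
            (["alice"], ["bob", "alice"], False),  # mixed values must fail
            (["alice"], [], True),                 # empty input passes
            (["alice"], ["Alice"], True),          # pins case-sensitive equality
        ]
        for filter_values, inputs, expected in cases:
            with self.subTest(filter_values=filter_values, inputs=inputs):
                self.assertEqual(is_not_passes(filter_values, inputs), expected)
```

Swapping the stand-in for calls into the real schema classes would turn this into a regression test for the set-based rewrite.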

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d7a4fdc and 4f8ea18.

📒 Files selected for processing (1)

elementary/monitor/data_monitoring/schema.py (3 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: code-quality

🔇 Additional comments (2)

elementary/monitor/data_monitoring/schema.py (2)

4-4: Confirm Python version for functools.cached_property

cached_property is available in Python 3.8+. Please confirm our minimum supported Python version; if we support <3.8 anywhere, we’d need a fallback (e.g., backports.cached_property).
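If older interpreters do need support, a guarded import is one option. The fallback class below is a minimal, non-thread-safe sketch written for this example; it is an assumption, not the `backports.cached_property` package or anything taken from the repository.

```python
try:
    from functools import cached_property  # Python 3.8+
except ImportError:  # older interpreters only

    class cached_property:  # minimal fallback, not thread-safe
        def __init__(self, func):
            self.func = func
            self.attrname = func.__name__

        def __get__(self, instance, owner=None):
            if instance is None:
                return self
            value = self.func(instance)
            # Store on the instance so later lookups bypass this descriptor.
            instance.__dict__[self.attrname] = value
            return value


class Example:
    def __init__(self):
        self.calls = 0

    @cached_property
    def expensive(self):
        self.calls += 1
        return 42


example = Example()
first, second = example.expensive, example.expensive
```

Either branch computes `expensive` once per instance; the second access reads the stored value.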

5-14: LGTM on typing imports

The expanded typing imports are appropriate for the new generic/set-based approach.

use sets for alert filters

4f8ea18

ofek1weiss had a problem deploying to elementary_test_env August 24, 2025 10:03 — with GitHub Actions Error

coderabbitai bot reviewed Aug 24, 2025

extract normalization

6a17c4f

ofek1weiss had a problem deploying to elementary_test_env August 24, 2025 13:01 — with GitHub Actions Error

MikaKerman approved these changes Aug 24, 2025

isort

80a38e6

ofek1weiss temporarily deployed to elementary_test_env August 24, 2025 13:08 — with GitHub Actions Inactive

ofek1weiss merged commit 792cd1b into master Aug 24, 2025
4 of 5 checks passed

ofek1weiss deleted the ele-4990-use-sets-for-alert-filters branch August 24, 2025 13:29
