Skip to content

Conversation

@All-less
Copy link
Contributor

@All-less All-less commented Nov 4, 2025

Description

This PR adds an extra step of query validation to the MCP server. In prior implementations, the LLMs often submit invalid queries and only receive the error after the query has been executed. This PR adds a validation step before executing the query so that the LLMs can get error message earlier.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

cd components/clp-mcp-server
uv run pytest -k test_validator

Summary by CodeRabbit

  • New Features

    • KQL queries are pre-validated before submission; invalid queries return clear syntax error messages.
  • Tests

    • Added automated tests verifying valid and invalid query validation behavior.
  • Chores

    • Added parser-generation tooling and runtime support required for query validation.
  • Style

    • Updated linting tasks to exclude generated parser output from checks and formatting.

@All-less All-less requested a review from a team as a code owner November 4, 2025 21:05
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 4, 2025

Warning

Rate limit exceeded

@All-less has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 22 minutes and 40 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a122c8c and 960814f.

⛔ Files ignored due to path filters (4)
  • components/clp-mcp-server/clp_mcp_server/validator/generated/KqlLexer.py is excluded by !**/generated/**
  • components/clp-mcp-server/clp_mcp_server/validator/generated/KqlListener.py is excluded by !**/generated/**
  • components/clp-mcp-server/clp_mcp_server/validator/generated/KqlParser.py is excluded by !**/generated/**
  • components/clp-mcp-server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • components/clp-mcp-server/clp_mcp_server/server/server.py (2 hunks)
  • components/clp-mcp-server/clp_mcp_server/validator/__init__.py (1 hunks)
  • components/clp-mcp-server/clp_mcp_server/validator/validate.py (1 hunks)
  • components/clp-mcp-server/pyproject.toml (1 hunks)
  • components/clp-mcp-server/tests/test_validator.py (1 hunks)
  • taskfile.yaml (1 hunks)
  • taskfiles/codegen.yaml (1 hunks)
  • taskfiles/lint.yaml (1 hunks)

Walkthrough

Adds an ANTLR4-based KQL validator, exposes validate_query from the validator package, integrates pre-execution validation into the MCP server, adds parser-generation tasks and dependencies, and introduces unit tests for the validator.

Changes

Cohort / File(s) Change Summary
Validator Implementation
components/clp-mcp-server/clp_mcp_server/validator/validate.py, components/clp-mcp-server/clp_mcp_server/validator/__init__.py
New ANTLR4-based validator module with KqlErrorListener and `validate_query(query: str) -> tuple[bool, str
Server Integration
components/clp-mcp-server/clp_mcp_server/server/server.py
Imports validate_query and performs pre-validation inside _execute_kql_query(); on parse failure returns {"Error": "Error parsing query: <message>"} and avoids submitting the query.
Dependencies & Config
components/clp-mcp-server/pyproject.toml
Adds antlr4-python3-runtime>=4.13 and clp-py-utils to project dependencies and configures a local editable source for clp-py-utils.
Parser Generation Tasks
taskfiles/codegen.yaml, taskfile.yaml
Adds mcp-server-generate-parser task to generate ANTLR parser into validator/generated and adds G_MCP_COMPONENT_DIR variable referencing the MCP component path.
Linting Task Adjustments
taskfiles/lint.yaml
Excludes generated directory from Python linting/formatting checks and fixes by adding --exclude generated to flags.
Tests
components/clp-mcp-server/tests/test_validator.py
New tests validating validate_query() for a valid query and an invalid query case.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Server as MCP Server
    participant Validator
    participant Parser as ANTLR4 Parser

    Client->>Server: _execute_kql_query(query)
    rect rgb(240,248,255)
    Note over Server,Validator: Validation phase (new)
    Server->>Validator: validate_query(query)
    Validator->>Parser: lex & parse
    alt Syntax valid
        Parser-->>Validator: success
        Validator-->>Server: (True, None)
    else Syntax error
        Parser-->>Validator: error
        Validator-->>Server: (False, error_msg)
        Server-->>Client: {"Error": "Error parsing query: ..."}
    end
    end
    rect rgb(240,255,240)
    Note over Server: Execution phase (existing)
    Server->>Server: submit query to backend
    Server-->>Client: results
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Inspect validator/validate.py for correct ANTLR error-listener wiring and clear error messages.
  • Verify _execute_kql_query() handles validation return values and error formatting consistently.
  • Review taskfiles/codegen.yaml ANTLR invocation, generated-folder pruning, and checksum logic.
  • Confirm pyproject.toml entries and that tests in tests/test_validator.py cover typical success/failure cases.

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main objective of the PR: adding query validation functionality to KQL-related tool calls in the clp-mcp-server component.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f798f1f and 4e54a2c.

⛔ Files ignored due to path filters (4)
  • components/clp-mcp-server/clp_mcp_server/validator/generated/KqlLexer.py is excluded by !**/generated/**
  • components/clp-mcp-server/clp_mcp_server/validator/generated/KqlListener.py is excluded by !**/generated/**
  • components/clp-mcp-server/clp_mcp_server/validator/generated/KqlParser.py is excluded by !**/generated/**
  • components/clp-mcp-server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • components/clp-mcp-server/clp_mcp_server/server/server.py (2 hunks)
  • components/clp-mcp-server/clp_mcp_server/validator/__init__.py (1 hunks)
  • components/clp-mcp-server/clp_mcp_server/validator/validate.py (1 hunks)
  • components/clp-mcp-server/pyproject.toml (1 hunks)
  • components/clp-mcp-server/tests/test_validator.py (1 hunks)
  • taskfile.yaml (1 hunks)
  • taskfiles/codegen.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-06-02T18:22:24.060Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.

Applied to files:

  • components/clp-mcp-server/pyproject.toml
📚 Learning: 2025-10-27T04:05:01.560Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1507
File: taskfiles/deps/lock.yaml:66-76
Timestamp: 2025-10-27T04:05:01.560Z
Learning: In the CLP project, the `uv` CLI tool is documented as a prerequisite in docs/src/dev-docs/building-package.md and is not installed via a task dependency like the rust toolchain. UV is expected to be available in the environment rather than being provisioned by the taskfile system.

Applied to files:

  • components/clp-mcp-server/pyproject.toml
📚 Learning: 2025-05-28T18:33:30.155Z
Learnt from: anlowee
Repo: y-scope/clp PR: 925
File: taskfiles/deps/main.yaml:97-106
Timestamp: 2025-05-28T18:33:30.155Z
Learning: In the taskfiles dependency system (taskfiles/deps/main.yaml), echo commands are used to generate .cmake settings files that are consumed by the main CMake build process. These files set variables like ANTLR_RUNTIME_HEADER to point to dependency locations for use during compilation.

Applied to files:

  • taskfiles/codegen.yaml
🧬 Code graph analysis (4)
components/clp-mcp-server/clp_mcp_server/server/server.py (1)
components/clp-mcp-server/clp_mcp_server/validator/validate.py (1)
  • validate_query (35-55)
components/clp-mcp-server/tests/test_validator.py (1)
components/clp-mcp-server/clp_mcp_server/validator/validate.py (1)
  • validate_query (35-55)
components/clp-mcp-server/clp_mcp_server/validator/__init__.py (1)
components/clp-mcp-server/clp_mcp_server/validator/validate.py (1)
  • validate_query (35-55)
components/clp-mcp-server/clp_mcp_server/validator/validate.py (2)
components/clp-mcp-server/clp_mcp_server/validator/generated/KqlLexer.py (1)
  • KqlLexer (79-125)
components/clp-mcp-server/clp_mcp_server/validator/generated/KqlParser.py (12)
  • KqlParser (41-753)
  • column (186-187)
  • column (448-449)
  • column (514-515)
  • column (600-614)
  • query (103-104)
  • query (189-190)
  • query (211-212)
  • query (231-232)
  • query (253-256)
  • query (273-366)
  • start (124-140)
🪛 Ruff (0.14.3)
components/clp-mcp-server/tests/test_validator.py

1-1: Missing docstring in public module

(D100)


5-5: Missing docstring in public function

(D103)

components/clp-mcp-server/clp_mcp_server/validator/__init__.py

1-1: Missing docstring in public package

(D104)

components/clp-mcp-server/clp_mcp_server/validator/validate.py

13-13: Missing docstring in __init__

(D107)


14-14: Use super() instead of super(__class__, self)

Remove super() parameters

(UP008)


16-16: Function name syntaxError should be lowercase

(N802)


16-16: Too many arguments in function definition (6 > 5)

(PLR0913)


16-16: Missing docstring in public method

(D102)


16-16: Unused method argument: recognizer

(ARG002)


16-16: Argument name offendingSymbol should be lowercase

(N803)


16-16: Unused method argument: offendingSymbol

(ARG002)


16-16: Unused method argument: e

(ARG002)


17-17: Create your own exception

(TRY002)


17-17: Avoid specifying long messages outside the exception class

(TRY003)


17-17: Exception must not use an f-string literal, assign to variable first

Assign to variable; remove f-string literal

(EM102)


19-19: Function name reportAmbiguity should be lowercase

(N802)


19-19: Too many arguments in function definition (7 > 5)

(PLR0913)


19-19: Missing docstring in public method

(D102)


20-20: Argument name startIndex should be lowercase

(N803)


20-20: Argument name stopIndex should be lowercase

(N803)


20-20: Argument name ambigAlts should be lowercase

(N803)


24-24: Function name reportAttemptingFullContext should be lowercase

(N802)


24-24: Too many arguments in function definition (6 > 5)

(PLR0913)


24-24: Missing docstring in public method

(D102)


25-25: Argument name startIndex should be lowercase

(N803)


25-25: Argument name stopIndex should be lowercase

(N803)


25-25: Argument name conflictingAlts should be lowercase

(N803)


29-29: Function name reportContextSensitivity should be lowercase

(N802)


29-29: Too many arguments in function definition (6 > 5)

(PLR0913)


29-29: Missing docstring in public method

(D102)


30-30: Argument name startIndex should be lowercase

(N803)


30-30: Argument name stopIndex should be lowercase

(N803)


53-53: Consider moving this statement to an else block

(TRY300)


54-54: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks (macos-15)
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-22.04)
🔇 Additional comments (7)
taskfile.yaml (1)

22-22: LGTM!

The addition of G_MCP_COMPONENT_DIR follows the existing pattern for component directory variables and is properly integrated with the new parser generation task.

components/clp-mcp-server/pyproject.toml (2)

15-16: LGTM!

The dependencies are correctly added:

  • clp-py-utils is configured as a local editable dependency (see line 20).
  • antlr4-python3-runtime>=4.13 ensures compatibility with the generated parser code.

19-21: LGTM!

The [tool.uv.sources] configuration properly sets up clp-py-utils as an editable local dependency, which is the correct pattern for monorepo component dependencies.

components/clp-mcp-server/clp_mcp_server/server/server.py (2)

11-11: LGTM!

The import correctly references the validator module's public API.


51-54: LGTM!

Pre-validation prevents invalid queries from being submitted to the backend, providing faster feedback to users. The error message format is consistent with existing error handling in this file.

components/clp-mcp-server/clp_mcp_server/validator/validate.py (2)

1-7: LGTM!

The imports are well-organized and correctly reference the ANTLR4 runtime and generated parser components.


35-55: LGTM!

The validate_query function correctly:

  • Initializes the ANTLR lexer and parser
  • Replaces default error handling with the custom KqlErrorListener
  • Parses the query and returns a validation result tuple

The broad exception handling (line 54) is appropriate here—it catches syntax errors from KqlErrorListener and any ANTLR parsing exceptions, then returns a structured error response rather than propagating the exception.

Note: The static analysis warnings about exception handling (TRY300, BLE001) are false positives in this context.

Comment on lines +5 to +15
def test_validate_query():
valid_query = "Node AND ready"
invalid_query = "Node became not ready" # Invalid operator

is_valid, error_msg = validate_query(valid_query)
assert is_valid is True
assert error_msg is None

is_valid, error_msg = validate_query(invalid_query)
assert is_valid is False
assert error_msg is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider expanding test coverage.

The test validates basic valid/invalid query scenarios. Consider adding test cases for:

  • Empty or whitespace-only queries
  • Nested expressions with parentheses
  • Field:value syntax
  • Edge cases with special characters

Example additional tests:

def test_validate_query_edge_cases():
    # Empty query
    is_valid, error_msg = validate_query("")
    assert is_valid is False
    
    # Field:value syntax
    is_valid, error_msg = validate_query("field:value")
    assert is_valid is True
    
    # Nested expression
    is_valid, error_msg = validate_query("(Node OR Pod) AND ready")
    assert is_valid is True
🧰 Tools
🪛 Ruff (0.14.3)

5-5: Missing docstring in public function

(D103)

🤖 Prompt for AI Agents
In components/clp-mcp-server/tests/test_validator.py around lines 5-15, the
current test covers only a basic valid/invalid query; expand coverage by adding
a new test function that asserts expected outcomes for empty and whitespace-only
queries (expect invalid and non-empty error message), field:value syntax (expect
valid), nested expressions with parentheses like "(Node OR Pod) AND ready"
(expect valid), and a couple of edge cases with special characters (e.g.,
symbols or escaped characters — assert whether validate_query should accept or
reject them based on current validator rules). For each case call
validate_query(query) and assert the boolean validity and presence/absence of
error_msg according to the expected result to ensure clear failures in CI.

@All-less All-less force-pushed the add-query-validator branch 13 times, most recently from 945a103 to 24066ef Compare November 4, 2025 22:08
@All-less All-less changed the title feat(clp-mcp-server): Add query validation to KQL tool calls feat(clp-mcp-server): Add query validation to KQL-related tool calls Nov 4, 2025
@All-less All-less force-pushed the add-query-validator branch from 24066ef to 960814f Compare November 4, 2025 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant