
Conversation

@All-less All-less (Contributor) commented Oct 22, 2025

Description

This PR adds integration tests for the MCP server.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

All the tests have been run manually.

Summary by CodeRabbit

  • Tests

    • Added asynchronous integration tests for MCP server workflows (instructions, search, pagination, timestamped searches and log links), marked for manual/conditional execution.
  • Chores

    • Added dev dependencies to support async MCP testing.
    • Updated pytest configuration to document an MCP test marker and exclude those tests by default.

@All-less All-less requested a review from a team as a code owner October 22, 2025 19:38
@coderabbitai coderabbitai bot (Contributor) commented Oct 22, 2025

Walkthrough

Adds an async integration test module exercising a CLP MCP server (instructions, KQL search, pagination, timestamped queries, log links), adds dev deps fastmcp and pytest-asyncio, and introduces an mcp pytest marker excluded by default via .pytest.ini.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| **Test Module**<br>`integration-tests/tests/test_mcp_server.py` | New asynchronous integration test module with tests for `get_instructions`, `search_by_kql`, pagination (`get_nth_page`), timestamped search scenarios (success, invalid timestamps, query failures), and verification of log links. Tests are marked `mcp` for conditional/manual execution. |
| **Dependencies**<br>`integration-tests/pyproject.toml` | Added dev dependencies: `fastmcp>=2.12.4` and `pytest-asyncio>=1.2.0`. |
| **Pytest configuration**<br>`integration-tests/.pytest.ini` | Added an `mcp` marker description and added `-m "not mcp"` to `addopts` to exclude `mcp`-marked tests by default. |
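Based on the summary above, the marker and default exclusion might look like the following in `.pytest.ini` (a sketch only; the actual file may contain additional `addopts` flags and markers that would be preserved):

```ini
[pytest]
# Exclude MCP tests by default; they require a deployed CLP package.
addopts = -m "not mcp"
markers =
    mcp: integration tests that must be run manually against a deployed CLP MCP server
```

To run the excluded tests deliberately, invoking `pytest -m mcp` overrides the default selection.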

Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Test as Test Runner
    participant Client as FastMCP Client
    participant Server as CLP MCP Server

    rect rgb(235,245,255)
      Note over Test,Client: Async call - get_instructions
      Test->>Client: async get_instructions()
      Client->>Server: request get_instructions
      Server-->>Client: instructions (string)
      Client-->>Test: returns instructions
    end

    rect rgb(245,255,235)
      Note over Test,Client: Search and pagination
      Test->>Client: async search_by_kql(query, page=1)
      Client->>Server: search request (KQL, page=1)
      Server-->>Client: results + pagination metadata
      Client-->>Test: returns results
      Test->>Client: async get_nth_page(page=n)
      Client->>Server: fetch page n
      Server-->>Client: page n results + metadata
      Client-->>Test: returns page n
    end

    rect rgb(255,245,235)
      Note over Test,Client: Timestamped search & error cases
      Test->>Client: async search_by_kql_with_timestamp_range(...)
      Client->>Server: search with timestamp filters
      alt valid timestamps
        Server-->>Client: filtered results
      else invalid timestamp
        Server-->>Client: error response
      end
      Client-->>Test: results or error
    end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review test assumptions and hard-coded MCP_SERVER_ENDPOINT.
  • Verify correct use of pytest-asyncio fixtures/markers and async patterns.
  • Confirm .pytest.ini addopts change does not unintentionally exclude other test categories.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "feat(integration-tests): Add integration tests for MCP server" directly and accurately describes the main change in the changeset. All three modified files contribute to a single cohesive objective: adding integration tests for the MCP server, with the new test module containing multiple test cases, the pyproject.toml adding required dependencies, and the pytest.ini configuring test markers for proper test organization. The title is concise, uses the Conventional Commits format appropriately with a clear scope and subject, and provides sufficient specificity for developers scanning project history to quickly understand the primary change. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
      • Create PR with unit tests
      • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50bd0a7 and 904b83a.

⛔ Files ignored due to path filters (1)
  • integration-tests/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • integration-tests/.pytest.ini (2 hunks)
  • integration-tests/pyproject.toml (1 hunks)
  • integration-tests/tests/test_mcp_server.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (5)
integration-tests/pyproject.toml (1)

24-25: LGTM! Dependencies are current and appropriate.

The specified versions match the latest releases (pytest-asyncio 1.2.0 from September 12, 2025 and fastmcp released October 26, 2025), and both packages are actively maintained and secure for async MCP testing.

integration-tests/.pytest.ini (1)

9-9: LGTM! Test exclusion configuration is appropriate.

The default exclusion of MCP-marked tests aligns with the documented requirement for manual execution against a deployed CLP package, preventing accidental execution of tests that require external infrastructure.

Also applies to: 22-22

integration-tests/tests/test_mcp_server.py (3)

1-32: LGTM! Comprehensive documentation for manual testing.

The module docstring provides clear instructions for environment setup, dataset preparation, and test execution, making it easy for developers to run these manual integration tests.


92-113: LGTM! Success case is properly tested.

The test validates the structure of timestamp-ranged search results with appropriate assertions for pagination fields.


115-130: LGTM! Invalid timestamp handling is tested.

The test correctly validates that swapped timestamps (begin > end) produce an error response.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8cebaa4 and a557901.

📒 Files selected for processing (1)
  • integration-tests/tests/test_mcp_server.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
integration-tests/tests/test_mcp_server.py (1)
components/core/src/clp_s/search/QueryRunner.hpp (1)
  • item (298-304)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (2)
integration-tests/tests/test_mcp_server.py (2)

32-38: LGTM!

The test correctly uses an async context manager and validates the response type. The structure is appropriate for an integration test.


84-105: Fix timestamps that are set in the future.

The timestamps "2025-08-27T15:35:30.000Z" and "2025-08-27T15:35:50.000Z" (lines 95-96) are in the future relative to the current date (October 22, 2025). This will likely cause the test to fail or return no results, as logs from August 2025 wouldn't exist yet. Based on the test dataset description (HDFS-11896 case study), these should probably be "2024-08-27" to query historical data.

Apply this diff to fix the timestamps:

                 "search_kql_query_with_timestamp",
                 {
                     "kql_query": query,
-                    "begin_timestamp": "2025-08-27T15:35:30.000Z",
-                    "end_timestamp": "2025-08-27T15:35:50.000Z"
+                    "begin_timestamp": "2024-08-27T15:35:30.000Z",
+                    "end_timestamp": "2024-08-27T15:35:50.000Z"
                 }
             )

Likely an incorrect or invalid review comment.

````python
sbin/compress.sh --timestamp-key ts CLODS/HDFS/HDFS-11896/json/hadoop-dn1__logs__hadoop.jsonl
```
"""
MCP_SERVER_ENDPOINT = "http://0.0.0.0:8000/mcp"
````

🧹 Nitpick | 🔵 Trivial

Consider making the endpoint configurable via environment variable.

While the hardcoded endpoint is acceptable for manual tests, consider using an environment variable with this as a fallback (e.g., os.getenv("MCP_SERVER_ENDPOINT", "http://0.0.0.0:8000/mcp")). This would provide flexibility for different deployment configurations without modifying the code.
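The suggestion above could be sketched as follows (hypothetical; the fallback URL mirrors the current hardcoded constant):

```python
import os

# Read the MCP server endpoint from the environment, falling back to the
# current hardcoded default when the variable is unset.
MCP_SERVER_ENDPOINT = os.getenv("MCP_SERVER_ENDPOINT", "http://0.0.0.0:8000/mcp")
```

With this in place, something like `MCP_SERVER_ENDPOINT=http://clp-host:8000/mcp pytest -m mcp` would target a different deployment without code changes (`clp-host` is a placeholder hostname).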

🤖 Prompt for AI Agents
integration-tests/tests/test_mcp_server.py around line 29: the MCP server
endpoint is hardcoded which reduces flexibility; change it to read from an
environment variable with the current value as default (use
os.getenv("MCP_SERVER_ENDPOINT", "http://0.0.0.0:8000/mcp")), add an import for
os at the top if missing, and ensure any tests or fixtures that reference
MCP_SERVER_ENDPOINT use this variable so the endpoint can be overridden in CI or
local runs.

Comment on lines 60 to 76
```python
@pytest.mark.skip(reason="expected to run manually against a deployed CLP package")
@pytest.mark.asyncio
async def test_get_nth_page() -> None:
    """Tests the get_nth_page tool."""
    async with Client(MCP_SERVER_ENDPOINT) as client:
        await client.call_tool("get_instructions")

        query = "*filter*"
        await client.call_tool("search_kql_query", {"kql_query": query})

        result = await client.call_tool("get_nth_page", {"page_index": 1})
        assert "items" in result.data
        assert isinstance(result.data["items"], list)
        assert "num_total_items" in result.data
        assert "num_total_pages" in result.data
        assert "num_items_per_page" in result.data
        assert "has_next" in result.data
        assert "has_previous" in result.data
```

🧹 Nitpick | 🔵 Trivial

Consider validating pagination behaviour.

Beyond checking field presence, consider asserting that page_index 1 returns different results than page 0, or that has_previous is True when requesting page 1. This would more thoroughly validate the pagination functionality.
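As a sketch of what such pagination assertions could look like, the metadata invariants can be factored into a small helper (hypothetical helper name; assumes zero-based page indices and the field names asserted in the test above):

```python
import math


def check_pagination_metadata(meta: dict, page_index: int) -> None:
    """Assert that pagination metadata fields are internally consistent."""
    per_page = meta["num_items_per_page"]
    total = meta["num_total_items"]
    # Total pages should equal ceil(total items / page size).
    assert meta["num_total_pages"] == math.ceil(total / per_page)
    # Any page after the first has a previous page.
    assert meta["has_previous"] == (page_index > 0)
    # Every page except the last has a next page.
    assert meta["has_next"] == (page_index < meta["num_total_pages"] - 1)
```

A test could then call `check_pagination_metadata(result.data, 1)` after fetching page 1, catching servers that report `has_previous` incorrectly rather than only checking field presence.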

🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 60 to 77, the test
only checks presence/types of pagination fields but doesn't validate actual
pagination behavior; update the test to call get_nth_page for page_index 0 and 1
(or capture the prior page state), assert that the returned "items" lists differ
between pages (or that page 1 is not equal to page 0), and assert has_previous
is True for page_index 1 and has_next is False/True according to num_total_pages
(or at least that has_previous is True). Ensure you handle empty results by
first verifying num_total_pages > 1 (skip assertions if not) to avoid flaky
failures.

Comment on lines 124 to 150
@pytest.mark.asyncio
async def test_query_failure(self) -> None:
"""Tests the search_kql_query_with_timestamp tool with a query that is killed."""
async with Client(MCP_SERVER_ENDPOINT) as client:
await client.call_tool("get_instructions")

query = "*filter*"
result = await client.call_tool(
"search_kql_query_with_timestamp",
{
"kql_query": query,
# This test (ab)uses a design in CLP: when the begin timestamp is set before
# Unix epoch, the query will be killed. It works as of Oct. 22, 2025. The
# future versions of CLP may change this behavior.
"begin_timestamp": "1970-01-01T00:00:00.000Z",
"end_timestamp": "1970-01-02T00:00:00.000Z"
}
)
assert "Error" in result.data

⚠️ Potential issue | 🟠 Major

Fragile test relies on undocumented behaviour.

The test_query_failure test relies on an undocumented implementation detail (queries being killed when begin timestamp is before Unix epoch) that is explicitly marked as unstable behaviour. While the comment acknowledges this, consider either:

  1. Using a more reliable way to trigger a query failure (e.g., invalid KQL syntax, if supported)
  2. Mocking the server response to simulate a failure
  3. Accepting that this test may need updates when CLP behaviour changes and documenting it clearly as a known fragility

The current approach creates technical debt and may cause unexpected test failures in future CLP versions.

🤖 Prompt for AI Agents
integration-tests/tests/test_mcp_server.py around lines 124 to 142: the test
currently relies on an unstable CLP behavior (killing queries with begin
timestamp before Unix epoch); replace this with a reliable failure trigger by
using an explicitly invalid KQL query string (e.g., malformed syntax) or by
mocking the client/server response to simulate an error; update the test to call
search_kql_query_with_timestamp with the invalid query (or a mocked failure
response) and assert that the result contains "Error", and update the inline
comment to explain the chosen reliable approach and why it replaces the fragile
timestamp-based trigger.

Comment on lines 145 to 155
```python
@pytest.mark.skip(reason="expected to run manually against a deployed CLP package")
@pytest.mark.asyncio
async def test_log_link() -> None:
    """Tests that the search results contain log links."""
    async with Client(MCP_SERVER_ENDPOINT) as client:
        await client.call_tool("get_instructions")

        query = "*error*"
        result = await client.call_tool("search_kql_query", {"kql_query": query})

        for item in result.data["items"]:
            assert "link" in item
```

🧹 Nitpick | 🔵 Trivial

Consider asserting that items list is not empty.

If the search query returns no results, the test will pass trivially without validating the link field. Consider adding an assertion like assert len(result.data["items"]) > 0 before the loop to ensure the test actually validates the expected behaviour.

🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 145 to 156, the test
iterates over result.data["items"] and asserts each item contains a "link" but
does not verify the items list is non-empty; add an assertion that
result.data["items"] contains at least one element (e.g., assert
len(result.data["items"]) > 0 or assert result.data["items"]) immediately before
the for-loop so the test fails if the search returns no results and thus
actually validates the link presence.

@All-less All-less force-pushed the mcp-integration-test branch from a557901 to c4f9444 Compare October 22, 2025 19:48
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a557901 and c4f9444.

📒 Files selected for processing (1)
  • integration-tests/tests/test_mcp_server.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (2)
integration-tests/tests/test_mcp_server.py (2)

85-105: Verify test dataset timestamp coverage.

The test uses timestamps from 2025-08-27, which are in the past relative to the current date (October 2025). Ensure that the test dataset (HDFS-11896 logs) actually contains log entries within this timestamp range. If the logs are from a different time period, the test may pass trivially with zero results.


45-46: Document the purpose of get_instructions calls, or remove if unnecessary.

The call at line 46 is part of a consistent pattern—all test functions call get_instructions before their actual test logic but discard the result. If this serves as server initialization or connectivity verification, add a comment explaining the purpose. Otherwise, consider removing these calls to simplify tests.

Comment on lines +34 to +46
```python
async def test_get_instructions() -> None:
    """Tests the get_instructions tool."""
    async with Client(MCP_SERVER_ENDPOINT) as client:
        res = await client.call_tool("get_instructions")
        assert isinstance(res.data, str)
```

🧹 Nitpick | 🔵 Trivial

Consider validating instruction content.

The test only checks that res.data is a string. For improved robustness, consider also asserting that the string is non-empty and optionally contains expected keywords or structure (e.g., mentions of available tools or usage guidance).

Apply this diff to add basic content validation:

```diff
     async with Client(MCP_SERVER_ENDPOINT) as client:
         res = await client.call_tool("get_instructions")
         assert isinstance(res.data, str)
+        assert len(res.data) > 0, "Instructions should not be empty"
```
🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 34 to 38, the test
only asserts that res.data is a string; update it to also assert the string is
non-empty and contains expected guidance keywords to validate content. After
fetching res = await client.call_tool("get_instructions"), add assertions such
as assert res.data.strip() != "" and one or two keyword checks (e.g., assert
"tools" in res.data.lower() or assert "usage" in res.data.lower()) so the test
verifies meaningful instruction content.

@All-less All-less force-pushed the mcp-integration-test branch 4 times, most recently from f00224d to 8caa93a Compare October 22, 2025 20:20
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9056827 and 8caa93a.

⛔ Files ignored due to path filters (1)
  • integration-tests/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • integration-tests/.python-version (1 hunks)
  • integration-tests/pyproject.toml (2 hunks)
  • integration-tests/tests/test_mcp_server.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)

Comment on lines 129 to 150
```python
            query = "*filter*"
            result = await client.call_tool(
                "search_kql_query_with_timestamp",
                {
                    "kql_query": query,
                    # This test (ab)uses a design in CLP: when the begin timestamp is set
                    # before Unix epoch, the query will be killed. It works as of
                    # Oct. 22, 2025. Future versions of CLP may change this behavior.
                    "begin_timestamp": "1970-01-01T00:00:00.000Z",
                    "end_timestamp": "1970-01-02T00:00:00.000Z",
                },
            )
            assert "Error" in result.data
```

⚠️ Potential issue | 🟠 Major

Remove reliance on undocumented timestamp kill-switch.

The failure case currently depends on CLP killing queries whose begin timestamp predates the Unix epoch. That behaviour is explicitly called out as unstable, so this test will start failing (or worse, silently pass) as soon as CLP adjusts its timestamp handling. Please switch to a deterministic trigger—e.g., issue a syntactically invalid KQL query or otherwise provoke a guaranteed server-side error that’s covered by public contract—so the test keeps signalling real regressions instead of depending on internal quirks.

🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 129 to 141, the test
currently relies on CLP killing queries whose begin_timestamp predates the Unix
epoch (an unstable implementation detail); replace this fragile trigger with a
deterministic server-side error such as sending a syntactically invalid KQL
query (e.g., malformed syntax or an obviously invalid token) or another
documented error condition, remove or use valid timestamps instead, call the
tool with the invalid query payload, and keep the existing assertion that an
error is returned (assert "Error" in result.data) so the test fails reliably
when server behavior regresses.

@All-less All-less force-pushed the mcp-integration-test branch 2 times, most recently from 421100d to 50bd0a7 Compare October 27, 2025 18:03
@All-less All-less force-pushed the mcp-integration-test branch from 50bd0a7 to 904b83a Compare October 28, 2025 12:45
@Bill-hbrhbr Bill-hbrhbr self-requested a review November 5, 2025 17:22