feat(integration-tests): Add integration tests for MCP server #1469
Conversation
Walkthrough
Adds an async integration test module exercising a CLP MCP server (instructions, KQL search, pagination, timestamped queries, log links), and adds dev dependencies.

Sequence Diagram(s)
sequenceDiagram
autonumber
participant Test as Test Runner
participant Client as FastMCP Client
participant Server as CLP MCP Server
rect rgb(235,245,255)
Note over Test,Client: Async call - get_instructions
Test->>Client: async get_instructions()
Client->>Server: request get_instructions
Server-->>Client: instructions (string)
Client-->>Test: returns instructions
end
rect rgb(245,255,235)
Note over Test,Client: Search and pagination
Test->>Client: async search_by_kql(query, page=1)
Client->>Server: search request (KQL, page=1)
Server-->>Client: results + pagination metadata
Client-->>Test: returns results
Test->>Client: async get_nth_page(page=n)
Client->>Server: fetch page n
Server-->>Client: page n results + metadata
Client-->>Test: returns page n
end
rect rgb(255,245,235)
Note over Test,Client: Timestamped search & error cases
Test->>Client: async search_by_kql_with_timestamp_range(...)
Client->>Server: search with timestamp filters
alt valid timestamps
Server-->>Client: filtered results
else invalid timestamp
Server-->>Client: error response
end
Client-->>Test: results or error
end
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 6
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
integration-tests/tests/test_mcp_server.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
integration-tests/tests/test_mcp_server.py (1)
components/core/src/clp_s/search/QueryRunner.hpp (1)
item (298-304)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (2)
integration-tests/tests/test_mcp_server.py (2)
32-38: LGTM!

The test correctly uses an async context manager and validates the response type. The structure is appropriate for an integration test.
84-105: Fix timestamps that are set in the future.

The timestamps "2025-08-27T15:35:30.000Z" and "2025-08-27T15:35:50.000Z" (lines 95-96) are in the future relative to the current date (October 22, 2025). This will likely cause the test to fail or return no results, as logs from August 2025 wouldn't exist yet. Based on the test dataset description (HDFS-11896 case study), these should probably be "2024-08-27" to query historical data.

Apply this diff to fix the timestamps:

        "search_kql_query_with_timestamp",
        {
            "kql_query": query,
    -       "begin_timestamp": "2025-08-27T15:35:30.000Z",
    -       "end_timestamp": "2025-08-27T15:35:50.000Z"
    +       "begin_timestamp": "2024-08-27T15:35:30.000Z",
    +       "end_timestamp": "2024-08-27T15:35:50.000Z"
        }
    )

Likely an incorrect or invalid review comment.
    sbin/compress.sh --timestamp-key ts CLODS/HDFS/HDFS-11896/json/hadoop-dn1__logs__hadoop.jsonl
    ```
    """
    MCP_SERVER_ENDPOINT = "http://0.0.0.0:8000/mcp"
🧹 Nitpick | 🔵 Trivial
Consider making the endpoint configurable via environment variable.
While the hardcoded endpoint is acceptable for manual tests, consider using an environment variable with this as a fallback (e.g., os.getenv("MCP_SERVER_ENDPOINT", "http://0.0.0.0:8000/mcp")). This would provide flexibility for different deployment configurations without modifying the code.
🤖 Prompt for AI Agents
integration-tests/tests/test_mcp_server.py around line 29: the MCP server
endpoint is hardcoded which reduces flexibility; change it to read from an
environment variable with the current value as default (use
os.getenv("MCP_SERVER_ENDPOINT", "http://0.0.0.0:8000/mcp")), add an import for
os at the top if missing, and ensure any tests or fixtures that reference
MCP_SERVER_ENDPOINT use this variable so the endpoint can be overridden in CI or
local runs.
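The suggestion above can be sketched as follows. The environment variable name comes from the review comment, and the fallback URL matches the value currently hardcoded in the PR; treat the exact naming as an assumption, not part of the change under review:

```python
import os

# Read the MCP server endpoint from the environment, falling back to the
# value currently hardcoded in the test module (per the review suggestion).
MCP_SERVER_ENDPOINT = os.getenv("MCP_SERVER_ENDPOINT", "http://0.0.0.0:8000/mcp")

print(MCP_SERVER_ENDPOINT)
```

With this in place, CI or a local run can point the tests at a different deployment by exporting `MCP_SERVER_ENDPOINT` before invoking pytest, with no code change.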
    @pytest.mark.skip(reason="expected to run manually against a deployed CLP package")
    @pytest.mark.asyncio
    async def test_get_nth_page() -> None:
        """Tests the get_nth_page tool."""
        async with Client(MCP_SERVER_ENDPOINT) as client:
            await client.call_tool("get_instructions")

            query = "*filter*"
            await client.call_tool("search_kql_query", {"kql_query": query})

            result = await client.call_tool("get_nth_page", {"page_index": 1})
            assert "items" in result.data
            assert isinstance(result.data["items"], list)
            assert "num_total_items" in result.data
            assert "num_total_pages" in result.data
            assert "num_items_per_page" in result.data
            assert "has_next" in result.data
            assert "has_previous" in result.data
🧹 Nitpick | 🔵 Trivial
Consider validating pagination behaviour.
Beyond checking field presence, consider asserting that page_index 1 returns different results than page 0, or that has_previous is True when requesting page 1. This would more thoroughly validate the pagination functionality.
🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 60 to 77, the test
only checks presence/types of pagination fields but doesn't validate actual
pagination behavior; update the test to call get_nth_page for page_index 0 and 1
(or capture the prior page state), assert that the returned "items" lists differ
between pages (or that page 1 is not equal to page 0), and assert has_previous
is True for page_index 1 and has_next is False/True according to num_total_pages
(or at least that has_previous is True). Ensure you handle empty results by
first verifying num_total_pages > 1 (skip assertions if not) to avoid flaky
failures.
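One way to tighten the assertions without depending on live data is to factor the cross-field pagination invariants into a helper. The field names mirror those asserted in the test above; the invariants themselves (zero-based page indexing, `has_next`/`has_previous` semantics) are assumptions about the server's contract, not documented behaviour:

```python
def check_pagination_metadata(page: dict, page_index: int) -> None:
    """Validate cross-field pagination invariants (sketch; field names are
    taken from the test's assertions, the invariants are assumed)."""
    assert isinstance(page["items"], list)
    # A page never carries more items than the page size allows.
    assert len(page["items"]) <= page["num_items_per_page"]
    # has_previous should hold exactly when we are past the first page.
    assert page["has_previous"] == (page_index > 0)
    # has_next should hold exactly when pages remain after this one.
    assert page["has_next"] == (page_index < page["num_total_pages"] - 1)
```

In the test, this could replace the plain field-presence checks once `num_total_pages > 1` has been verified, which also addresses the review's note about avoiding flaky failures on small result sets.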
    @pytest.mark.asyncio
    async def test_query_failure(self) -> None:
        """Tests the search_kql_query_with_timestamp tool with a query that is killed."""
        async with Client(MCP_SERVER_ENDPOINT) as client:
            await client.call_tool("get_instructions")

            query = "*filter*"
            result = await client.call_tool(
                "search_kql_query_with_timestamp",
                {
                    "kql_query": query,
                    # This test (ab)uses a design in CLP: when the begin timestamp is
                    # set before the Unix epoch, the query will be killed. It works as
                    # of Oct. 22, 2025. Future versions of CLP may change this behavior.
                    "begin_timestamp": "1970-01-01T00:00:00.000Z",
                    "end_timestamp": "1970-01-02T00:00:00.000Z"
                }
            )
            assert "Error" in result.data
Fragile test relies on undocumented behaviour.
The test_query_failure test relies on an undocumented implementation detail (queries being killed when begin timestamp is before Unix epoch) that is explicitly marked as unstable behaviour. While the comment acknowledges this, consider either:
- Using a more reliable way to trigger a query failure (e.g., invalid KQL syntax, if supported)
- Mocking the server response to simulate a failure
- Accepting that this test may need updates when CLP behaviour changes and documenting it clearly as a known fragility
The current approach creates technical debt and may cause unexpected test failures in future CLP versions.
🤖 Prompt for AI Agents
integration-tests/tests/test_mcp_server.py around lines 124 to 142: the test
currently relies on an unstable CLP behavior (killing queries with begin
timestamp before Unix epoch); replace this with a reliable failure trigger by
using an explicitly invalid KQL query string (e.g., malformed syntax) or by
mocking the client/server response to simulate an error; update the test to call
search_kql_query_with_timestamp with the invalid query (or a mocked failure
response) and assert that the result contains "Error", and update the inline
comment to explain the chosen reliable approach and why it replaces the fragile
timestamp-based trigger.
    @pytest.mark.skip(reason="expected to run manually against a deployed CLP package")
    @pytest.mark.asyncio
    async def test_log_link() -> None:
        """Tests that the search results contain log links."""
        async with Client(MCP_SERVER_ENDPOINT) as client:
            await client.call_tool("get_instructions")

            query = "*error*"
            result = await client.call_tool("search_kql_query", {"kql_query": query})

            for item in result.data["items"]:
                assert "link" in item
🧹 Nitpick | 🔵 Trivial
Consider asserting that items list is not empty.
If the search query returns no results, the test will pass trivially without validating the link field. Consider adding an assertion like assert len(result.data["items"]) > 0 before the loop to ensure the test actually validates the expected behaviour.
🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 145 to 156, the test
iterates over result.data["items"] and asserts each item contains a "link" but
does not verify the items list is non-empty; add an assertion that
result.data["items"] contains at least one element (e.g., assert
len(result.data["items"]) > 0 or assert result.data["items"]) immediately before
the for-loop so the test fails if the search returns no results and thus
actually validates the link presence.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
integration-tests/tests/test_mcp_server.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (2)
integration-tests/tests/test_mcp_server.py (2)
85-105: Verify test dataset timestamp coverage.

The test uses timestamps from 2025-08-27, which are in the past relative to the current date (October 2025). Ensure that the test dataset (HDFS-11896 logs) actually contains log entries within this timestamp range. If the logs are from a different time period, the test may pass trivially with zero results.

45-46: Document the purpose of get_instructions calls, or remove if unnecessary.

The call at line 46 is part of a consistent pattern: all test functions call get_instructions before their actual test logic but discard the result. If this serves as server initialization or connectivity verification, add a comment explaining the purpose. Otherwise, consider removing these calls to simplify tests.
    async def test_get_instructions() -> None:
        """Tests the get_instructions tool."""
        async with Client(MCP_SERVER_ENDPOINT) as client:
            res = await client.call_tool("get_instructions")
            assert isinstance(res.data, str)
🧹 Nitpick | 🔵 Trivial
Consider validating instruction content.
The test only checks that res.data is a string. For improved robustness, consider also asserting that the string is non-empty and optionally contains expected keywords or structure (e.g., mentions of available tools or usage guidance).
Apply this diff to add basic content validation:

        async with Client(MCP_SERVER_ENDPOINT) as client:
            res = await client.call_tool("get_instructions")
            assert isinstance(res.data, str)
    +       assert len(res.data) > 0, "Instructions should not be empty"

🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 34 to 38, the test
only asserts that res.data is a string; update it to also assert the string is
non-empty and contains expected guidance keywords to validate content. After
fetching res = await client.call_tool("get_instructions"), add assertions such
as assert res.data.strip() != "" and one or two keyword checks (e.g., assert
"tools" in res.data.lower() or assert "usage" in res.data.lower()) so the test
verifies meaningful instruction content.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
integration-tests/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
integration-tests/.python-version (1 hunks)
integration-tests/pyproject.toml (2 hunks)
integration-tests/tests/test_mcp_server.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
            query = "*filter*"
            result = await client.call_tool(
                "search_kql_query_with_timestamp",
                {
                    "kql_query": query,
                    # This test (ab)uses a design in CLP: when the begin timestamp is
                    # set before the Unix epoch, the query will be killed. It works as
                    # of Oct. 22, 2025. Future versions of CLP may change this behavior.
                    "begin_timestamp": "1970-01-01T00:00:00.000Z",
                    "end_timestamp": "1970-01-02T00:00:00.000Z",
                },
            )
            assert "Error" in result.data
Remove reliance on undocumented timestamp kill-switch.
The failure case currently depends on CLP killing queries whose begin timestamp predates the Unix epoch. That behaviour is explicitly called out as unstable, so this test will start failing (or worse, silently pass) as soon as CLP adjusts its timestamp handling. Please switch to a deterministic trigger—e.g., issue a syntactically invalid KQL query or otherwise provoke a guaranteed server-side error that’s covered by public contract—so the test keeps signalling real regressions instead of depending on internal quirks.
🤖 Prompt for AI Agents
In integration-tests/tests/test_mcp_server.py around lines 129 to 141, the test
currently relies on CLP killing queries whose begin_timestamp predates the Unix
epoch (an unstable implementation detail); replace this fragile trigger with a
deterministic server-side error such as sending a syntactically invalid KQL
query (e.g., malformed syntax or an obviously invalid token) or another
documented error condition, remove or use valid timestamps instead, call the
tool with the invalid query payload, and keep the existing assertion that an
error is returned (assert "Error" in result.data) so the test fails reliably
when server behavior regresses.
Description
This PR adds integration tests for the MCP server.
Checklist
breaking change.
Validation performed
All the tests have been manually tested.
Summary by CodeRabbit
Tests
Chores