feat: wire MCP server into CLI as `evalhub mcp` subcommand by tarilabs · Pull Request #99 · eval-hub/eval-hub-sdk

tarilabs · 2026-03-27T21:17:28Z

What and why

- Add `evalhub mcp` subcommand reusing CLI profile/config for base-url, token, tenant
- Remove standalone `evalhub-mcp` entry point and `src/evalhub/mcp/__main__.py`
- MCP extra now depends on CLI extra (`eval-hub-sdk[cli]`)
- Lazy-import `mcp` package; show install hint when missing
- Add tests for subcommand registration, config resolution, and CLI flag overrides
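
The lazy-import item can be illustrated with a minimal sketch. This is not the code from this PR: `load_optional_module` and the hint wording are hypothetical, and it raises a plain `RuntimeError` to stay within the standard library, whereas the walkthrough in this thread describes a `ClickException`.

```python
import importlib
from types import ModuleType


def load_optional_module(name: str, extra: str) -> ModuleType:
    """Import `name` on first use; explain how to install it if it is absent."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Hypothetical hint text; the actual message in the PR may differ.
        raise RuntimeError(
            f"The '{name}' package is not installed. "
            f"Try: pip install 'eval-hub-sdk[{extra}]'"
        ) from exc
```

Deferring the import to the point of use means the rest of the CLI keeps working when the optional dependency is absent; only the one subcommand fails, and it fails with actionable guidance.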

Type

Testing

Tests added or updated
Tested manually

see thread for manual tests

Breaking changes

Summary by CodeRabbit

New Features
- Added an mcp subcommand to the main CLI to start the MCP server using centralized configuration.
Bug Fixes & Improvements
- Consolidated MCP installation under the eval-hub-sdk optional extra and removed the separate console script.
Tests
- Added unit tests covering the MCP subcommand’s help, install guidance, and configuration precedence.


coderabbitai · 2026-03-27T21:17:43Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9f801452-7fb7-4ace-ba94-a8cad66943ca

📥 Commits

Reviewing files that changed from the base of the PR and between 21c74ad and dd7af69.

📒 Files selected for processing (1)

tests/unit/test_cli_mcp.py

✅ Files skipped from review due to trivial changes (1)

tests/unit/test_cli_mcp.py

📝 Walkthrough

Consolidates the EvalHub MCP server entry from a standalone evalhub-mcp module/console script into a new mcp subcommand on the main evalhub CLI, updates the mcp optional dependency to eval-hub-sdk[cli], removes the legacy __main__.py, and adds unit tests for the new CLI behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Package Configuration `pyproject.toml` | Replaced `eval-hub-sdk[core]` and explicit `click>=8.1.0` in the `mcp` optional deps with `eval-hub-sdk[cli]`. Removed the `evalhub-mcp` console script entry. |
| CLI Implementation `src/evalhub/cli/main.py` | Added `mcp` subcommand to start MCP server over stdio, resolves config/profile values (`base_url`, `token`, `tenant`, `insecure`, `timeout`), dynamically imports `mcp` package and surfaces install guidance on missing dependency, constructs `AsyncEvalHubClient`, registers client, and runs server. |
| Legacy Entry Point Removal `src/evalhub/mcp/__main__.py` | Removed the previous Click-based `__main__` CLI entry (tenant/base-url/token handling and `oc whoami -t` fallback) and its helper for auth token resolution. |
| Tests `tests/unit/test_cli_mcp.py` | Added unit tests for `evalhub mcp`: help text presence, missing `mcp` package error message with pip guidance, profile-based config resolution, and CLI-flag override behavior (patching `asyncio.run`, client, and server). |

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as evalhub CLI
    participant Config as Config/Profile
    participant MCP as mcp package
    participant Client as AsyncEvalHubClient
    participant Server as MCP Server

    User->>CLI: evalhub mcp [--base-url] [--token] [--tenant]
    CLI->>Config: Load active profile (EVALHUB_CONFIG/EVALHUB_TENANT)
    Config-->>CLI: Profile values (base_url, token, tenant, insecure, timeout)
    CLI->>CLI: Resolve final params (CLI flags override profile)
    CLI->>MCP: import mcp
    alt mcp missing
        MCP-->>CLI: ImportError
        CLI-->>User: ClickException with pip install guidance
    else mcp present
        MCP-->>CLI: mcp loaded
        CLI->>Client: AsyncEvalHubClient(base_url, token, tenant, insecure, timeout)
        Client-->>CLI: client instance
        CLI->>Server: set_client(client)
        CLI->>Server: asyncio.run(mcp_server.run_stdio_async())
        Server-->>User: MCP server running on stdio
    end
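
The "CLI flags override profile" step in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the resolution code from `src/evalhub/cli/main.py`: `resolve_params` and `FIELDS` are hypothetical names, and the convention that `None` means "flag not given" is assumed.

```python
from typing import Any, Dict, Mapping

# Field names taken from the walkthrough; the real command may handle more.
FIELDS = ("base_url", "token", "tenant", "insecure", "timeout")


def resolve_params(flags: Mapping[str, Any], profile: Mapping[str, Any]) -> Dict[str, Any]:
    """Return one value per field, preferring an explicitly set CLI flag."""
    resolved: Dict[str, Any] = {}
    for field in FIELDS:
        flag_value = flags.get(field)
        # None means "flag not given"; falsy values such as False or 0 still count.
        resolved[field] = flag_value if flag_value is not None else profile.get(field)
    return resolved
```

Comparing against `None` rather than using `or` keeps an explicit `--insecure false` or `--timeout 0` style value from being silently replaced by the profile value.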

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat: add MCP server as an Extra, using stdio #92: Modifies MCP packaging and CLI entry surface; touches the same mcp extra and CLI/entrypoint concerns.

Suggested reviewers

mariusdanciu
ruivieira

Poem

🐰 I hopped from script to subcommand bright,
Config carrots shine in profile light,
I nibble flags that override the rest,
Now one neat CLI runs the MCP quest. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: wiring the MCP server into the CLI as a subcommand named 'mcp', which aligns with the changeset's primary objective. |
| Description check | ✅ Passed | The description covers all required template sections: what/why with concrete details, type checkboxes appropriately marked (feat and refactor/chore), testing confirmation, and breaking changes addressed (none). |


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/unit/test_cli_mcp.py (1)

16-22: Environment variable cleanup is not isolated.

The fixture directly modifies os.environ, which could cause issues if tests are run in parallel or if an exception occurs before cleanup. Consider using CliRunner(env=...) or monkeypatch for safer isolation.

♻️ Safer alternative using monkeypatch

 @pytest.fixture()
-def config_file(tmp_path: Path) -> Iterator[Path]:
+def config_file(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> Iterator[Path]:
     """Provide a temporary config file path and set EVALHUB_CONFIG."""
     path = tmp_path / "config.yaml"
-    os.environ["EVALHUB_CONFIG"] = str(path)
+    monkeypatch.setenv("EVALHUB_CONFIG", str(path))
     yield path
-    os.environ.pop("EVALHUB_CONFIG", None)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_cli_mcp.py` around lines 16 - 22, The fixture config_file
currently mutates os.environ directly which isn't isolated; update it to use
pytest's monkeypatch (add monkeypatch to the fixture signature) and call
monkeypatch.setenv("EVALHUB_CONFIG", str(path)) before yield so the environment
is automatically restored after each test, or alternatively when invoking the
CLI use Click's CliRunner(env={...}) for test-specific envs; keep the fixture
name config_file and the tmp_path usage but remove direct os.environ
modification and the manual os.environ.pop cleanup.
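
The isolation property described above can also be shown with only the standard library. This sketch swaps in `unittest.mock.patch.dict` in place of pytest's `monkeypatch` (a different tool from the one the review suggests), and `use_config_path` is a hypothetical helper rather than anything in the test suite.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator
from unittest import mock


@contextmanager
def use_config_path(path: Path) -> Iterator[Path]:
    """Point EVALHUB_CONFIG at `path` for the duration of the block only."""
    # patch.dict snapshots os.environ and restores it on exit, even on error.
    with mock.patch.dict(os.environ, {"EVALHUB_CONFIG": str(path)}):
        yield path
```

Unlike a manual `os.environ.pop` after `yield`, the restore here also puts back any value the variable had before the test started.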

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/evalhub/cli/main.py`:
- Around line 902-909: The created AsyncEvalHubClient is stored via
set_client(client) but never closed; either construct it with an async context
manager (use "async with AsyncEvalHubClient(...)" and call set_client inside
that block so the client is auto-closed when mcp_server.run_stdio_async()
returns) or ensure explicit shutdown by awaiting its async close method after
the server run (call await client.aclose() or asyncio.run(client.aclose()) in a
finally/shutdown hook). Update the code around AsyncEvalHubClient creation and
the call to mcp_server.run_stdio_async() to use one of these approaches
(referencing AsyncEvalHubClient, set_client, mcp_server.run_stdio_async, and the
client.aclose()/__aenter__/__aexit__ methods) so the underlying httpx client is
always cleaned up on exit.
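
A minimal sketch of the async-context-manager option, using a stand-in class. `FakeAsyncClient`, `serve`, and `fake_server` are hypothetical; whether the real `AsyncEvalHubClient` exposes `aclose()` and `__aenter__`/`__aexit__` should be verified against the SDK before applying this pattern.

```python
import asyncio
from typing import Awaitable, Callable


class FakeAsyncClient:
    """Stand-in for an async HTTP client; NOT the real AsyncEvalHubClient."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "FakeAsyncClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.aclose()


async def serve(client: FakeAsyncClient, run_server: Callable[[], Awaitable[None]]) -> None:
    """Run the server coroutine and close the client when it returns or raises."""
    async with client:
        await run_server()


async def fake_server() -> None:
    # Placeholder for mcp_server.run_stdio_async(); returns immediately.
    await asyncio.sleep(0)
```

Wrapping both the client lifetime and the server run in one coroutine means a single `asyncio.run(serve(...))` call opens and closes the client on the same event loop.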

In `@tests/unit/test_cli_mcp.py`:
- Around line 99-105: Add the missing check that set_client was invoked in the
test_mcp_cli_flags_override_profile test: after the existing mocks/assertions
(including mock_client_cls.assert_called_once_with(...)) add a call to
mock_set_client.assert_called_once() to mirror test_mcp_resolves_from_profile.
Call it directly rather than wrapping it in an assert statement, because
assert_called_once() returns None and raises AssertionError itself on failure;
this ensures the client was set exactly once during the CLI flags override
scenario.
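
The assertion pattern can be sketched with `unittest.mock` alone. `start_server` is a hypothetical stand-in for the command body, and the parameter values are made up for illustration; only the mock names mirror the ones mentioned in the comment.

```python
from unittest import mock


def start_server(client_cls, set_client, **params):
    """Hypothetical stand-in for the command body: build a client, register it."""
    client = client_cls(**params)
    set_client(client)
    return client


mock_client_cls = mock.Mock()
mock_set_client = mock.Mock()
start_server(mock_client_cls, mock_set_client, base_url="https://flag.example", tenant="team-a")

# Both checks raise AssertionError on failure; they return None, so they are
# called directly rather than wrapped in an `assert` statement.
mock_client_cls.assert_called_once_with(base_url="https://flag.example", tenant="team-a")
mock_set_client.assert_called_once_with(mock_client_cls.return_value)
```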



ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6304f8a8-dca9-40d7-9547-2e62fdd6aaf7

📥 Commits

Reviewing files that changed from the base of the PR and between 6539586 and 21c74ad.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (4)

pyproject.toml
src/evalhub/cli/main.py
src/evalhub/mcp/__main__.py
tests/unit/test_cli_mcp.py

💤 Files with no reviewable changes (1)

src/evalhub/mcp/__main__.py


tarilabs · 2026-03-27T21:23:05Z

LIVE DEMO that the refactor (wiring mcp into the CLI) works as intended

Ensure you are authenticated:

oc whoami

For completeness:

# only if you get: error: You must be logged in to the server (Unauthorized)
oc login ...

Ensure the evalhub command is available:

evalhub  
Usage: evalhub [OPTIONS] COMMAND [ARGS]...

  EvalHub CLI - manage evaluations, providers, collections, and configuration.

Options:
  --version        Show the version and exit.
  --profile TEXT   Configuration profile to use (overrides active profile).
  --base-url TEXT  EvalHub server URL (overrides profile config).
  --token TEXT     Authentication token (overrides profile config).
  -v, --verbose    Enable verbose output (show SDK logs).
  -h, --help       Show this message and exit.

Commands:
  collections  Browse and manage benchmark collections.
  config       View and update CLI configuration.
  eval         Submit and manage evaluation jobs.
  health       Check health of the EvalHub service.
  mcp          Start the EvalHub MCP server (stdio transport).
  providers    List and inspect evaluation providers.
  version      Print version and build info.

Note that mcp is now listed as a subcommand.

export EVALHUB_TOKEN=$(oc whoami -t)
npx @modelcontextprotocol/inspector

using:

command:
evalhub

arguments:
--base-url https://evalhub-opendatahub.apps.rosa.(REDACTED).openshiftapps.com mcp --tenant team-a

works as intended

If the auth token is missing, the MCP inspector gets a 401, and the server logs:

{"level":"error","ts":"2026-03-27T21:11:21.465Z","caller":"server/authentication.go:40","msg":"Request not authenticated","path":"/api/v1/evaluations/providers","method":"GET","stacktrace":"github.com/eval-hub/eval-hub/internal/eval_hub/server.WithAuthentication.func1\n\t/build/internal/eval_hub/server/authentication.go:40\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/lib/golang/src/net/http/server.go:2322\nnet/http.serverHandler.ServeHTTP\n\t/usr/lib/golang/src/net/http/server.go:3340\nnet/http.(*conn).serve\n\t/usr/lib/golang/src/net/http/server.go:2109"}

With the token set, it works as intended:

{"level":"info","ts":"2026-03-27T21:11:57.947Z","caller":"server/authentication.go:23","msg":"Authenticating request","path":"/api/v1/evaluations/providers","method":"GET"}
{"level":"info","ts":"2026-03-27T21:11:57.962Z","caller":"auth/authorization.go:60","msg":"Authorizing","record":{"User":{"Name":"mmortari","UID":"e375f99f-9620-4cf1-9c18-d6054f0ffa3f","Groups":["cluster-admins","system:authenticated:oauth","system:authenticated"],"Extra":{"scopes.authorization.openshift.io":["user:full"]}},"Verb":"get","Namespace":"team-a","APIGroup":"trustyai.opendatahub.io","APIVersion":"","Resource":"providers","Subresource":"","Name":"","ResourceRequest":true,"Path":"","FieldSelectorRequirements":null,"FieldSelectorParsingErr":null,"LabelSelectorRequirements":null,"LabelSelectorParsingErr":null}}
{"level":"info","ts":"2026-03-27T21:11:57.966Z","caller":"server/authorization.go:60","msg":"Request authorized","path":"/api/v1/evaluations/providers","method":"GET","user":"mmortari"}
{"level":"info","ts":"2026-03-27T21:11:57.966Z","caller":"handlers/providers.go:117","msg":"Request started","request_id":"e03ec4f9-bdae-463e-9412-b1f2f3603075","method":"GET","uri":"/api/v1/evaluations/providers","user_agent":"python-httpx/0.28.1","remote_addr":"10.128.6.31:47414","tenant":"team-a","user":"mmortari","filter":"{\"limit\":50,\"offset\":0,\"params\":map[name: owner: tags:]}"}
{"level":"info","ts":"2026-03-27T21:11:57.968Z","caller":"server/execution_context.go:158","msg":"Request successful","request_id":"e03ec4f9-bdae-463e-9412-b1f2f3603075","method":"GET","uri":"/api/v1/evaluations/providers","user_agent":"python-httpx/0.28.1","remote_addr":"10.128.6.31:47414","tenant":"team-a","user":"mmortari","code":200,"duration":0.002043061}

Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: tarilabs <matteo.mortari@gmail.com>

mariusdanciu

lgtm

ruivieira · 2026-03-29T23:14:25Z

One thought (don't feel strongly about it, and I can't recall if this was discussed previously): right now the mcp extra depends on eval-hub-sdk[cli], which means installing MCP pulls in Click, Rich, and all the CLI machinery. In practice the MCP server only needs the client SDK + mcp library.

Would it make sense to flip this?

- `mcp` extra stays standalone: just `eval-hub-sdk[core]` + `mcp>=1.0.0`
- `cli` extra gains a dependency on `eval-hub-sdk[mcp]`

That way:

- `pip install eval-hub-sdk[mcp]` gives you a lightweight MCP server usable as a library or via `python -m evalhub.mcp` (no CLI deps)
- `pip install eval-hub-sdk[cli]` gives you everything including `evalhub mcp`

The evalhub mcp subcommand would still work exactly as-is; it just means the MCP server code doesn't drag in CLI dependencies when used standalone. A thin __main__.py could handle the python -m case with minimal config (env vars or a small arg parser).
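
A rough sketch of what such a thin `__main__.py` config layer might look like, using only `argparse` and environment variables. `parse_config` is a hypothetical name and `EVALHUB_BASE_URL` is an assumed variable; `EVALHUB_TOKEN` and `EVALHUB_TENANT` appear elsewhere in this thread, but how the SDK itself reads them is not confirmed here.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def parse_config(
    argv: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> argparse.Namespace:
    """Resolve base URL, token, and tenant from flags, falling back to env vars."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="python -m evalhub.mcp")
    # EVALHUB_BASE_URL is an assumed variable name, used here for illustration.
    parser.add_argument("--base-url", default=env.get("EVALHUB_BASE_URL"))
    parser.add_argument("--token", default=env.get("EVALHUB_TOKEN"))
    parser.add_argument("--tenant", default=env.get("EVALHUB_TENANT"))
    return parser.parse_args(list(argv))
```

This keeps the standalone path free of Click and Rich, at the cost of not sharing the CLI's profile handling.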


chore: impl code review feedback

dd7af69

Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: tarilabs <matteo.mortari@gmail.com>

tarilabs requested review from gnaulak-redhat, julpayne, mariusdanciu, nbs-rh, ppadashe-psp, ruivieira and scheruku-rh March 27, 2026 21:29

mariusdanciu approved these changes Mar 28, 2026


ppadashe-psp approved these changes Mar 28, 2026


tarilabs wants to merge 2 commits into eval-hub:main from tarilabs:tarilabs-20260327-mcpwithcli
