-
Notifications
You must be signed in to change notification settings - Fork 126
Add Virtual MCP Server proposal #2106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
JAORMX
wants to merge
2
commits into
main
Choose a base branch
from
proposal/virtual-mcp-server
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+355
−0
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,355 @@ | ||
# Virtual MCP Server | ||
|
||
## Problem Statement | ||
|
||
Organizations need to consolidate multiple MCP servers into a unified interface for complex workflows spanning multiple tools and services. Currently, clients must manage connections to individual MCP servers separately, leading to complexity in orchestration and authentication management. | ||
|
||
## Goals | ||
|
||
- Aggregate multiple MCP servers from a ToolHive group into a single virtual MCP server | ||
- Enable tool namespace management and conflict resolution | ||
- Support composite tools for multi-step workflows across backends | ||
- Handle per-backend authentication/authorization requirements | ||
- Maintain full MCP protocol compatibility | ||
|
||
## Proposed Solution | ||
|
||
### High-Level Design | ||
|
||
The Virtual MCP Server (`thv virtual`) acts as an aggregation proxy that: | ||
1. References an existing ToolHive group containing multiple MCP server workloads | ||
2. Discovers and merges capabilities (tools, resources, prompts) from all workloads | ||
3. Routes incoming MCP requests to appropriate backend workloads | ||
4. Manages per-backend authentication requirements | ||
|
||
``` | ||
MCP Client → Virtual MCP → [Workload A, Workload B, ... from Group] | ||
``` | ||
|
||
### Configuration Schema | ||
|
||
```yaml | ||
# virtual-mcp-config.yaml | ||
version: v1 | ||
name: "engineering-tools" | ||
description: "Virtual MCP for engineering team" | ||
|
||
# Reference existing ToolHive group (required) | ||
group: "engineering-team" | ||
|
||
# Tool aggregation using existing ToolHive constructs | ||
aggregation: | ||
conflict_resolution: "prefix" # prefix | priority | manual | ||
|
||
tools: | ||
- workload: "github" | ||
# Filter to only include specific tools (uses existing ToolsFilter) | ||
filter: ["create_pr", "merge_pr", "list_issues"] | ||
# Override tool names/descriptions (uses existing ToolOverride) | ||
overrides: | ||
create_pr: | ||
name: "gh_create_pr" | ||
description: "Create a GitHub pull request" | ||
|
||
- workload: "jira" | ||
# If no filter specified, all tools are included | ||
overrides: | ||
create_issue: | ||
name: "jira_create_issue" | ||
|
||
- workload: "slack" | ||
# No filter or overrides - include all tools as-is | ||
|
||
# Per-backend authentication | ||
backend_auth: | ||
github: | ||
type: "token_exchange" | ||
token_url: "https://github.com/oauth/token" | ||
audience: "github-api" | ||
client_id_ref: | ||
name: "github-oauth" | ||
key: "client_id" | ||
|
||
jira: | ||
type: "pass_through" | ||
|
||
internal_db: | ||
type: "service_account" | ||
credentials_ref: | ||
name: "db-creds" | ||
key: "token" | ||
|
||
slack: | ||
type: "header_injection" | ||
headers: | ||
- name: "Authorization" | ||
value_ref: | ||
name: "slack-creds" | ||
key: "bot_token" | ||
|
||
# Composite tools (Phase 2) | ||
composite_tools: | ||
- name: "deploy_and_notify" | ||
description: "Deploy PR with user confirmation and notification" | ||
parameters: | ||
pr_number: {type: "integer"} | ||
steps: | ||
- id: "merge" | ||
tool: "github.merge_pr" | ||
arguments: {pr: "{{.params.pr_number}}"} | ||
|
||
# Elicitation step for user confirmation | ||
- id: "confirm_deploy" | ||
type: "elicitation" | ||
message: "PR {{.params.pr_number}} merged successfully. Proceed with deployment?" | ||
schema: | ||
type: "object" | ||
properties: | ||
environment: | ||
type: "string" | ||
enum: ["staging", "production"] | ||
enumNames: ["Staging", "Production"] | ||
description: "Target deployment environment" | ||
notify_team: | ||
type: "boolean" | ||
default: true | ||
description: "Send Slack notification to team" | ||
required: ["environment"] | ||
depends_on: ["merge"] | ||
on_decline: | ||
action: "skip_remaining" | ||
on_cancel: | ||
action: "abort" | ||
|
||
- id: "deploy" | ||
tool: "kubernetes.deploy" | ||
arguments: | ||
pr: "{{.params.pr_number}}" | ||
environment: "{{.steps.confirm_deploy.content.environment}}" | ||
depends_on: ["confirm_deploy"] | ||
condition: "{{.steps.confirm_deploy.action == 'accept'}}" | ||
|
||
- id: "notify" | ||
tool: "slack.send" | ||
arguments: | ||
text: "Deployed PR {{.params.pr_number}} to {{.steps.confirm_deploy.content.environment}}" | ||
depends_on: ["deploy"] | ||
condition: "{{.steps.confirm_deploy.content.notify_team}}" | ||
``` | ||
|
||
#### Example: Incident Investigation Using Multiple Fetch Calls | ||
|
||
This composite tool demonstrates calling the same backend tool (`fetch`) multiple times with different static URLs to gather incident data from various monitoring sources: | ||
|
||
```yaml | ||
composite_tools: | ||
- name: "investigate_incident" | ||
description: "Gather logs and metrics from multiple sources for incident analysis" | ||
parameters: | ||
incident_id: {type: "string"} | ||
time_range: {type: "string", default: "1h"} | ||
steps: | ||
- id: "fetch_app_logs" | ||
tool: "fetch.fetch" | ||
arguments: | ||
url: "https://logs.company.com/api/query?service=app&time={{.params.time_range}}" | ||
|
||
- id: "fetch_error_metrics" | ||
tool: "fetch.fetch" | ||
arguments: | ||
url: "https://metrics.company.com/api/errors?time={{.params.time_range}}" | ||
|
||
- id: "fetch_trace_data" | ||
tool: "fetch.fetch" | ||
arguments: | ||
url: "https://tracing.company.com/api/traces?time={{.params.time_range}}" | ||
|
||
- id: "fetch_infra_status" | ||
tool: "fetch.fetch" | ||
arguments: | ||
url: "https://monitoring.company.com/api/infrastructure/status" | ||
|
||
- id: "create_report" | ||
tool: "jira.create_issue" | ||
arguments: | ||
title: "Incident {{.params.incident_id}} Analysis" | ||
description: | | ||
Application Logs: {{.steps.fetch_app_logs.output}} | ||
Error Metrics: {{.steps.fetch_error_metrics.output}} | ||
Trace Data: {{.steps.fetch_trace_data.output}} | ||
Infrastructure: {{.steps.fetch_infra_status.output}} | ||
``` | ||
|
||
This pattern allows a single generic `fetch` tool to be reused with different static endpoints, creating a purpose-built incident investigation workflow without needing separate tools for each data source. | ||
|
||
```yaml | ||
# Operational settings | ||
operational: | ||
timeouts: | ||
default: 30s | ||
per_workload: | ||
github: 45s | ||
|
||
failover: | ||
strategy: "automatic" | ||
``` | ||
|
||
### Key Features | ||
|
||
#### 1. Group-Based Backend Management | ||
|
||
Virtual MCP references a ToolHive group and automatically discovers all workloads within it: | ||
- Leverages existing `groups.Manager` and `workloads.Manager` | ||
- Dynamic workload discovery via `ListWorkloadsInGroup()` | ||
- Inherits workload configurations from group | ||
|
||
#### 2. Capability Aggregation | ||
|
||
Merges MCP capabilities from all backends: | ||
- **Tools**: Aggregated with namespace conflict resolution | ||
- **Resources**: Combined from all backends | ||
- **Prompts**: Merged with prefixing | ||
- **Logging/Sampling**: Enabled if any backend supports it | ||
|
||
#### 3. Tool Filtering and Overrides | ||
|
||
Uses existing ToolHive constructs: | ||
- **ToolsFilter**: Include only specific tools from a workload | ||
- **ToolOverride**: Rename tools and update descriptions to avoid conflicts | ||
|
||
#### 4. Per-Backend Authentication | ||
|
||
Different backends may require different authentication strategies: | ||
- **pass_through**: Forward client credentials unchanged | ||
- **token_exchange**: Exchange incoming token for backend-specific token (RFC 8693) | ||
- **service_account**: Use stored credentials for backend | ||
- **header_injection**: Add authentication headers from secrets | ||
- **mapped_claims**: Transform JWT claims for backend requirements | ||
|
||
Example authentication flow: | ||
``` | ||
Client → (Bearer token for Virtual MCP) → Virtual MCP | ||
├→ GitHub backend (token exchanged for GitHub PAT) | ||
├→ Jira backend (original token passed through) | ||
└→ Internal DB (service account token injected) | ||
``` | ||
|
||
#### 5. Request Routing | ||
|
||
Routes MCP protocol requests to appropriate backends: | ||
- Tool calls routed based on tool-to-workload mapping | ||
- Resource requests routed by resource namespace | ||
- Prompt requests handled similarly | ||
- Load balancing for duplicate capabilities | ||
|
||
#### 6. Elicitation Support in Composite Tools | ||
|
||
Composite tools can include elicitation steps to request additional information from users during workflow execution, following the [MCP elicitation specification](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation). This enables: | ||
|
||
- **Interactive workflows**: Request user input between steps | ||
- **Safety confirmations**: Require approval before destructive operations | ||
- **Dynamic parameters**: Gather context-dependent information | ||
- **Conditional execution**: Branch workflow based on user responses | ||
|
||
Elicitation steps use JSON Schema to define the structure of requested data (limited to flat objects with primitive properties). The Virtual MCP server forwards elicitation requests to the client and captures responses in three forms: | ||
|
||
- **accept**: User provided data (accessible via `{{.steps.step_id.content}}`) | ||
- **decline**: User explicitly rejected the request | ||
- **cancel**: User dismissed without choosing | ||
|
||
Subsequent steps can reference elicitation results through template expansion and use the `condition` field to execute conditionally based on user responses. The `on_decline` and `on_cancel` handlers control workflow behavior for non-acceptance scenarios. | ||
|
||
### CLI Usage | ||
|
||
```bash | ||
# Start virtual MCP for a group | ||
thv virtual --group engineering-team --config virtual-config.yaml | ||
|
||
# Quick start with defaults | ||
thv virtual --group engineering-team | ||
|
||
# With inline tool filtering | ||
thv virtual --group engineering-team \ | ||
--tools github:create_pr,merge_pr \ | ||
--tools jira:create_issue | ||
|
||
# With tool overrides file | ||
thv virtual --group engineering-team \ | ||
--tools-override overrides.yaml | ||
``` | ||
|
||
### Implementation Phases | ||
|
||
**Phase 1 (MVP)**: Basic aggregation | ||
- Group-based workload discovery | ||
- Tool aggregation with filter and override support | ||
- Simple request routing | ||
- Pass-through authentication | ||
|
||
**Phase 2**: Advanced features | ||
- Composite tool execution | ||
- Elicitation support in composite tools | ||
- Per-backend authentication strategies | ||
- Token exchange support | ||
- Failover and load balancing | ||
|
||
**Phase 3**: Enterprise features | ||
- Dynamic configuration updates | ||
- Multi-group support | ||
- Advanced routing strategies | ||
- Comprehensive observability | ||
|
||
## Benefits | ||
|
||
- **Simplified Client Integration**: Single MCP connection instead of multiple | ||
- **Unified Authentication**: Handle diverse backend auth requirements centrally | ||
- **Workflow Orchestration**: Composite tools enable cross-service workflows | ||
- **Operational Efficiency**: Centralized monitoring and management | ||
- **Backward Compatible**: Works with existing MCP clients and servers | ||
|
||
## Implementation Notes | ||
|
||
### Reusing Existing Components | ||
|
||
The implementation will maximize reuse of existing ToolHive components: | ||
|
||
1. **Tool Filtering**: Use existing `mcp.WithToolsFilter()` middleware | ||
2. **Tool Overrides**: Use existing `mcp.WithToolsOverride()` middleware | ||
3. **Groups**: Use `groups.Manager` for group operations | ||
4. **Workloads**: Use `workloads.Manager` for workload discovery | ||
5. **Authentication**: Extend existing auth middleware patterns | ||
6. **Elicitation**: Implement MCP elicitation protocol for composite tool user interaction | ||
|
||
### Authentication Complexity | ||
|
||
Different MCP servers may have vastly different authentication requirements: | ||
- Public servers may need no auth | ||
- Enterprise servers may require OAuth/OIDC | ||
- Internal services may use service accounts | ||
- Third-party APIs may need API keys | ||
|
||
The Virtual MCP must handle this complexity transparently, maintaining separate authentication contexts per backend while presenting a unified interface to clients. | ||
|
||
### Composite Tool State Persistence | ||
|
||
Composite tools with elicitation require state persistence to handle long-running workflows that span multiple user interactions. The Virtual MCP will maintain workflow execution state (current step, completed steps, elicitation responses, and template variables) in memory with a clean storage interface to enable future migration to persistent backends. Each composite tool invocation receives a unique workflow ID, and the server checkpoints state after each step completion and elicitation interaction. Workflows have a default timeout of 30 minutes, after which they are automatically cleaned up and any pending elicitations are treated as cancelled. This approach keeps the implementation simple for Phase 2 while establishing patterns that can scale to distributed persistence later. | ||
|
||
## Alternative Approaches Considered | ||
|
||
1. **Kubernetes CRD**: More complex, requires operator changes | ||
2. **Standalone Service**: Loses integration with ToolHive infrastructure | ||
3. **LLM-generated backend**: Adds more randomness to the equation which is undesirable for agents. | ||
JAORMX marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
## Open Questions | ||
|
||
1. How to handle streaming responses across multiple backends? | ||
2. Should we cache backend capabilities or query dynamically? | ||
3. How to handle backend-specific rate limits? | ||
JAORMX marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
## Success Criteria | ||
|
||
- Aggregate 5+ MCP servers from a group with < 10ms routing overhead | ||
- Support 100+ concurrent client connections | ||
- Zero changes required to existing MCP servers or clients | ||
- Successfully handle different authentication methods per backend | ||
- Maintain existing tool filter and override semantics |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.