Context
We recently hit issues caused by fastmcp lagging behind the latest official MCP SDK/protocol changes. Because fastmcp sits in our critical path (via @nuwa-ai/payment-kit -> FastMcpStarter and services like nuwa-services/mcp-server-proxy), SDK/protocol drift can quickly turn into hard-to-debug runtime incompatibilities.
Today we depend on:
- fastmcp for server/session/tool plumbing (FastMCP, FastMCPSession)
- mcp-proxy for the HTTP/SSE endpoint (startHTTPServer)
- @modelcontextprotocol/sdk for types/compat, but not as the server runtime
This issue proposes a path to reduce risk by using the official @modelcontextprotocol/sdk as the server implementation, while keeping our Nuwa-specific payment/auth/tool abstractions intact.
Goal
- Make the official @modelcontextprotocol/sdk the primary MCP server runtime for production services.
- Keep McpPaymentKit (billing/auth/settlement) and our tool registration API stable.
- Provide a low-risk dual-engine period (fastmcp + official SDK) with an easy rollback switch.
Non-goals
- Rewriting business tools.
- Changing Nuwa payment/auth protocol semantics.
- Broad refactors outside MCP server runtime.
Proposed Design
1) Dual-engine server entrypoint
Add a unified server factory in @nuwa-ai/payment-kit:
- createMcpServer({ engine: 'sdk' | 'fastmcp', ...opts })
  - Default remains fastmcp initially.
  - engine is configurable via env for services (e.g., MCP_ENGINE=sdk|fastmcp).
Keep existing createFastMcpServer* APIs for backward compatibility during rollout.
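A minimal sketch of how the engine switch could be resolved, assuming an explicit `engine` option wins over `MCP_ENGINE` and that an unset variable falls back to `fastmcp`. Only `createMcpServer`, `engine`, and `MCP_ENGINE` come from this proposal. `resolveEngine`, `EngineStarter`, `McpServerHandle`, the `port` option, and the extra `starters`/`env` parameters are hypothetical names added here so the selection logic can be exercised without either runtime installed.

```typescript
type McpEngine = 'sdk' | 'fastmcp';

interface CreateMcpServerOptions {
  engine?: McpEngine;
  port?: number; // hypothetical option, stands in for the real `...opts`
}

// Hypothetical handle; the real return type would wrap the running server.
interface McpServerHandle {
  engine: McpEngine;
}

type EngineStarter = (opts: CreateMcpServerOptions) => McpServerHandle;
type Env = Record<string, string | undefined>;

// The proposal keeps fastmcp as the default during the dual-engine period.
const DEFAULT_ENGINE: McpEngine = 'fastmcp';

function resolveEngine(explicit: McpEngine | undefined, env: Env): McpEngine {
  // An explicit option takes precedence over the environment (assumption).
  if (explicit !== undefined) return explicit;
  const raw = env['MCP_ENGINE'];
  if (raw === undefined || raw.trim() === '') return DEFAULT_ENGINE;
  const value = raw.trim().toLowerCase();
  if (value === 'sdk' || value === 'fastmcp') return value;
  // Fail loudly on typos instead of silently running the wrong engine (assumption).
  throw new Error(`Unsupported MCP_ENGINE: ${raw}`);
}

function createMcpServer(
  opts: CreateMcpServerOptions,
  starters: Record<McpEngine, EngineStarter>,
  env: Env = {},
): McpServerHandle {
  const engine = resolveEngine(opts.engine, env);
  return starters[engine]({ ...opts, engine });
}
```

In a service, `env` would presumably be `process.env`, and the two starters would wrap the existing FastMcpStarter and the new SDK starter. Whether an unknown value should throw or fall back to the default is a design choice this sketch does not settle.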
2) Official SDK engine (Streamable HTTP)
Implement a new starter (e.g., SdkMcpStarter) using:
- @modelcontextprotocol/sdk/server/mcp (McpServer)
- @modelcontextprotocol/sdk/server/streamableHttp (StreamableHTTPServerTransport)
Because StreamableHTTPServerTransport is per-session, we implement a lightweight session router:
- POST /mcp (initialize): create a new transport with sessionIdGenerator, create a new McpServer, register tools/prompts/resources, connect transport, then delegate request handling to transport.handleRequest(...).
- Subsequent GET/POST/DELETE /mcp: route by mcp-session-id header to the correct transport and call handleRequest(...).
- Maintain Map<sessionId, sessionState> with timestamps for /ready and session GC.
3) Keep Nuwa reserved params and return format
- Preserve support for __nuwa_auth and __nuwa_payment in tool args.
- Continue returning MCP-compatible CallToolResult shape: { content: [...] }.
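One way to picture the two requirements above is a pair of helpers: one that peels the reserved keys off the incoming arguments before the business tool runs, and one that wraps a tool's return value in the `{ content: [...] }` shape. This is a sketch only. `splitReservedArgs` and `toCallToolResult` are hypothetical names, the payloads of `__nuwa_auth` and `__nuwa_payment` are left as `unknown` because their structure is not described here, and serializing non-string results as JSON text is an assumption about the current behavior.

```typescript
interface ReservedArgs {
  __nuwa_auth?: unknown;
  __nuwa_payment?: unknown;
}

interface SplitArgs {
  reserved: ReservedArgs;
  toolArgs: Record<string, unknown>;
}

// Separate Nuwa reserved fields from the arguments the business tool sees.
function splitReservedArgs(args: Record<string, unknown>): SplitArgs {
  const { __nuwa_auth, __nuwa_payment, ...toolArgs } = args;
  const reserved: ReservedArgs = {};
  if (__nuwa_auth !== undefined) reserved.__nuwa_auth = __nuwa_auth;
  if (__nuwa_payment !== undefined) reserved.__nuwa_payment = __nuwa_payment;
  return { reserved, toolArgs };
}

// Minimal subset of the MCP CallToolResult shape: a list of text items.
interface TextContent {
  type: 'text';
  text: string;
}

interface CallToolResult {
  content: TextContent[];
  isError?: boolean;
}

// Wrap an arbitrary tool return value as a single text content item.
function toCallToolResult(value: unknown): CallToolResult {
  const text = typeof value === 'string' ? value : (JSON.stringify(value) ?? '');
  return { content: [{ type: 'text', text }] };
}
```

Keeping both helpers engine-agnostic would let the fastmcp and sdk starters share them, which is one way to keep the tool registration API stable across the switch.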
4) Parity endpoints
Keep current extra endpoints/behavior:
- GET /health
- GET /ready (based on initialized sessions)
- GET /.well-known/nuwa-payment/info
- OPTIONS preflight
- customRouteHandler hook (before MCP handling)
Phased Rollout Plan
Phase 0: Baseline
- Document current dependencies and pain points.
- Define parity checklist (below).
Phase 1: Introduce dual-engine switch (no behavior change)
- Add createMcpServer({ engine }) and wire engine=fastmcp to existing FastMcpStarter.
- Add minimal contract tests that can be reused by both engines.
Phase 2: Implement official SDK engine
- Implement SdkMcpStarter with session router + Streamable HTTP.
- Reuse existing tool registration and billing integration.
- Run existing MCP E2E tests against both engines (fastmcp remains default in CI; sdk can be optional/nightly initially).
Phase 3: Service-level canary
- Add MCP_ENGINE to nuwa-services/mcp-server-proxy (and other MCP services if needed).
- Canary sdk engine in staging / a small production slice.
- Measure: init success rate, tool error rate, latency, session counts, memory.
Phase 4: Flip default and deprecate fastmcp
- Default MCP_ENGINE=sdk.
- Keep rollback to fastmcp for 1–2 releases.
- Remove fastmcp dependency once stable (including jest transform ignores, docs, etc.).
Acceptance Criteria (Parity Checklist)
- Tool registration works (free + paid) and billing settlement behavior matches current.
- __nuwa_auth and __nuwa_payment are accepted and processed.
- /mcp supports Streamable HTTP (SSE) and session management works correctly.
- /health, /ready, and /.well-known/nuwa-payment/info behave as before.
- Existing payment-kit MCP E2E tests pass with both engines (at least locally; CI strategy TBD).
- Rollback is a config change only (MCP_ENGINE=fastmcp).
Risks & Mitigations
- Session lifecycle/memory leaks: implement session TTL + GC + server shutdown cleanup.
- Behavior differences in schema validation: keep permissive schema defaults and extend Zod with Nuwa reserved fields.
- Breaking changes for consumers: keep old APIs and make the switch opt-in until canary proves stability.
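For the session lifecycle risk listed above, the TTL + GC mitigation could look roughly like the sweep below. It is a sketch under assumptions: `TimedSession`, `sweepExpiredSessions`, and `closeAllSessions` are hypothetical names, the TTL value is left to the caller, and a session is treated as expired when it has been idle for longer than the TTL.

```typescript
// Hypothetical per-session record; `lastSeenAt` is a millisecond timestamp.
interface TimedSession {
  lastSeenAt: number;
  close(): Promise<void> | void;
}

// Close a session without letting a failing close() break the sweep.
function closeQuietly(session: TimedSession): void {
  try {
    Promise.resolve(session.close()).catch(() => undefined);
  } catch {
    // A synchronous failure in close() is ignored on purpose.
  }
}

// Remove and close every session idle for longer than ttlMs; returns their IDs.
function sweepExpiredSessions(
  sessions: Map<string, TimedSession>,
  ttlMs: number,
  now: number,
): string[] {
  const expired: string[] = [];
  sessions.forEach((session, id) => {
    if (now - session.lastSeenAt > ttlMs) expired.push(id);
  });
  for (const id of expired) {
    const session = sessions.get(id);
    sessions.delete(id);
    if (session !== undefined) closeQuietly(session);
  }
  return expired;
}

// On server shutdown, close everything regardless of age.
function closeAllSessions(sessions: Map<string, TimedSession>): void {
  sessions.forEach((session) => closeQuietly(session));
  sessions.clear();
}
```

A starter would presumably call `sweepExpiredSessions` from a periodic timer and `closeAllSessions` from its stop hook. Passing `now` in explicitly keeps the sweep deterministic in tests.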
Open Questions
- Do we want sdk engine to be the default for all services at once or start with mcp-server-proxy only?
- What is the required support window for fastmcp as fallback (1 release vs 2)?
- CI strategy: run sdk engine tests always, or nightly/optional until stable?
References (repo locations)
- nuwa-kit/typescript/packages/payment-kit/src/transport/mcp/FastMcpStarter.ts
- nuwa-services/mcp-server-proxy/src/server.ts
If we agree on this direction, I can follow up with a PR implementing Phase 1 + Phase 2 skeleton (dual-engine switch + initial SdkMcpStarter) and a minimal parity test harness.