Description
Problem Description
MCP (Model Context Protocol) is defined by https://modelcontextprotocol.io/ as an open-source standard for connecting AI applications to external systems.
"Using MCP, AI applications like Claude or ChatGPT can connect to data sources (e.g. local files, databases), tools (e.g. search engines, calculators) and workflows (e.g. specialized prompts)—enabling them to access key information and perform tasks."
This RFC addresses the design of a Zephyr-based implementation of an MCP Server library (https://modelcontextprotocol.io/specification/2025-06-18), together with examples and instructions on how to enable MCP Server services such as Tools. The goal is to create a modular and scalable architecture that can be developed in phases, starting with the must-have features defined by the specification and the high-priority features needed by NXP Semiconductors.
The design aims to use available libraries where possible: the Zephyr HTTP lib, the Zephyr JSON lib and, in the future, the Mbed TLS lib. However, the Zephyr HTTP lib doesn't support asynchronous processing of multiple requests, so in the future it will be necessary to either update the HTTP lib or switch to a different implementation.
The MCP library would enable all devices capable of internet connectivity to become MCP servers with tool services and communicate with MCP Clients - Agentic AI models. This would open the door to new AI-driven IoT systems based on MCU devices. NXP Semiconductors is focused on both AI and (I)IoT and needs such an implementation to further expand our portfolio of solutions. Upstreaming directly into Zephyr would help with long-term maintenance, standardization and community support.
Proposed Change (Summary)
This RFC intends to use available Zephyr libraries without modifications. It aims to add a new library and related samples to the Zephyr repository.
If implemented, a new MCP Server library would be added with documentation and unit tests. The library would be completely device-agnostic and would use the existing HTTP lib and JSON lib. Related sample(s) would be added to the samples in the repository. The sample(s) would show how to use and configure the MCP server and how to create simple tools and register them with the server.
In later phases, the library and samples would be extended with new features. Some changes to the Zephyr HTTP lib might be needed in the future to support asynchronous processing of multiple requests, especially once SSE support is added.
Proposed Change (Detailed)
Development phases
The development would be done in phases. Phase 1 would include:
- Streamable HTTP transport
- Basic Request-Response without SSE
- Tool support
- Mandatory parts of the specification
Later phases would include all or most of the features described by the specification, most importantly:
- MCP Authorization and other security (Highest priority)
- SSE (Lower priority until the HTTP lib is improved or replaced; until then, a single long-running SSE-handled request would block the server until it completes.)
- MCP Session handling
- Other non-tool services
Architecture design
The implementation would follow the software architecture laid out in this UML component diagram (since it describes a C application, I took some liberty and didn't follow UML rules exactly, where needed). The diagram also contains several notes with more detailed explanations of what the intended use of a given component is.
MCP Server Architecture exported.html
MCP Server architecture.drawio
Note: The diagram includes features not planned for phase 1 of development. These are there to provide a basic template for future development. They are grey-colored to clearly distinguish them from the rest of the architecture.
The design can be divided into 2 sections with layers:
User application
- Implements tool functionality
- Adds/removes tools to/from the MCP server
- Does whatever else a user might need
MCP server library
- Integrates all MCP Server related processing, divided into modular layers
MCP server core
- Implements the MCP specification
- Uses worker thread pools for processing requests and responses
- Manages registries for tools, execution tokens, clients
- Monitors its own health via a periodically triggered function that checks timestamps and statuses
Transport layer
- Implements the transport of data
- First phase will focus only on the HTTP transport enabled by the Zephyr HTTP lib
- Parses incoming JSON-RPCs and serializes outgoing responses and notifications into JSON
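As an illustration of the periodic health check listed for the MCP server core, the following is a minimal sketch in portable C. All names (`exec_entry`, `health_check`, the status values) are hypothetical and not part of the RFC. It assumes the caller supplies a millisecond timestamp; on Zephyr this could come from `k_uptime_get()`, but that wiring is an assumption and is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status values for a tracked tool execution. */
enum exec_status { EXEC_FREE, EXEC_RUNNING, EXEC_TIMED_OUT };

/* Hypothetical registry entry: a status plus the time the execution started. */
struct exec_entry {
	enum exec_status status;
	int64_t started_ms;
};

/*
 * Scan the registry and mark every running entry that is older than
 * timeout_ms. Returns the number of entries that were marked.
 */
size_t health_check(struct exec_entry *entries, size_t n,
		    int64_t now_ms, int64_t timeout_ms)
{
	size_t stale = 0;

	for (size_t i = 0; i < n; i++) {
		if (entries[i].status == EXEC_RUNNING &&
		    now_ms - entries[i].started_ms > timeout_ms) {
			entries[i].status = EXEC_TIMED_OUT;
			stale++;
		}
	}
	return stale;
}
```

What happens to a timed-out entry (cancelling the tool, sending an error to the client) is a design decision this sketch does not try to answer.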
Modularity:
The design aims for high modularity. No pointers to HTTP contexts would be passed outside the transport layer. All communication between layers would be done through message queues passing pointers to data structures. Correct flow of data through the system would be handled via IDs.
ID types
- MCP-Session-Id - not included in first phase, once implemented will follow MCP specification
- Event-Id - not included in first phase, once implemented will follow MCP specification
- Request-Id - unique identifier for each request to track processing and correlate responses
- Client-Id - unique identifier for each client connecting to the MCP server for lifecycle tracking
- Execution-token - unique token generated for each tool execution request to manage authorization and tracking (the first phase will use an insecure placeholder in preparation for a later UUID-based system, so that API changes aren't required)
By using the queues and IDs, the MCP core design is completely agnostic to what happens both in the User application and the Transport layer.
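To make the ID-based decoupling more concrete, here is a minimal sketch of how the IDs could travel in a message envelope. Every type and function name below is hypothetical. The token is stored in 16 opaque bytes on the assumption that a UUID would later fit in the same field, and the phase-1 style generator is a plain counter that is explicitly not secure.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ID types; none of these names come from the RFC. */
typedef uint32_t mcp_request_id_t;
typedef uint32_t mcp_client_id_t;

/* 16 opaque bytes, so a UUID could be stored later without an API change. */
struct mcp_exec_token {
	uint8_t bytes[16];
};

/* Placeholder generator: a counter, NOT a secure or random token. */
static uint32_t next_token_counter = 1;

struct mcp_exec_token mcp_exec_token_new(void)
{
	struct mcp_exec_token t;
	uint32_t c = next_token_counter++;

	memset(&t, 0, sizeof(t));
	memcpy(t.bytes, &c, sizeof(c));
	return t;
}

bool mcp_exec_token_equal(const struct mcp_exec_token *a,
			  const struct mcp_exec_token *b)
{
	return memcmp(a->bytes, b->bytes, sizeof(a->bytes)) == 0;
}

/* Envelope passed between layers: IDs only, no HTTP context pointer. */
struct mcp_msg {
	mcp_request_id_t request_id;
	mcp_client_id_t client_id;
	struct mcp_exec_token token;
	const char *payload; /* parsed RPC or serialized result */
};
```

Because callers only ever compare tokens through `mcp_exec_token_equal()`, swapping the counter for a real UUID generator would not change the calling code.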
Queue types
- Request queue into MCP core - transport layer uses it to pass parsed JSON-RPCs and system messages to the MCP core
- Message queue into MCP core - user application uses it to pass tool messages meant as responses or notifications for processing and transport to clients
- Response dispatch queue into transport layer - each request gets its own queue, tracked through request IDs, which allows the HTTP request handler to wait on the queue while the RPC is handled asynchronously. With the current HTTP lib this asynchronicity can't be exploited, but if the HTTP lib is later changed or replaced, the system will be ready without further modifications.
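The per-request response dispatch queue can be pictured as a small FIFO of pointers tagged with a request ID. The sketch below is a portable, single-threaded stand-in with hypothetical names. A real implementation would presumably use a Zephyr kernel queue primitive (for example `k_msgq`) so that the HTTP handler can block on it, but that choice is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DISPATCH_DEPTH 4 /* hypothetical queue depth */

/* Stand-in for a kernel message queue that passes pointers, not copies. */
struct dispatch_queue {
	uint32_t request_id; /* request this queue belongs to */
	void *slots[DISPATCH_DEPTH];
	size_t head;
	size_t count;
};

void dispatch_queue_init(struct dispatch_queue *q, uint32_t request_id)
{
	q->request_id = request_id;
	q->head = 0;
	q->count = 0;
}

/* Returns false when the queue is full; the message is not stored. */
bool dispatch_queue_put(struct dispatch_queue *q, void *msg)
{
	if (q->count == DISPATCH_DEPTH) {
		return false;
	}
	q->slots[(q->head + q->count) % DISPATCH_DEPTH] = msg;
	q->count++;
	return true;
}

/* Returns the oldest message, or NULL when the queue is empty. */
void *dispatch_queue_get(struct dispatch_queue *q)
{
	void *msg;

	if (q->count == 0) {
		return NULL;
	}
	msg = q->slots[q->head];
	q->head = (q->head + 1) % DISPATCH_DEPTH;
	q->count--;
	return msg;
}
```

Since only pointers cross the queue, ownership of the pointed-to data has to be agreed between the layers.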
Scalability:
The library would use a configurable number of worker threads, configurable message queue sizes, and a configurable maximum number of concurrent clients and streams per client. The last two would also determine the maximum number of response dispatch queues.
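As a worked example of how these limits relate, the sketch below derives the number of dispatch queue slots from the client and stream limits at compile time. The macro names and values are hypothetical. In Zephyr they would most likely be Kconfig options, which is an assumption rather than something the RFC defines.

```c
#include <stddef.h>

/* Hypothetical configuration values, chosen only for illustration. */
#define MCP_MAX_CLIENTS            4
#define MCP_MAX_STREAMS_PER_CLIENT 2
#define MCP_REQUEST_WORKERS        2
#define MCP_MESSAGE_WORKERS        2
#define MCP_REQUEST_QUEUE_DEPTH    8

/* One dispatch queue per in-flight stream: 4 clients * 2 streams = 8. */
#define MCP_MAX_DISPATCH_QUEUES \
	(MCP_MAX_CLIENTS * MCP_MAX_STREAMS_PER_CLIENT)

/* Hypothetical bookkeeping slot for one dispatch queue. */
struct mcp_dispatch_slot {
	unsigned int request_id;
	int in_use;
};

/* Statically sized, so memory use is known at build time. */
static struct mcp_dispatch_slot mcp_dispatch_slots[MCP_MAX_DISPATCH_QUEUES];

size_t mcp_dispatch_capacity(void)
{
	return sizeof(mcp_dispatch_slots) / sizeof(mcp_dispatch_slots[0]);
}
```

Sizing everything statically fits MCU targets, where heap allocation is often avoided.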
Application design
Application Flow description in phase 1:
- User application initializes the MCP Server, registers tool callbacks -> MCP Core initializes the transport layer
- User application starts the MCP Server -> MCP Core starts the HTTP server
- Agentic AI Client sends a request to the HTTP Server
- HTTP Server validates and strips the headers, calls the Request callback
- Request callback parses and checks the validity of the JSON-RPC
- Request callback creates a new response dispatch queue, adds it and the request context to the registry, and gets a new request ID and, if a new client is detected, a new client ID
- Request callback submits the parsed RPC (by pointer) and request ID to the request processing queue in the MCP core and ends without submitting a response. If the full request is received from the client, the handler blocks on its queue until it receives a response.
- Request processing thread pulls out the new request from the request queue and delegates processing to an available worker from the request worker pool
- A request worker handles protocol actions for a given client ID and queues a response/error to the message queue or looks up a corresponding tool callback, creates an execution token and executes the callback (passing it the parsed RPC (by copy) and the execution token)
- Tool callback is either executed directly by the request worker or starts a new thread and releases the worker back into its pool
- Once tool callback is finished, it submits a response message to the message queue and ends
- Message processing thread pulls out the new message from the queue, submits it for processing to an available worker thread
- Message worker thread processes the message (if needed) and submits the response into the dispatch queue by pointer (open question: should a copy be made first, so that the tool can't change the contents after submitting the response?)
- Request callback in the HTTP server finally sees a new response in its queue, serializes it, fills in the response structure and lets the HTTP server send the response to the client and end the stream
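The tool lookup and execution step in the flow above can be sketched as a small registry keyed by tool name. The function names, the callback signature and the error codes are hypothetical. The execution token is simplified to a counter, and the real flow would run the dispatch from a request worker handling an MCP `tools/call` request.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TOOLS 4 /* hypothetical registry size */

/* Hypothetical tool callback: receives parameters and an execution token. */
typedef int (*tool_cb_t)(const char *params, uint32_t exec_token);

struct tool_entry {
	const char *name;
	tool_cb_t cb;
};

static struct tool_entry tools[MAX_TOOLS];
static size_t tool_count;
static uint32_t next_token = 1; /* placeholder token source, not secure */

/* Returns 0 on success, -1 on bad arguments, duplicate name or full registry. */
int tool_register(const char *name, tool_cb_t cb)
{
	if (name == NULL || cb == NULL || tool_count == MAX_TOOLS) {
		return -1;
	}
	for (size_t i = 0; i < tool_count; i++) {
		if (strcmp(tools[i].name, name) == 0) {
			return -1;
		}
	}
	tools[tool_count].name = name;
	tools[tool_count].cb = cb;
	tool_count++;
	return 0;
}

/* Looks up the tool, creates a token and runs the callback; -2 if unknown. */
int tool_dispatch(const char *name, const char *params, uint32_t *token_out)
{
	for (size_t i = 0; i < tool_count; i++) {
		if (strcmp(tools[i].name, name) == 0) {
			*token_out = next_token++;
			return tools[i].cb(params, *token_out);
		}
	}
	return -2;
}

/* Example tool: returns the length of its parameter string. */
int echo_len_tool(const char *params, uint32_t exec_token)
{
	(void)exec_token;
	return (int)strlen(params);
}
```

Here the callback runs directly in the caller's context, which corresponds to the case where the request worker executes the tool itself.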
User application expectations in phase 1:
- User includes the MCP Server library in their Zephyr application
- User implements a callback and registers it to the MCP Server
  - blocking: for short-running callbacks, they can be left to the MCP Server worker pool for execution and the implementation can be a simple sequential algorithm
  - async: the user application can have a queue with a thread/workers that wait for incoming requests. The callback would put the new request into the queue and then return, allowing the worker thread from MCP server to be reused.
- What the application and tool does is completely decoupled from the MCP Server but should follow rules and best-practices defined in the Zephyr MCP Server documentation
- The user application should have a cancellation function capable of cancelling tool execution based on an execution token
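The async callback style and token-based cancellation could look roughly like the sketch below. It is single-threaded for clarity and all names are hypothetical. A real application would run `app_worker_step()` in its own thread and would submit a response message to the MCP server when a job completes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define APP_MAX_JOBS 4 /* hypothetical application queue size */

enum job_state { JOB_EMPTY, JOB_PENDING, JOB_DONE, JOB_CANCELLED };

struct app_job {
	uint32_t exec_token;
	enum job_state state;
};

static struct app_job jobs[APP_MAX_JOBS];

/* Async-style tool callback: records the job and returns immediately. */
int slow_tool_cb(const char *params, uint32_t exec_token)
{
	(void)params;
	for (size_t i = 0; i < APP_MAX_JOBS; i++) {
		if (jobs[i].state == JOB_EMPTY) {
			jobs[i].exec_token = exec_token;
			jobs[i].state = JOB_PENDING;
			return 0;
		}
	}
	return -1; /* application queue is full */
}

/* Cancellation entry point keyed by execution token. */
bool app_cancel(uint32_t exec_token)
{
	for (size_t i = 0; i < APP_MAX_JOBS; i++) {
		if (jobs[i].state == JOB_PENDING &&
		    jobs[i].exec_token == exec_token) {
			jobs[i].state = JOB_CANCELLED;
			return true;
		}
	}
	return false;
}

/* One pass of the application's own worker; returns jobs completed. */
size_t app_worker_step(void)
{
	size_t done = 0;

	for (size_t i = 0; i < APP_MAX_JOBS; i++) {
		if (jobs[i].state == JOB_PENDING) {
			jobs[i].state = JOB_DONE; /* real work would go here */
			done++;
		}
	}
	return done;
}

/* Returns the state of the job with this token, or JOB_EMPTY if unknown. */
enum job_state app_job_state(uint32_t exec_token)
{
	for (size_t i = 0; i < APP_MAX_JOBS; i++) {
		if (jobs[i].state != JOB_EMPTY &&
		    jobs[i].exec_token == exec_token) {
			return jobs[i].state;
		}
	}
	return JOB_EMPTY;
}
```

This sketch never frees finished slots and can only cancel jobs that have not started. Cancelling a running tool would need cooperation from the tool itself.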
Testing
The implementation would include unit and integration tests to ensure PRs go as smoothly as possible.
Documentation
Full API documentation with descriptions of all configurable features, explanations of how the internal mechanisms work and suggestions on how to create examples using this library will be provided.
Dependencies
The first development phases don't change existing code, so this change won't affect anything or anyone else. Later phases might require modifications to the HTTP library, which would be handled through separate tickets/projects.
Concerns and Unresolved Questions
- Security is largely undefined and will be the focus for follow-up phases. MCP authorization, authentication of clients, UUID-based execution tokens to make it harder for malicious tools to respond to incorrect requests, and other considerations are planned.
- HTTP lib is limited and doesn't offer proper asynchronicity, which will have to be addressed in the future.
Alternatives Considered
- We did not find any C-based MCP server alternative mature enough
- We did not find any C-based MCP server alternative that would have a trustworthy long-term maintenance roadmap
- There is no Zephyr-based alternative