---
title: Sampling
type: docs
weight: 7
description: "MCP protocol specification for language model sampling and text generation"
draft: false
params:
  author: Anthropic
keywords: ["mcp", "sampling", "llm", "protocols"]
---

Sampling enables servers to request generations from a language model via the client. This flow gives clients control over which model is used and which prompts are accepted: clients can approve or reject incoming sampling requests and manage permissions around which servers may access which models. Servers can optionally request context from other MCP servers to be included in prompts. Because sampling requests travel from server to client, this is the only request type in MCP that flows in that direction.

## Capabilities

Clients indicate support for sampling by including a `sampling` capability in their `ClientCapabilities` during initialization. The `sampling` capability SHOULD be an empty object:

```json
{
  "capabilities": {
    "sampling": {}
  }
}
```

Servers SHOULD check for this capability before attempting to use sampling functionality.
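
As a minimal sketch of that check (the type and helper below are illustrative, not part of any particular SDK):

```typescript
// Minimal sketch of a capability check. `ClientCapabilities` mirrors the
// shape negotiated during initialization; the helper name is hypothetical.
interface ClientCapabilities {
  sampling?: Record<string, never>; // present (as an empty object) when supported
}

function supportsSampling(capabilities: ClientCapabilities): boolean {
  return capabilities.sampling !== undefined;
}
```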

## Concepts

### Sampling Request

A Sampling Request in the Model Context Protocol (MCP) represents a request from a server to generate text from a language model via the client. Each request contains messages to send to the model, optional system prompts, and sampling parameters like temperature and maximum tokens. The client has full discretion over which model to use and whether to approve the request.

### Message Content

Message content can be either text or images, allowing for multimodal interactions where supported by the model. Text content is provided directly as strings, while image content must be base64 encoded with an appropriate MIME type.
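
Sketched as TypeScript types, matching the field names used in the JSON examples below (the type names themselves are illustrative):

```typescript
// Content shapes as they appear in this document's JSON examples.
interface TextContent {
  type: "text";
  text: string;
}

interface ImageContent {
  type: "image";
  data: string;     // base64-encoded image bytes
  mimeType: string; // e.g. "image/jpeg"
}

type MessageContent = TextContent | ImageContent;

// A single entry in the `messages` array of a sampling request.
interface SamplingMessage {
  role: "user" | "assistant";
  content: MessageContent;
}
```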

## Use Cases

Common use cases for sampling include generating responses in chat interfaces, code completion, and content generation. Here are some example sampling scenarios:

### Chat Response

A server requesting a chat response:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "What is the capital of France?"
      }
    }
  ],
  "maxTokens": 100,
  "temperature": 0.7
}
```

### Image Analysis

A server requesting analysis of an image:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "image",
        "data": "base64_encoded_image_data",
        "mimeType": "image/jpeg"
      }
    },
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "Describe what you see in this image."
      }
    }
  ],
  "maxTokens": 200
}
```

## Diagram

The following diagram visualizes the sampling request flow between server and client:

```mermaid
sequenceDiagram
    participant Server
    participant Client
    participant User
    participant LLM

    Note over Server,Client: Server requests sampling
    Server->>Client: sampling/createMessage
    opt User approval
        Client->>User: Request approval
        User-->>Client: Approve request
    end
    Client->>LLM: Forward request
    LLM-->>Client: Generated response
    opt User approval
        Client->>User: Review response
        User-->>Client: Approve response
    end
    Client-->>Server: CreateMessageResult
```

## Messages

This section defines the protocol messages for sampling in the Model Context Protocol (MCP).

### Creating a Message

#### Request

To request sampling from an LLM via the client, the server MUST send a `sampling/createMessage` request.

Method: `sampling/createMessage`
Params:
  - `messages`: Array of `SamplingMessage` objects representing the conversation history
  - `systemPrompt`: Optional system prompt to use
  - `includeContext`: Optional request to include context from MCP servers
  - `temperature`: Optional sampling temperature
  - `maxTokens`: Maximum tokens to generate
  - `stopSequences`: Optional array of sequences that will stop generation
  - `metadata`: Optional provider-specific metadata
Example:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "What is the capital of France?"
        }
      }
    ],
    "systemPrompt": "You are a helpful assistant.",
    "maxTokens": 100,
    "temperature": 0.7,
    "includeContext": "none"
  }
}
```
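
As a rough sketch of how a server might assemble and send this request — the `Transport` interface and `nextRequestId` counter are hypothetical stand-ins for whatever JSON-RPC machinery the server actually uses:

```typescript
// Hypothetical transport abstraction; a real server would go through its
// session's JSON-RPC layer rather than hand-rolling the envelope.
interface Transport {
  send(message: unknown): Promise<unknown>;
}

let nextRequestId = 0;

async function requestChatResponse(transport: Transport): Promise<unknown> {
  return transport.send({
    jsonrpc: "2.0",
    id: ++nextRequestId,
    method: "sampling/createMessage",
    params: {
      messages: [
        {
          role: "user",
          content: { type: "text", text: "What is the capital of France?" },
        },
      ],
      systemPrompt: "You are a helpful assistant.",
      maxTokens: 100,
      temperature: 0.7,
      includeContext: "none",
    },
  });
}
```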

#### Response

The client MUST respond with a `CreateMessageResult` containing:

- `role`: The role of the message (always "assistant")
- `content`: The generated content
- `model`: The name of the model used
- `stopReason`: Why generation stopped

Example:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "The capital of France is Paris."
    },
    "model": "gpt-4",
    "stopReason": "endTurn"
  }
}
```
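
A client-side handler for this exchange might look roughly like the following; `promptUserForApproval` and `callModel` are hypothetical hooks into the host's approval UI and model provider, not part of any specific SDK:

```typescript
// Hypothetical host integration points.
declare function promptUserForApproval(params: unknown): Promise<boolean>;
declare function callModel(
  params: unknown
): Promise<{ text: string; modelName: string; stopReason?: string }>;

interface CreateMessageResult {
  role: "assistant";
  content: { type: "text"; text: string };
  model: string;
  stopReason: string;
}

async function handleCreateMessage(params: unknown): Promise<CreateMessageResult> {
  // Give the user a chance to reject the request before any model call.
  if (!(await promptUserForApproval(params))) {
    throw new Error("User rejected sampling request");
  }
  const completion = await callModel(params); // provider-specific
  return {
    role: "assistant",
    content: { type: "text", text: completion.text },
    model: completion.modelName,
    stopReason: completion.stopReason ?? "endTurn",
  };
}
```

In a real client, the thrown error would be mapped onto a JSON-RPC error response, as discussed under Error Handling below.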

## Error Handling

Clients MUST be prepared to handle both user rejection of sampling requests and model API errors. Common error scenarios include:

- User denies the sampling request
- Model API is unavailable
- Invalid sampling parameters
- Context length exceeded

The client SHOULD return appropriate error responses to the server in these cases.
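
For instance, a user rejection or bad parameters might be surfaced as JSON-RPC error responses along these lines (the application-level code for rejection is an assumption; only `-32602` is the standard JSON-RPC "invalid params" code):

```typescript
// Illustrative error responses for two of the scenarios above.
const userRejected = {
  jsonrpc: "2.0",
  id: 1,
  error: {
    code: -1, // assumed application-level code for "user denied the request"
    message: "User rejected sampling request",
  },
};

const invalidParams = {
  jsonrpc: "2.0",
  id: 2,
  error: {
    code: -32602, // standard JSON-RPC "invalid params"
    message: "maxTokens must be a positive integer",
  },
};
```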

## Security Considerations

Implementations MUST carefully consider the security implications of allowing servers to request model generations, including:

- User consent and approval of sampling requests
- Permissions around which servers can access which models
- Content filtering and moderation
- Rate limiting to prevent abuse (a sketch follows this list)
- Privacy considerations around included context
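
On the rate-limiting point, a client could apply a simple per-server budget before forwarding requests to the model. This fixed-window counter is only a sketch; the window size and request limit are arbitrary illustrative values:

```typescript
// Sketch of a fixed-window rate limiter keyed by server identity.
// Window size and request limit are arbitrary illustrative values.
const WINDOW_MS = 60_000;
const MAX_REQUESTS_PER_WINDOW = 10;

const windows = new Map<string, { start: number; count: number }>();

function allowSamplingRequest(serverId: string, now: number = Date.now()): boolean {
  const w = windows.get(serverId);
  if (!w || now - w.start >= WINDOW_MS) {
    windows.set(serverId, { start: now, count: 1 });
    return true;
  }
  if (w.count >= MAX_REQUESTS_PER_WINDOW) {
    return false; // over budget: reject or queue the request
  }
  w.count += 1;
  return true;
}
```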