Commit dbf5279

Revise sampling spec to define all valid request/response fields

1 parent 6e92d63 · commit dbf5279
File tree: 2 files changed (+50, -2 lines)


docs/legacy/concepts/sampling.mdx

Lines changed: 1 addition & 1 deletion
@@ -107,7 +107,7 @@ The client controls what context is actually included.
 
 Fine-tune the LLM sampling with:
 
-- `temperature`: Controls randomness (0.0 to 1.0)
+- `temperature`: Controls randomness in model responses. Higher values produce higher randomness, and lower values produce more stable output.
 - `maxTokens`: Maximum tokens to generate
 - `stopSequences`: Array of sequences that stop generation
 - `metadata`: Additional provider-specific parameters
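
For context, these tuning fields sit alongside `messages` in the `sampling/createMessage` request `params`, with `maxTokens` required and the rest optional. A minimal illustrative `params` object, with placeholder values not taken from this commit, might look like:

```json
{
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "Summarize the attached notes."
      }
    }
  ],
  "temperature": 0.7,
  "maxTokens": 200,
  "stopSequences": ["###"],
  "metadata": {}
}
```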

docs/specification/draft/client/sampling.mdx

Lines changed: 49 additions & 1 deletion
@@ -77,10 +77,13 @@ To request a language model generation, servers send a `sampling/createMessage`
           "name": "claude-3-sonnet"
         }
       ],
+      "costPriority": 0.3,
       "intelligencePriority": 0.8,
       "speedPriority": 0.5
     },
+    "temperature": 0.1,
     "systemPrompt": "You are a helpful assistant.",
+    "includeContext": "thisServer",
     "maxTokens": 100
   }
 }
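
Read together with the unchanged context, the full request this hunk produces might look like the following; the JSON-RPC envelope and the `messages` array are reconstructed for illustration and are not shown verbatim in this diff:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "What is the capital of France?"
        }
      }
    ],
    "modelPreferences": {
      "hints": [
        {
          "name": "claude-3-sonnet"
        }
      ],
      "costPriority": 0.3,
      "intelligencePriority": 0.8,
      "speedPriority": 0.5
    },
    "temperature": 0.1,
    "systemPrompt": "You are a helpful assistant.",
    "includeContext": "thisServer",
    "maxTokens": 100
  }
}
```

As the added spec text notes, the client remains free to modify or ignore `temperature`, `systemPrompt`, and `includeContext` without informing the server.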
@@ -136,7 +139,8 @@ sequenceDiagram
 
 ### Messages
 
-Sampling messages can contain:
+Sampling messages **MUST** contain a `role` field of `"user"` or `"assistant"`; and
+a `content` field representing the message data. `content` can contain:
 
 #### Text Content
 
@@ -214,6 +218,50 @@ The client processes these preferences to select an appropriate model from its available
 options. For instance, if the client doesn't have access to Claude models but has Gemini,
 it might map the sonnet hint to `gemini-1.5-pro` based on similar capabilities.
 
+### System Prompt
+
+The optional `systemPrompt` field allows servers to request a specific system prompt.
+The client **MAY** modify or ignore this field without communicating this to the server.
+
+### Context Inclusion
+
+The `includeContext` parameter specifies what context information the client is expected
+to include in its response:
+
+- `"none"`: No additional context.
+- `"thisServer"`: Include context from the requesting server.
+- `"allServers"`: Include context from all connected MCP servers.
+
+The client **MAY** modify or ignore this field without communicating this to the server.
+For example, a client could determine that respecting this field in a particular request
+would require sharing sensitive information with a server, and constrain its response
+accordingly.
+
+### Sampling Parameters
+
+LLM sampling can be fine-tuned with the following parameters:
+
+- `temperature`: Controls randomness in model responses. Higher values produce higher randomness, and lower values produce more stable output.
+- `maxTokens`: Maximum tokens to generate; required.
+- `stopSequences`: Array of sequences that stop generation.
+- `metadata`: Additional provider-specific parameters.
+
+The client **MAY** modify or ignore these fields (except for `maxTokens`) without
+communicating this to the server. For example, a client could use a model that does not
+support one or more of these parameters, and would therefore be unable to leverage them.
+
+### Result Fields
+
+Sampling results are [message](#messages) objects, and will contain the following fields:
+
+- `role`: The message role; see [Messages](#messages).
+- `content`: The message content; see [Messages](#messages).
+- `model`: The name of the model that generated the message.
+- `stopReason`: The reason why sampling stopped, if known. The specification defines the following (non-exhaustive) stop reasons, although implementations **MAY** provide their own arbitrary values:
+  - `"endTurn"`: The participant is yielding the conversation to the other party.
+  - `"stopSequence"`: Message generation encountered one of the requested `stopSequences`.
+  - `"maxTokens"`: The token limit was reached.
+
 ## Error Handling
 
 Clients **SHOULD** return errors for common failure cases:
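
Putting the newly documented result fields together, a complete `sampling/createMessage` response might look like this; the values are illustrative, and no result example is added in this hunk:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "The capital of France is Paris."
    },
    "model": "claude-3-sonnet-20240307",
    "stopReason": "endTurn"
  }
}
```

Since the stop-reason list is non-exhaustive, implementations may also return other `stopReason` values of their own.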
