Skip to content

Commit ae516a1

Browse files
authored
refactor: scaffolds anthropic messages structs for tracing (#1464)
**Description** This scaffolds the bare minimum data structures for Anthropic messages, which previously was not defined correctly. The sets the stage for tracing work, which will add more fields in addition to the structs defined here. **Related Issues/PRs (if applicable)** Preparation for #1389 --------- Signed-off-by: Takeshi Yoneda <[email protected]>
1 parent e0cbd6d commit ae516a1

12 files changed

+513
-539
lines changed

docs/why-not-official-sdk.md

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
## Why not use the "official SDKs"?
2+
3+
In [`internal/apischema`](../internal/apischema), we have defined our own data structures for various providers like OpenAI, Azure OpenAI, Anthropic, etc., for our translation logic as well as for observability purposes.
4+
5+
> Note that there might be some official SDK usage remaining in some places, which is not consistent with this explanation. We are planning to remove them gradually.
6+
7+
It is a FAQ to ask why we are not using the "official SDKs" provided by the providers themselves. The reasons why we are not using the "official SDKs" are:
8+
9+
- Cross provider compatibility we want is not a provider's concern. In other words, they do not care about edge cases we want to handle.
10+
- E.g. https://github.com/openai/openai-go/issues/484 explains even "official SDKs" might not be compatible with the real responses.
11+
- Maintaining a few struct for only supported endpoints in this project is not that hard. We are not trying to cover all endpoints provided by the providers.
12+
- Usually AI providers auto generate SDKs from an ambiguous OpenAPI spec (at least openai/anthropic), which has weird performance cost.
13+
- For example, Go's json marshal/unmarshal performance is not that good compared to other languages. Hence, we are looking for more optimized way to serialize/deserialize payloads.
14+
However, if the definition is auto generated from OpenAPI spec, it is hard to optimize the serialization/deserialization logic as well as sometimes makes it impossible to use the highly optimized libraries like `goccy/go-json`.
15+
- Using them makes it hard for us to add vendor specific fields. Sometimes we want to add vendor specific fields in a nested data structure, which is not possible if we rely on the external packages.
16+
17+
On the other hand, we use official SDKs for testing purposes to make sure our code works as expected.
18+
19+
Previous discussions:
20+
21+
- https://github.com/envoyproxy/ai-gateway/issues/995
22+
- https://github.com/envoyproxy/ai-gateway/pull/1147

internal/apischema/anthropic/anthropic.go

Lines changed: 310 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -3,32 +3,325 @@
33
// The full text of the Apache license is available in the LICENSE file at
44
// the root of the repo.
55

6-
// Package anthropic contains Anthropic API schema definitions using the official SDK types.
76
package anthropic
87

8+
import (
9+
"encoding/json"
10+
"fmt"
11+
"strings"
12+
13+
"github.com/tidwall/gjson"
14+
)
15+
916
// MessagesRequest represents a request to the Anthropic Messages API.
10-
// Uses a dictionary approach to handle any JSON structure flexibly.
11-
type MessagesRequest map[string]any
17+
// https://docs.claude.com/en/api/messages
18+
//
19+
// Note that we currently only have "passthrough-ish" translators for Anthropic,
20+
// so this struct only contains fields that are necessary for minimal processing
21+
// as well as for observability purposes on a best-effort basis.
22+
//
23+
// Notably, round trip idempotency is not guaranteed when using this struct.
24+
type MessagesRequest struct {
25+
// Model is the model to use for the request.
26+
Model string `json:"model,omitempty"`
1227

13-
// Helper methods to extract common fields from the dictionary
28+
// Messages is the list of messages in the conversation.
29+
// https://docs.claude.com/en/api/messages#body-messages
30+
Messages []Message `json:"messages"`
1431

15-
func (m MessagesRequest) GetModel() string {
16-
if model, ok := m["model"].(string); ok {
17-
return model
18-
}
19-
return ""
32+
// MaxTokens is the maximum number of tokens to generate.
33+
// https://docs.claude.com/en/api/messages#body-max-tokens
34+
MaxTokens int `json:"max_tokens,omitempty"`
35+
36+
// Container identifier for reuse across requests.
37+
// https://docs.claude.com/en/api/messages#body-container
38+
Container *Container `json:"container,omitempty"`
39+
40+
// ContextManagement is the context management configuration.
41+
// https://docs.claude.com/en/api/messages#body-context-management
42+
ContextManagement *ContextManagement `json:"context_management,omitempty"`
43+
44+
// MCPServers is the list of MCP servers.
45+
// https://docs.claude.com/en/api/messages#body-mcp-servers
46+
MCPServers []MCPServer `json:"mcp_servers,omitempty"`
47+
48+
// Metadata is the metadata for the request.
49+
// https://docs.claude.com/en/api/messages#body-metadata
50+
Metadata *MessagesMetadata `json:"metadata,omitempty"`
51+
52+
// ServiceTier indicates the service tier for the request.
53+
// https://docs.claude.com/en/api/messages#body-service-tier
54+
ServiceTier *MessageServiceTier `json:"service_tier,omitempty"`
55+
56+
// StopSequences is the list of stop sequences.
57+
// https://docs.claude.com/en/api/messages#body-stop-sequences
58+
StopSequences []string `json:"stop_sequences,omitempty"`
59+
60+
// System is the system prompt to guide the model's behavior.
61+
// https://docs.claude.com/en/api/messages#body-system
62+
System *SystemPrompt `json:"system,omitempty"`
63+
64+
// Temperature controls the randomness of the output.
65+
Temperature *float64 `json:"temperature,omitempty"`
66+
67+
// Thinking is the configuration for the model's "thinking" behavior.
68+
// https://docs.claude.com/en/api/messages#body-thinking
69+
Thinking *Thinking `json:"thinking,omitempty"`
70+
71+
// ToolChoice indicates the tool choice for the model.
72+
// https://docs.claude.com/en/api/messages#body-tool-choice
73+
ToolChoice *ToolChoice `json:"tool_choice,omitempty"`
74+
75+
// Tools is the list of tools available to the model.
76+
// https://docs.claude.com/en/api/messages#body-tools
77+
Tools []Tool `json:"tools,omitempty"`
78+
79+
// Stream indicates whether to stream the response.
80+
Stream bool `json:"stream,omitempty"`
81+
82+
// TopP is the cumulative probability for nucleus sampling.
83+
TopP *float64 `json:"top_p,omitempty"`
84+
85+
// TopK is the number of highest probability vocabulary tokens to keep for top-k-filtering.
86+
TopK *int `json:"top_k,omitempty"`
87+
}
88+
89+
// Message represents a single message in the Anthropic Messages API.
90+
// https://docs.claude.com/en/api/messages#body-messages
91+
type Message struct {
92+
// Role is the role of the message.
93+
Role MessageRole `json:"role"`
94+
95+
// Content is the content of the message.
96+
Content MessageContent `json:"content"`
97+
}
98+
99+
// MessageRole represents the role of a message in the Anthropic Messages API.
100+
// https://docs.claude.com/en/api/messages#body-messages-role
101+
type MessageRole string
102+
103+
const (
104+
MessageRoleUser MessageRole = "user"
105+
MessageRoleAssistant MessageRole = "assistant"
106+
)
107+
108+
// MessageContent represents the content of a message in the Anthropic Messages API.
109+
// https://docs.claude.com/en/api/messages#body-messages-content
110+
type MessageContent struct {
111+
Text string // Non-empty iif this is not array content.
112+
Array []MessageContentArrayElement // Non-empty iif this is array content.
20113
}
21114

22-
func (m MessagesRequest) GetMaxTokens() int {
23-
if maxTokens, ok := m["max_tokens"].(float64); ok {
24-
return int(maxTokens)
115+
// MessageContentArrayElement represents an element of the array content in a message.
116+
// https://docs.claude.com/en/api/messages#body-messages-content
117+
type MessageContentArrayElement struct{} // TODO when we need it for observability, etc.
118+
119+
func (m *MessageContent) UnmarshalJSON(data []byte) error {
120+
// Try to unmarshal as string first.
121+
var text string
122+
if err := json.Unmarshal(data, &text); err == nil {
123+
m.Text = text
124+
return nil
25125
}
26-
return 0
126+
127+
// Try to unmarshal as array of MessageContentArrayElement.
128+
var array []MessageContentArrayElement
129+
if err := json.Unmarshal(data, &array); err == nil {
130+
m.Array = array
131+
return nil
132+
}
133+
return fmt.Errorf("message content must be either string or array")
134+
}
135+
136+
// MessagesMetadata represents the metadata for the Anthropic Messages API request.
137+
// https://docs.claude.com/en/api/messages#body-metadata
138+
type MessagesMetadata struct {
139+
// UserID is an optional user identifier for tracking purposes.
140+
UserID *string `json:"user_id,omitempty"`
27141
}
28142

29-
func (m MessagesRequest) GetStream() bool {
30-
if stream, ok := m["stream"].(bool); ok {
31-
return stream
143+
// MessageServiceTier represents the service tier for the Anthropic Messages API request.
144+
//
145+
// https://docs.claude.com/en/api/messages#body-service-tier
146+
type MessageServiceTier string
147+
148+
const (
149+
MessageServiceTierAuto MessageServiceTier = "auto"
150+
MessageServiceTierStandardOnly MessageServiceTier = "standard_only"
151+
)
152+
153+
// Container represents a container identifier for reuse across requests.
154+
// https://docs.claude.com/en/api/messages#body-container
155+
type Container struct{} // TODO when we need it for observability, etc.
156+
157+
// Tool represents a tool available to the model.
158+
// https://docs.claude.com/en/api/messages#body-tools
159+
type Tool struct{} // TODO when we need it for observability, etc.
160+
161+
// ToolChoice represents the tool choice for the model.
162+
// https://docs.claude.com/en/api/messages#body-tool-choice
163+
type ToolChoice struct{} // TODO when we need it for observability, etc.
164+
165+
// Thinking represents the configuration for the model's "thinking" behavior.
166+
// https://docs.claude.com/en/api/messages#body-thinking
167+
type Thinking struct{} // TODO when we need it for observability, etc.
168+
169+
// SystemPrompt represents a system prompt to guide the model's behavior.
170+
// https://docs.claude.com/en/api/messages#body-system
171+
type SystemPrompt struct{} // TODO when we need it for observability, etc.
172+
173+
// MCPServer represents an MCP server.
174+
// https://docs.claude.com/en/api/messages#body-mcp-servers
175+
type MCPServer struct{} // TODO when we need it for observability, etc.
176+
177+
// ContextManagement represents the context management configuration.
178+
// https://docs.claude.com/en/api/messages#body-context-management
179+
type ContextManagement struct{} // TODO when we need it for observability, etc.
180+
181+
// MessagesResponse represents a response from the Anthropic Messages API.
182+
// https://docs.claude.com/en/api/messages
183+
type MessagesResponse struct {
184+
// ID is the unique identifier for the response.
185+
// https://docs.claude.com/en/api/messages#response-id
186+
ID string `json:"id"`
187+
// Type is the type of the response.
188+
// This is always "messages".
189+
//
190+
// https://docs.claude.com/en/api/messages#response-type
191+
Type ConstantMessagesResponseTypeMessages `json:"type"`
192+
// Role is the role of the message in the response.
193+
// This is always "assistant".
194+
//
195+
// https://docs.claude.com/en/api/messages#response-role
196+
Role ConstantMessagesResponseRoleAssistant `json:"role"`
197+
// Content is the content of the message in the response.
198+
// https://docs.claude.com/en/api/messages#response-content
199+
Content []MessagesContentBlock `json:"content"`
200+
// Model is the model used for the response.
201+
// https://docs.claude.com/en/api/messages#response-model
202+
Model string `json:"model"`
203+
// StopReason is the reason for stopping the generation.
204+
// https://docs.claude.com/en/api/messages#response-stop-reason
205+
StopReason *StopReason `json:"stop_reason,omitempty"`
206+
// StopSequence is the stop sequence that was encountered.
207+
// https://docs.claude.com/en/api/messages#response-stop-sequence
208+
StopSequence *string `json:"stop_sequence,omitempty"`
209+
// Usage contains token usage information for the response.
210+
// https://docs.claude.com/en/api/messages#response-usage
211+
Usage *Usage `json:"usage,omitempty"`
212+
}
213+
214+
// ConstantMessagesResponseTypeMessages is the constant type for MessagesResponse, which is always "messages".
215+
type ConstantMessagesResponseTypeMessages string
216+
217+
// ConstantMessagesResponseRoleAssistant is the constant role for MessagesResponse, which is always "assistant".
218+
type ConstantMessagesResponseRoleAssistant string
219+
220+
// MessagesContentBlock represents a block of content in the Anthropic Messages API response.
221+
// https://docs.claude.com/en/api/messages#response-content
222+
type MessagesContentBlock struct{} // TODO when we need it for observability, etc.
223+
224+
// StopReason represents the reason for stopping the generation.
225+
// https://docs.claude.com/en/api/messages#response-stop-reason
226+
type StopReason string
227+
228+
const (
229+
StopReasonEndTurn StopReason = "end_turn"
230+
StopReasonMaxTokens StopReason = "max_tokens"
231+
StopReasonStopSequence StopReason = "stop_sequence"
232+
StopReasonToolUse StopReason = "tool_use"
233+
StopReasonPauseTurn StopReason = "pause_turn"
234+
StopReasonRefusal StopReason = "refusal"
235+
StopReasonModelContextWindowExceeded StopReason = "model_context_window_exceeded"
236+
)
237+
238+
// Usage represents token usage information for the Anthropic Messages API response.
239+
// https://docs.claude.com/en/api/messages#response-usage
240+
//
241+
// NOTE: all of them are float64 in the API, although they are always integers in practice.
242+
// However, the documentation doesn't explicitly state that they are integers in its format,
243+
// so we use float64 to be able to unmarshal both 1234 and 1234.0 without errors.
244+
type Usage struct {
245+
// The number of input tokens used to create the cache entry.
246+
CacheCreationInputTokens float64 `json:"cache_creation_input_tokens"`
247+
// The number of input tokens read from the cache.
248+
CacheReadInputTokens float64 `json:"cache_read_input_tokens"`
249+
// The number of input tokens which were used.
250+
InputTokens float64 `json:"input_tokens"`
251+
// The number of output tokens which were used.
252+
OutputTokens float64 `json:"output_tokens"`
253+
254+
// TODO: there are other fields that are currently not used in the project.
255+
}
256+
257+
// MessagesStreamEvent represents a single event in the streaming response from the Anthropic Messages API.
258+
// https://docs.claude.com/en/docs/build-with-claude/streaming
259+
type MessagesStreamEvent struct {
260+
// The type of the streaming event.
261+
Type MessagesStreamEventType `json:"type"`
262+
// MessageStart is present if the event type is "message_start" or "message_delta".
263+
MessageStart *MessagesStreamEventMessageStart
264+
// MessageDelta is present if the event type is "message_delta".
265+
MessageDelta *MessagesStreamEventMessageDelta
266+
}
267+
268+
// MessagesStreamEventType represents the type of a streaming event in the Anthropic Messages API.
269+
// https://docs.claude.com/en/docs/build-with-claude/streaming#event-types
270+
type MessagesStreamEventType string
271+
272+
const (
273+
MessagesStreamEventTypeMessageStart MessagesStreamEventType = "message_start"
274+
MessagesStreamEventTypeMessageDelta MessagesStreamEventType = "message_delta"
275+
MessagesStreamEventTypeMessageStop MessagesStreamEventType = "message_stop"
276+
MessagesStreamEventTypeContentBlockStart MessagesStreamEventType = "content_block_start"
277+
MessagesStreamEventTypeContentBlockDelta MessagesStreamEventType = "content_block_delta"
278+
MessagesStreamEventTypeContentBlockStop MessagesStreamEventType = "content_block_stop"
279+
)
280+
281+
// MessagesStreamEventMessageStart represents the message content in a "message_start".
282+
type MessagesStreamEventMessageStart MessagesResponse
283+
284+
// MessagesStreamEventMessageDelta represents the message content in a "message_delta".
285+
//
286+
// Note: the definition of this event is vague in the Anthropic documentation.
287+
// This follows the same code from their official SDK.
288+
// https://github.com/anthropics/anthropic-sdk-go/blob/3a0275d6034e4eda9fbc8366d8a5d8b3a462b4cc/message.go#L2424-L2451
289+
type MessagesStreamEventMessageDelta struct {
290+
// Delta contains the delta information for the message.
291+
// This is cumulative per documentation.
292+
Usage Usage `json:"usage"`
293+
Delta MessagesStreamEventMessageDeltaDelta `json:"delta"`
294+
}
295+
296+
type MessagesStreamEventMessageDeltaDelta struct {
297+
StopReason StopReason `json:"stop_reason"`
298+
StopSequence *string `json:"stop_sequence,omitempty"`
299+
}
300+
301+
func (m *MessagesStreamEvent) UnmarshalJSON(data []byte) error {
302+
eventType := gjson.GetBytes(data, "type")
303+
if !eventType.Exists() {
304+
return fmt.Errorf("missing type field in stream event")
305+
}
306+
m.Type = MessagesStreamEventType(eventType.String())
307+
switch m.Type {
308+
case MessagesStreamEventTypeMessageStart:
309+
messageBytes := gjson.GetBytes(data, "message")
310+
r := strings.NewReader(messageBytes.Raw)
311+
decoder := json.NewDecoder(r)
312+
var message MessagesStreamEventMessageStart
313+
if err := decoder.Decode(&message); err != nil {
314+
return fmt.Errorf("failed to unmarshal message in stream event: %w", err)
315+
}
316+
m.MessageStart = &message
317+
case MessagesStreamEventTypeMessageDelta:
318+
var messageDelta MessagesStreamEventMessageDelta
319+
if err := json.Unmarshal(data, &messageDelta); err != nil {
320+
return fmt.Errorf("failed to unmarshal message delta in stream event: %w", err)
321+
}
322+
m.MessageDelta = &messageDelta
323+
default:
324+
// TODO: handle other event types if needed.
32325
}
33-
return false
326+
return nil
34327
}

0 commit comments

Comments
 (0)