
Commit 4095c2e

Support of reasoning
1 parent a46e329 commit 4095c2e

4 files changed: +165 −9 lines changed

4 files changed

+165
-9
lines changed

README.md

Lines changed: 5 additions & 4 deletions
@@ -24,10 +24,11 @@ If you find this GitHub repository useful, please consider giving it a free star
 - [x] Support streaming response via server-sent events (SSE)
 - [x] Support Model APIs
 - [x] Support Chat Completion APIs
-- [x] Support Tool Call (**new**)
-- [x] Support Embedding API (**new**)
-- [x] Support Multimodal API (**new**)
-- [x] Support Cross-Region Inference (**new**)
+- [x] Support Tool Call
+- [x] Support Embedding API
+- [x] Support Multimodal API
+- [x] Support Cross-Region Inference
+- [x] Support Reasoning (**new**)

 Please check [Usage Guide](./docs/Usage.md) for more details about how to use the new APIs.

README_CN.md

Lines changed: 5 additions & 4 deletions
@@ -26,10 +26,11 @@ integrate seamlessly with OpenAI APIs or SDKs and try Amazon Bedrock models without needing to
 - [x] Support streaming responses via server-sent events (SSE)
 - [x] Support Model APIs
 - [x] Support Chat Completion APIs
-- [x] Support Tool Call (**new**)
-- [x] Support Embedding API (**new**)
-- [x] Support Multimodal API (**new**)
-- [x] Support Cross-Region Inference (**new**)
+- [x] Support Tool Call
+- [x] Support Embedding API
+- [x] Support Multimodal API
+- [x] Support Cross-Region Inference
+- [x] Support Reasoning Mode (**new**)

 Please check the [Usage Guide](./docs/Usage_CN.md) for more details on how to use the new APIs.

docs/Usage.md

Lines changed: 76 additions & 0 deletions
@@ -317,3 +317,79 @@ curl $OPENAI_BASE_URL/chat/completions \
You can try it with different questions, such as:

1. Hello, who are you? (No tools are needed)
2. What is the weather like today? (Should use get_current_location tool first)

## Reasoning

**Important Notice**: Please carefully review the following points before using reasoning mode with the Chat Completion API.

- The only model that supports reasoning so far is Claude 3.7 Sonnet (extended thinking). Please make sure the model you use supports reasoning.
- Reasoning mode (also called thinking mode) is not enabled by default; you must pass the additional `reasoning_effort` parameter in your request.
- The reasoning response (CoT, thoughts) is returned in an additional `reasoning_content` field, which is not officially supported by OpenAI. This follows the [DeepSeek Reasoning Model](https://api-docs.deepseek.com/guides/reasoning_model#api-example) convention and may change in the future.
- Please provide an appropriate `max_tokens` (or `max_completion_tokens`) value in your request. The `budget_tokens` value is derived from `reasoning_effort` (low: 30%, medium: 60%, high: 100% of max tokens), with a minimum `budget_tokens` of 1,024; Anthropic recommends at least 4,000 tokens for comprehensive reasoning. Check the [Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-37.html) for more details. A small sketch of this calculation follows this list.

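To make the effort-to-budget mapping concrete, here is a minimal sketch of the rule above (the percentages plus the 1,024-token floor). It is an illustration only, not the gateway's actual code, and the helper name is hypothetical:

```python
# Hypothetical helper illustrating the budget rule described above;
# not the gateway's actual implementation.
def estimate_budget_tokens(reasoning_effort: str, max_tokens: int) -> int:
    ratios = {"low": 0.3, "medium": 0.6, "high": 1.0}
    # budget_tokens has a floor of 1,024; Anthropic recommends 4,000+ tokens
    # for comprehensive reasoning.
    return max(1024, int(max_tokens * ratios[reasoning_effort]))

print(estimate_budget_tokens("low", 4096))     # 1228
print(estimate_budget_tokens("medium", 4096))  # 2457
```
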
**Example Request**
```bash
curl $OPENAI_BASE_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "messages": [
      {
        "role": "user",
        "content": "which one is bigger, 3.9 or 3.11?"
      }
    ],
    "max_completion_tokens": 4096,
    "reasoning_effort": "low",
    "stream": false
  }'
```

**Example Response**
```json
{
  "id": "chatcmpl-83fb7a88",
  "created": 1740545278,
  "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "system_fingerprint": "fp",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "3.9 is bigger than 3.11.\n\nWhen comparing decimal numbers, we need to understand what these numbers actually represent:...",
        "reasoning_content": "I need to compare the decimal numbers 3.9 and 3.11.\n\nFor decimal numbers, we first compare the whole number parts, and if they're equal, we compare the decimal parts. \n\nBoth numbers ..."
      }
    }
  ],
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 51,
    "completion_tokens": 565,
    "total_tokens": 616
  }
}
```

You can also use the OpenAI SDK (run `pip3 install -U openai` first):

```python
from openai import OpenAI

client = OpenAI()

messages = [{"role": "user", "content": "which one is bigger, 3.9 or 3.11?"}]
response = client.chat.completions.create(
    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=messages,
    reasoning_effort="low",
    max_completion_tokens=4096,
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content
```
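
Streaming is not covered by the examples in this commit. If the gateway mirrors DeepSeek's streaming convention for reasoning models (an assumption, not confirmed by this diff), the thinking tokens would arrive on `delta.reasoning_content` separately from `delta.content`, roughly like this:

```python
# Sketch only: assumes streamed chunks expose reasoning_content on the delta,
# as in DeepSeek's reasoning API; this is not confirmed by this commit.
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[{"role": "user", "content": "which one is bigger, 3.9 or 3.11?"}],
    reasoning_effort="low",
    max_completion_tokens=4096,
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    # Thinking tokens and answer tokens arrive in separate fields.
    if getattr(delta, "reasoning_content", None):
        print(delta.reasoning_content, end="", flush=True)
    elif delta.content:
        print(delta.content, end="", flush=True)
```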

docs/Usage_CN.md

Lines changed: 79 additions & 1 deletion
@@ -313,4 +313,82 @@ curl $OPENAI_BASE_URL/chat/completions \
You can try it with different questions, such as:

1. Hello, who are you? (No tools are needed)
2. What is the weather like today? (Should use get_current_location tool first)

## Reasoning

**Important**: Please carefully review the following points before using reasoning mode.

- Only the Claude 3.7 Sonnet model supports reasoning so far. Please make sure the model you use supports reasoning.
- Reasoning mode (also called thinking mode) is not enabled by default; you must pass the additional `reasoning_effort` parameter in your request. Allowed values are `low`, `medium`, and `high`.
- The reasoning output (chain of thought, thinking process) is returned in an additional `reasoning_content` field, which is not an officially supported OpenAI format. This design follows the [DeepSeek Reasoning Model](https://api-docs.deepseek.com/guides/reasoning_model#api-example) convention and may change in the future.
- Please provide an appropriate `max_tokens` (or `max_completion_tokens`) value in your request. The `budget_tokens` value is derived from `reasoning_effort` (low: 30%, medium: 60%, high: 100% of max tokens), with a minimum `budget_tokens` of 1,024; Anthropic recommends at least 4,000 tokens for comprehensive reasoning. See the [Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-37.html) for more details.

**Example Request**

```bash
curl $OPENAI_BASE_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "messages": [
      {
        "role": "user",
        "content": "which one is bigger, 3.9 or 3.11?"
      }
    ],
    "max_completion_tokens": 4096,
    "reasoning_effort": "low",
    "stream": false
  }'
```

**Example Response**

```json
{
  "id": "chatcmpl-83fb7a88",
  "created": 1740545278,
  "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "system_fingerprint": "fp",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "3.9 is bigger than 3.11.\n\nWhen comparing decimal numbers, we need to understand what these numbers actually represent:...",
        "reasoning_content": "I need to compare the decimal numbers 3.9 and 3.11.\n\nFor decimal numbers, we first compare the whole number parts, and if they're equal, we compare the decimal parts. \n\nBoth numbers ..."
      }
    }
  ],
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 51,
    "completion_tokens": 565,
    "total_tokens": 616
  }
}
```

Or use the OpenAI SDK (run `pip3 install -U openai` first to upgrade to the latest version):

```python
from openai import OpenAI

client = OpenAI()

messages = [{"role": "user", "content": "which one is bigger, 3.9 or 3.11?"}]
response = client.chat.completions.create(
    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=messages,
    reasoning_effort="low",
    max_completion_tokens=4096,
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content
```
