Refactor agent API to support OpenAI message format #137
Merged
Changes from 2 commits; 6 commits in total:

- a649a9c Refactor agent API to support OpenAI message format (virrius)
- 1422cf2 Add MessagesList model with base64 image URL truncation (virrius)
- eea650c docs: update API documentation for ChatCompletionMessageParam format (hijera)
- 52644d0 docs: update image examples to use local files instead of URLs (hijera)
- 4d5119f some fixes (hijera)
- e0a455b research_with_messages -> research_with_images (hijera)
@@ -0,0 +1,96 @@

# Research with Messages

Example demonstrating the use of `task_messages` with multimodal content in OpenAI format.

## Description

This example shows how to send messages with images to SGR Agent Core API. The API accepts messages in OpenAI format, allowing you to include multimodal content (text and images) in your requests.

## Prerequisites

1. SGR Agent Core API server must be running:

```bash
sgr --config-file examples/sgr_deep_research/config.yaml
```

2. The server should be accessible at `http://localhost:8010/v1`

## Usage

Run the example script:

```bash
python examples/research_with_messages/research_with_messages.py
```

## Example: Message with Image

The example demonstrates sending a message with both text and an image:

```python
from openai import OpenAI
import base64
from pathlib import Path

client = OpenAI(base_url="http://localhost:8010/v1", api_key="dummy")


def encode_image(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


image_path = Path(__file__).parent / "sgr_concept.png"
base64_image = encode_image(str(image_path))

response = client.chat.completions.create(
    model="custom_research_agent",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "This is the SGR Agent Core architecture diagram. Explain how Schema-Guided Reasoning works based on this diagram.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_image}"},
                },
            ],
        },
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

## How It Works

1. **Image Encoding**: The image file is read and encoded to base64 format
2. **Multimodal Content**: The message content is a list containing both text and image parts
3. **Message Format**: The message follows OpenAI's multimodal message format with `type: "text"` and `type: "image_url"`
4. **Agent Processing**: The agent receives the complete message including the image and can analyze it
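
The first three steps can be folded into a pair of small helpers. The following is only a sketch: `image_to_data_url` and `build_user_message` are hypothetical names that do not appear in this PR, and picking the MIME type from the file extension with `mimetypes` is an assumption (the example script hardcodes `image/png`).

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path: Path) -> str:
    """Read an image file and return it as a base64 data URL."""
    # Assumption: the file extension is a reliable hint for the MIME type.
    mime_type, _ = mimetypes.guess_type(image_path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognised image file: {image_path.name}")
    encoded = base64.b64encode(image_path.read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def build_user_message(text: str, image_path: Path) -> dict:
    """Combine one text part and one image part into an OpenAI-style user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_to_data_url(image_path)}},
        ],
    }
```

The returned dict can be placed directly in the `messages` list passed to `client.chat.completions.create`, in place of the hand-written user message.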

## Message Format

Messages follow OpenAI's `ChatCompletionMessageParam` format:

- `role`: One of `"system"`, `"user"`, `"assistant"`, or `"tool"`
- `content`: Can be:
  - A string for text-only messages
  - A list of content parts for multimodal messages:
    - `{"type": "text", "text": "..."}` for text
    - `{"type": "image_url", "image_url": {"url": "..."}}` for images
- Optional fields: `name`, `tool_calls`, `tool_call_id`
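
To make these shapes concrete, here is a short conversation that uses each role, together with a minimal shape check. This is a sketch under assumptions: `check_message` is a hypothetical helper, not the validation the server performs, and the tool name `web_search` and the id `call_1` are made up for illustration.

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def check_message(message: dict) -> None:
    """Raise ValueError if a message does not match the shapes listed above."""
    if message.get("role") not in ALLOWED_ROLES:
        raise ValueError(f"unexpected role: {message.get('role')!r}")
    content = message.get("content")
    if content is None or isinstance(content, str):
        return  # text-only message, or an assistant message that only carries tool_calls
    if not isinstance(content, list):
        raise ValueError("content must be a string or a list of content parts")
    for part in content:
        if not isinstance(part, dict):
            raise ValueError(f"content part must be a dict: {part!r}")
        if part.get("type") == "text" and isinstance(part.get("text"), str):
            continue
        image = part.get("image_url")
        if part.get("type") == "image_url" and isinstance(image, dict) and isinstance(image.get("url"), str):
            continue
        raise ValueError(f"unsupported content part: {part!r}")


messages = [
    {"role": "system", "content": "You are a research assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this diagram show?"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}},
        ],
    },
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",  # hypothetical id
                "type": "function",
                "function": {"name": "web_search", "arguments": "{\"query\": \"SGR\"}"},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "Search results go here."},
]

for message in messages:
    check_message(message)
```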

## Notes

- Images must be base64-encoded and prefixed with the data URI scheme (`data:image/png;base64,`)
- The agent receives all messages as-is in `task_messages`
- Prompts are added as separate messages at the end of the context
- All message content is preserved and passed to the agent
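
Because the base64 payload travels with the message unchanged, printing or logging such messages quickly becomes unwieldy. The commit list mentions a `MessagesList` model with base64 image URL truncation; its code is not visible in this view, so the sketch below is not that implementation. It only illustrates one way to shorten payloads for display, on a copy, so that what is sent to the agent stays intact. `truncate_image_urls` and its `keep` parameter are hypothetical.

```python
import copy

DATA_URL_MARKER = ";base64,"


def truncate_image_urls(messages: list[dict], keep: int = 16) -> list[dict]:
    """Return a deep copy of messages with long base64 payloads shortened for display."""
    shortened = copy.deepcopy(messages)  # the originals are never modified
    for message in shortened:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content has no image parts
        for part in content:
            if part.get("type") != "image_url":
                continue
            url = part["image_url"]["url"]
            head, marker, payload = url.partition(DATA_URL_MARKER)
            # Only data URLs with a long payload are shortened; other URLs pass through.
            if marker and len(payload) > keep:
                omitted = len(payload) - keep
                part["image_url"]["url"] = f"{head}{marker}{payload[:keep]}...[{omitted} chars omitted]"
    return shortened
```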
Empty file.
@@ -0,0 +1,42 @@

```python
import base64
from pathlib import Path

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8010/v1", api_key="dummy")


def encode_image(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


image_path = Path(__file__).parent / "sgr_concept.png"
base64_image = encode_image(str(image_path))

response = client.chat.completions.create(
    model="custom_research_agent",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "This is the SGR Agent Core architecture diagram. "
                        "Explain how Schema-Guided Reasoning works based on this diagram."
                    ),
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_image}"},
                },
            ],
        },
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
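
The streaming loop at the end of the script indexes `choices[0]` on every chunk. Whether the SGR server ever sends a chunk with an empty `choices` list is not stated here, so the following is only a defensive variant, sketched with stand-in objects instead of a live server. `iter_text` and `fake_chunk` are hypothetical helpers.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the non-empty text deltas from a stream of chat completion chunks."""
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices at all
        text = chunk.choices[0].delta.content
        if text:
            yield text


def fake_chunk(text):
    """Build a stand-in object with the same attribute layout the script reads."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


stream = [fake_chunk("Schema-"), SimpleNamespace(choices=[]), fake_chunk(None), fake_chunk("Guided")]
answer = "".join(iter_text(stream))
print(answer)  # prints "Schema-Guided"
```

With a real response, the loop would read `for text in iter_text(response): print(text, end="")`.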
@@ -1,5 +1 @@
Current Date: {current_date} (Year-Month-Day ISO format: YYYY-MM-DD HH:MM:SS)

CLARIFICATIONS:

{clarifications}
@@ -1,4 +1 @@
Current Date: {current_date} (Year-Month-Day ISO format: YYYY-MM-DD HH:MM:SS)
ORIGINAL USER REQUEST:

{task}
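
The placeholders in these templates (`{current_date}`, `{task}`, `{clarifications}`) use the same syntax as Python's `str.format`. How the project actually renders them, and which of the lines above survive the change, is not visible in this view. As a rough sketch under that assumption, filling the second template could look like this; `INITIAL_REQUEST_TEMPLATE` and `render_prompt` are hypothetical names.

```python
from datetime import datetime

# Template text copied from the lines shown in the hunk above.
INITIAL_REQUEST_TEMPLATE = (
    "Current Date: {current_date} (Year-Month-Day ISO format: YYYY-MM-DD HH:MM:SS)\n"
    "ORIGINAL USER REQUEST:\n"
    "\n"
    "{task}"
)


def render_prompt(template: str, **values: str) -> str:
    """Fill the named placeholders; a missing placeholder value raises KeyError."""
    return template.format(**values)


now = datetime(2025, 1, 2, 3, 4, 5)  # fixed timestamp so the output is reproducible
rendered = render_prompt(
    INITIAL_REQUEST_TEMPLATE,
    current_date=now.strftime("%Y-%m-%d %H:%M:%S"),
    task="Explain how Schema-Guided Reasoning works.",
)
print(rendered)
```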
I suggest renaming the example from research_with_messages to something like research_with_images, because that describes what is going on a bit more accurately. The same applies everywhere else the name appears.