ollama: default to Responses API for built-ins #8798
Conversation
This is an alternate PR solving the same problem as <openai#8227>. In this PR, when Ollama is used via `--oss` (or via `model_provider = "ollama"`), we default it to the Responses format. At runtime, we do an Ollama version check, and if the version is older than when Responses support was added to Ollama, we print a warning. Because there's no way to configure the wire API for a built-in provider, we temporarily add a new `oss_provider`/`model_provider` called `"ollama-chat"` that forces the chat format. Once the `"chat"` format is fully removed (see <openai#7782>), `ollama-chat` can be removed as well.
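As a rough sketch of the behavior described above (only the provider IDs `"ollama"` / `"ollama-chat"` come from this PR; `WireApi`, the function names, and the version comparison are illustrative stand-ins, not the actual Codex internals):

```rust
// Illustrative only: these types and helpers are stand-ins for the real code.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum WireApi {
    Responses,
    Chat,
}

// Built-in Ollama providers default to the Responses wire API; the temporary
// "ollama-chat" provider is the escape hatch that forces the legacy chat format.
fn wire_api_for_ollama_provider(provider_id: &str) -> WireApi {
    match provider_id {
        "ollama-chat" => WireApi::Chat,
        _ => WireApi::Responses,
    }
}

// At runtime the detected Ollama version is compared against the first version
// that shipped Responses support (the cutoff is supplied by the caller here);
// an older server gets a warning. Tuples compare lexicographically in Rust.
fn predates_responses_support(
    detected: (u64, u64, u64),
    min_supported: (u64, u64, u64),
) -> bool {
    detected < min_supported
}
```

The key point is that the default flips to Responses for both `--oss` and `model_provider = "ollama"`, while `"ollama-chat"` stays around only until the chat format is removed.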
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 188273a3bb
bolinfest left a comment
@drifkin I have to run to dinner, so I went through this quickly, but I only found one small issue, so this seems promising! Much better than the mut Config stuff from before!
```rust
/// Returns a deprecation notice if Ollama doesn't support the responses wire API.
pub async fn ollama_chat_deprecation_notice(
    config: &Config,
) -> io::Result<Option<DeprecationNoticeEvent>> {
```
This function returns `Result`, but it always returns the `Ok` variant, never `Err`.
If `detect_wire_api` returns `Err`, should that flow through?
Because if only `Ok` is possible, the `Result` wrapper seems unnecessary.
Just pushed a new commit that makes it flow through and logs a warning. I don't think it should propagate as far as preventing startup (because the call itself is just a best-effort explanation for users on older versions).
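A minimal sketch of that resolution, assuming stand-in types and a hypothetical call site (only the helper's signature mirrors the hunk quoted above): the probe error flows out of the helper, and the caller downgrades it to a warning instead of failing startup.

```rust
use std::io;

// Stand-in types; the real ones live elsewhere in the codebase.
struct Config;
struct DeprecationNoticeEvent;

// Hypothetical helper matching the quoted signature: a failed probe now
// surfaces as Err instead of being swallowed inside the function.
async fn ollama_chat_deprecation_notice(
    _config: &Config,
) -> io::Result<Option<DeprecationNoticeEvent>> {
    Ok(None)
}

// At the call site the check stays best-effort: an Err is logged as a
// warning and startup continues without a notice.
async fn maybe_ollama_notice(config: &Config) -> Option<DeprecationNoticeEvent> {
    match ollama_chat_deprecation_notice(config).await {
        Ok(notice) => notice,
        Err(err) => {
            eprintln!("warning: failed to check Ollama wire API support: {err}");
            None
        }
    }
}
```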