Simplify configure #495
Conversation
Force-pushed from ede5246 to 49aed18
Code Review
This pull request refactors and simplifies the model configuration logic. The introduction of configure_flags.go to centralize flag handling for docker model configure is a great improvement for maintainability. The simplification of the compose up command and the think parameter handling across the CLI and Ollama API layer are also positive changes.
I've found one issue in cmd/cli/commands/compose.go where the behavior for handling the --context-size flag has unintentionally changed. My review includes a suggestion to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- In `compose up`, switching from `cmd.Flags().Changed("context-size")` to always constructing `ContextSize` means a value of 0 will now be sent to `ConfigureBackend` even when the flag is not set; if 0 is not a meaningful override, consider restoring the `Changed` check or gating the configuration on `ctxSize > 0` at the request-building point, not just for logging.
- The reasoning-budget constants and logic now exist both in the Ollama HTTP handler and in `configure_flags.go`; if these are intended to stay aligned, consider centralizing them in a shared package or helper to avoid future divergence (see the sketch after this list).
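One possible shape for that shared helper — a minimal sketch only, assuming a new internal package imported by both the Ollama handler and `configure_flags.go`. The package name and constant names are illustrative, not from this PR; the 0/-1 constraint and the mapping to the Ollama `think` field come from the discussion further down.

```go
// Package reasoningbudget is a hypothetical shared home for the
// reasoning-budget values currently duplicated between the Ollama
// HTTP handler and configure_flags.go. Names are illustrative.
package reasoningbudget

const (
	// Disabled turns thinking off (maps to "think": false in the Ollama API).
	Disabled int32 = 0
	// Unlimited leaves thinking uncapped (maps to "think": true).
	Unlimited int32 = -1
)

// Valid reports whether v is one of the supported budget values.
func Valid(v int32) bool {
	return v == Disabled || v == Unlimited
}
```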
## Individual Comments
### Comment 1
<location> `cmd/cli/commands/compose.go:84-86` </location>
<code_context>
- sendInfo(fmt.Sprintf("Setting think mode to %s", think))
- }
-
for _, model := range models {
- configuration := inference.BackendConfiguration{
- Speculative: speculativeConfig,
- }
- if cmd.Flags().Changed("context-size") {
- // TODO is the context size the same for all models?
- v := int32(ctxSize)
- configuration.ContextSize = &v
- }
-
- // Set llama.cpp-specific reasoning budget if provided
- if reasoningBudget != nil {
- if configuration.LlamaCpp == nil {
- configuration.LlamaCpp = &inference.LlamaCppConfig{}
- }
- configuration.LlamaCpp.ReasoningBudget = reasoningBudget
- }
-
+ size := int32(ctxSize)
if err := desktopClient.ConfigureBackend(scheduling.ConfigureRequest{
- Model: model,
- Mode: backendMode,
</code_context>
<issue_to_address>
**issue (bug_risk):** Context size is now always sent (including 0) instead of only when explicitly set, which changes previous behavior and may unintentionally override defaults.
Previously `ContextSize` was only set when `cmd.Flags().Changed("context-size")` was true; now we always construct `size := int32(ctxSize)` and send `ContextSize: &size`. When the user omits `--context-size`, we now send an explicit `0` instead of `nil`, bypassing the server default. If `0` is not a valid context size, this will silently alter existing workflows. Please preserve the “only set when flag is provided” behavior (e.g., track whether the flag was set, or use a `*int32` flag as in `ConfigureFlags`) before populating `ContextSize`.
</issue_to_address>
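For reference, a minimal sketch of the suggested fix, reconstructed from the removed code in the diff above. This is a fragment meant to slot into the existing `for _, model := range models` loop; the surrounding identifiers (`cmd`, `desktopClient`, `backendMode`, `ctxSize`) come from the diff context, and the `ConfigureRequest` field layout is assumed from the review text, not verified against the PR.

```go
for _, model := range models {
	req := scheduling.ConfigureRequest{
		Model: model,
		Mode:  backendMode,
	}
	// Only send a context size when --context-size was explicitly set,
	// so an omitted flag keeps the server-side default instead of
	// forcing an explicit 0.
	if cmd.Flags().Changed("context-size") {
		size := int32(ctxSize)
		req.ContextSize = &size
	}
	if err := desktopClient.ConfigureBackend(req); err != nil {
		return err
	}
}
```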
Boolean thinking (partial Ollama API) support sounds fine to me; it's a small subset of people who will want to do this anyway.
doringeman left a comment
Thanks!
Tested with ⬇️:
MODEL_RUNNER_HOST=http://localhost:8080 docker model configure --context-size 1024 --mode embedding embeddinggemma
curl -sS 'localhost:8080/v1/embeddings' \
-H 'Content-Type: application/json' \
-d '{"model":"embeddinggemma","input":"hello world"}'
`--reasoning-budget N` can only be 0 or -1 from here, so we can only partially support the Ollama API: https://docs.ollama.com/api/generate#body-think-one-of-0
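A small sketch of how that constraint could be enforced at flag-parsing time, assuming a `*int32` flag so that "unset" stays distinguishable from an explicit 0 (the package, function name, and error wording are hypothetical, not from this PR):

```go
package commands

import "fmt"

// validateReasoningBudget is a hypothetical helper enforcing the subset
// llama.cpp accepts here: 0 (disable thinking) or -1 (no cap).
func validateReasoningBudget(budget *int32) error {
	if budget == nil {
		return nil // flag not provided; leave the backend default untouched
	}
	if *budget != 0 && *budget != -1 {
		return fmt.Errorf("--reasoning-budget must be 0 or -1, got %d", *budget)
	}
	return nil
}
```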
In this PR I also roll back the changes to `compose.go`, as it's not used anymore; the current compose integration uses `docker model configure`, so I have updated that command instead.