
Conversation

@ilopezluna
Contributor

--reasoning-budget N can only be 0 or -1, judging from the llama.cpp help text:

controls whether thought tags are allowed and/or extracted from the response, and in which format they're returned; one of:

  • none: leaves thoughts unparsed in message.content
  • deepseek: puts thoughts in message.reasoning_content
  • deepseek-legacy: keeps tags in message.content while also populating message.reasoning_content
    (default: auto)
    (env: LLAMA_ARG_THINK)

So we can only partially support the Ollama API: https://docs.ollama.com/api/generate#body-think-one-of-0
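
For illustration, here is a minimal sketch of that coercion, assuming a hypothetical helper (the function name and error messages are illustrative, not the PR's actual code):

```go
package ollama

import "fmt"

// reasoningBudgetFromThink maps Ollama's "think" request field onto
// llama.cpp's --reasoning-budget, which here can only be -1 (unrestricted)
// or 0 (disabled).
func reasoningBudgetFromThink(think any) (int32, error) {
	switch v := think.(type) {
	case bool:
		if v {
			return -1, nil // thinking enabled: unrestricted budget
		}
		return 0, nil // thinking disabled
	case string:
		// Ollama also accepts "high"/"medium"/"low", but those levels cannot
		// be expressed with only 0 and -1, hence the partial support.
		return 0, fmt.Errorf("unsupported think level %q: only boolean think is supported", v)
	default:
		return 0, fmt.Errorf("invalid think value: %v", v)
	}
}
```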

In this PR I also roll back the changes to compose.go, since it's no longer used; the current compose integration uses docker model configure, so I've updated that command instead.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request streamlines the configuration process for model backends by simplifying the think parameter to align with the llama.cpp server's limited reasoning budget options. It introduces a dedicated ConfigureFlags struct to centralize and standardize how configuration options are handled across commands, improving maintainability and consistency. The compose command has been refactored to remove outdated configuration logic, ensuring that docker model configure is the primary interface for setting model parameters. These changes are supported by updated documentation and comprehensive test coverage.

Highlights

  • Simplified Reasoning Budget: The --think parameter for reasoning models has been simplified to accept only boolean values (true/false), which map directly to the llama.cpp server's --reasoning-budget options of -1 (unlimited) or 0 (disabled). This removes support for string-based budget levels like 'high', 'medium', or 'low'.
  • Centralized Configuration Flags: A new ConfigureFlags struct and associated methods have been introduced in configure_flags.go to centralize the definition, registration, and parsing of all model configuration flags. This promotes consistency and reusability across different commands that configure model backends (a rough sketch follows this list).
  • Refactored compose Command: The compose up command has been cleaned up by removing deprecated logic related to mode and think parameters, as its configuration now relies on the updated docker model configure command. This includes the removal of unused constants, helper functions, and flag parsing logic.
  • Updated Documentation and Tests: Documentation for docker model configure and docker model compose up has been updated to reflect the changes in flag availability and usage. Corresponding unit tests for flag parsing and behavior have also been adjusted or added to validate the new configuration structure and simplified think parameter.
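
As a rough sketch of the centralization described above (field and method names here are hypothetical; the real struct lives in configure_flags.go and may differ), using spf13/pflag as cobra commands do:

```go
package commands

import "github.com/spf13/pflag"

// ConfigureFlags holds pointer-valued options so that unset flags stay nil
// and never override backend defaults.
type ConfigureFlags struct {
	ContextSize *int32 // nil when --context-size was not provided
	Think       *bool  // nil when --think was not provided
}

// Register defines the flags on a command's flag set.
func (f *ConfigureFlags) Register(flags *pflag.FlagSet) {
	flags.Int32("context-size", 0, "context size in tokens")
	flags.Bool("think", false, "enable thinking (maps to llama.cpp --reasoning-budget -1/0)")
}

// Parse populates the struct only for flags the user actually set.
func (f *ConfigureFlags) Parse(flags *pflag.FlagSet) error {
	if flags.Changed("context-size") {
		v, err := flags.GetInt32("context-size")
		if err != nil {
			return err
		}
		f.ContextSize = &v
	}
	if flags.Changed("think") {
		v, err := flags.GetBool("think")
		if err != nil {
			return err
		}
		f.Think = &v
	}
	return nil
}
```
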
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request refactors and simplifies the model configuration logic. The introduction of configure_flags.go to centralize flag handling for docker model configure is a great improvement for maintainability. The simplification of the compose up command and the think parameter handling across the CLI and Ollama API layer are also positive changes.

I've found one issue in cmd/cli/commands/compose.go where the behavior for handling the --context-size flag has unintentionally changed. My review includes a suggestion to fix this.

@ilopezluna requested a review from a team on December 9, 2025 at 20:17
@ilopezluna marked this pull request as ready for review on December 9, 2025 at 20:17
Contributor

@sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • In compose up, switching from cmd.Flags().Changed("context-size") to always constructing ContextSize means a value of 0 will now be sent to ConfigureBackend even when the flag is not set; if 0 is not a meaningful override, consider restoring the Changed check or gating the configuration on ctxSize > 0 at the request-building point, not just for logging.
  • The reasoning-budget constants and logic now exist both in the Ollama HTTP handler and in configure_flags.go; if these are intended to stay aligned, consider centralizing them in a shared package or helper to avoid future divergence.
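
One possible shape for the shared helper this suggests (package name and location are hypothetical):

```go
// Package reasoningbudget centralizes the llama.cpp --reasoning-budget
// values so the Ollama handler and configure_flags.go stay aligned.
package reasoningbudget

const (
	Unlimited int32 = -1 // thinking enabled, no budget cap
	Disabled  int32 = 0  // thinking disabled
)

// FromBool converts the boolean --think / "think" setting to a budget value.
func FromBool(think bool) int32 {
	if think {
		return Unlimited
	}
	return Disabled
}
```
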
## Individual Comments

### Comment 1
Location: `cmd/cli/commands/compose.go:84-86`
Code context:
-				sendInfo(fmt.Sprintf("Setting think mode to %s", think))
-			}
-
 			for _, model := range models {
-				configuration := inference.BackendConfiguration{
-					Speculative: speculativeConfig,
-				}
-				if cmd.Flags().Changed("context-size") {
-					// TODO is the context size the same for all models?
-					v := int32(ctxSize)
-					configuration.ContextSize = &v
-				}
-
-				// Set llama.cpp-specific reasoning budget if provided
-				if reasoningBudget != nil {
-					if configuration.LlamaCpp == nil {
-						configuration.LlamaCpp = &inference.LlamaCppConfig{}
-					}
-					configuration.LlamaCpp.ReasoningBudget = reasoningBudget
-				}
-
+				size := int32(ctxSize)
 				if err := desktopClient.ConfigureBackend(scheduling.ConfigureRequest{
-					Model:                model,
-					Mode:                 backendMode,

Issue (bug risk): Context size is now always sent (including 0) instead of only when explicitly set, which changes previous behavior and may unintentionally override defaults.

Previously `ContextSize` was only set when `cmd.Flags().Changed("context-size")` was true; now we always construct `size := int32(ctxSize)` and send `ContextSize: &size`. When the user omits `--context-size`, we now send an explicit `0` instead of `nil`, bypassing the server default. If `0` is not a valid context size, this will silently alter existing workflows. Please preserve the “only set when flag is provided” behavior (e.g., track whether the flag was set, or use a `*int32` flag as in `ConfigureFlags`) before populating `ContextSize`.
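
A minimal sketch of the gating the reviewer describes, reusing identifiers visible in the diff above (cmd, ctxSize, models, desktopClient, scheduling.ConfigureRequest) and eliding the request's other fields:

```go
// Only populate ContextSize when --context-size was explicitly provided;
// a nil pointer lets the backend keep its default.
var contextSize *int32
if cmd.Flags().Changed("context-size") {
	v := int32(ctxSize)
	contextSize = &v
}
for _, model := range models {
	if err := desktopClient.ConfigureBackend(scheduling.ConfigureRequest{
		Model:       model,
		ContextSize: contextSize, // nil when the flag was not set
	}); err != nil {
		return err
	}
}
```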

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@ericcurtin
Contributor

ericcurtin commented Dec 10, 2025

Boolean thinking (partial Ollama API support) sounds fine to me; it's a small subset of people that will want to do this anyway.

Contributor

@doringeman left a comment


Thanks!
Tested with ⬇️.

MODEL_RUNNER_HOST=http://localhost:8080 docker model configure --context-size 1024 --mode embedding embeddinggemma
curl -sS 'localhost:8080/v1/embeddings' \
  -H 'Content-Type: application/json' \
  -d '{"model":"embeddinggemma","input":"hello world"}'

@ilopezluna merged commit 6bef10a into main on Dec 10, 2025 (13 checks passed)
@ilopezluna deleted the simplify-configure branch on December 10, 2025 at 08:47