@ilopezluna
Contributor

Address #515

Revert the removal of --runtime-args. Users rely on it, so when I removed it here I broke their workflows.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical user workflow issue by reintroducing the ability to pass raw runtime flags to the inference backends. It integrates this functionality across the CLI commands and backend configuration, ensuring that users can customize model runtime behavior as needed. The change restores previously broken workflows and enhances the flexibility of model deployment and configuration.

Highlights

  • Reintroduction of --runtime-args: The --runtime-args functionality, previously removed, has been reinstated to support existing user workflows. This allows users to pass raw runtime flags directly to the inference engine.
  • CLI Command Updates: Both docker model compose up and docker model configure commands now support the --runtime-flags option. The configure command also accepts flags after a -- separator.
  • Backend Configuration Integration: The BackendConfiguration struct now includes a RuntimeFlags field, and these flags are correctly passed to both LlamaCpp and vLLM inference backends during configuration.
  • Dynamic Mode Detection: The scheduler now automatically detects and sets the backend mode to Embedding if the --embeddings flag is present within the provided runtime flags.
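The mode detection described in the last highlight can be sketched roughly as below. This is a minimal illustration, not the scheduler's actual code: `BackendMode`, its constants, and `detectMode` are hypothetical names, and `BackendConfiguration` is reduced to the single `RuntimeFlags` field named in the summary.

```go
package main

import "fmt"

// BackendMode is a hypothetical stand-in for the scheduler's mode type.
type BackendMode int

const (
	ModeCompletion BackendMode = iota
	ModeEmbedding
)

// BackendConfiguration mirrors only the RuntimeFlags field named in the
// summary; the real struct is assumed to carry more fields.
type BackendConfiguration struct {
	RuntimeFlags []string
}

// detectMode returns ModeEmbedding when "--embeddings" appears among the
// runtime flags, and the caller-supplied fallback otherwise.
func detectMode(cfg BackendConfiguration, fallback BackendMode) BackendMode {
	for _, flag := range cfg.RuntimeFlags {
		if flag == "--embeddings" {
			return ModeEmbedding
		}
	}
	return fallback
}

func main() {
	cfg := BackendConfiguration{RuntimeFlags: []string{"--ctx-size", "4096", "--embeddings"}}
	fmt.Println(detectMode(cfg, ModeCompletion) == ModeEmbedding)
}
```

The sketch matches the flag by exact string equality, so a form such as `--embeddings=true` would not be detected; whether the real scheduler accepts such forms is not stated in this conversation.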

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request reintroduces the --runtime-args flag for the compose up command and adds support for passing runtime arguments to the configure command. The changes span across command definitions, backend configurations, and documentation to support passing these flags down to the inference engines. The implementation correctly handles parsing flags from both a raw string and a string slice. I've found one critical issue where swapped arguments in an error message formatting call would lead to a panic. Otherwise, the changes look good and include necessary tests and documentation updates.

ilopezluna and others added 2 commits December 16, 2025 11:13
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@ilopezluna changed the title from "[WIP] get --runtime-args back" to "get --runtime-args back" Dec 16, 2025
@ilopezluna marked this pull request as ready for review December 16, 2025 10:14
@ilopezluna requested a review from a team December 16, 2025 10:14
Contributor

@sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `cmd/cli/commands/configure.go:44` </location>
<code_context>
+		RunE: func(cmd *cobra.Command, args []string) error {
</code_context>

<issue_to_address>
**issue (bug_risk):** Runtime flags slice includes the model name and likely the literal "--"

Here `opts.RuntimeFlags = args[1:]` will still include the literal "--" when present (e.g. `docker model configure foo -- --embeddings` yields `RuntimeFlags = ["--", "--embeddings"]`). To avoid depending on positional slicing and to drop "--" explicitly, consider mirroring the `Args` logic and using `cmd.ArgsLenAtDash()`: treat `args[:argsBeforeDash]` as the model (length 1) and `args[argsBeforeDash+1:]` as runtime flags.
</issue_to_address>

### Comment 2
<location> `cmd/cli/commands/compose.go:99` </location>
<code_context>
+				if err := desktopClient.ComposeUp(ctx, compose.UpRequest{
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Error formatting string and logging call have minor issues and inconsistent formatting

The new `configErrFmtString` has a double space before `"and runtime-flags"`, and the `_ = sendErrorf(...)` line appears mis-indented compared to gofmt output. Also, the format string now has three verbs (`%s`, `%d`, `%s`); please ensure both `sendErrorf` and `fmt.Errorf` pass exactly three matching arguments.
</issue_to_address>
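For context on why the verb and argument order matters: Go's `fmt` package does not reject mismatched arguments at run time. It renders a `%!verb(type=value)` marker in the output instead, so the message comes out garbled. The sketch below demonstrates this with a hypothetical three-verb format string; `configErrFormat` and `renderConfigErr` are illustrative names, not the PR's `configErrFmtString` or `sendErrorf`.

```go
package main

import "fmt"

// configErrFormat is a hypothetical three-verb format string (%s, %d, %s),
// standing in for the one discussed in the review.
const configErrFormat = "model %s: context size %d and runtime-flags %s"

// renderConfigErr formats the arguments in the order given. Taking a slice
// lets the sketch show both the correct and the swapped order.
func renderConfigErr(args []any) string {
	return fmt.Sprintf(configErrFormat, args...)
}

func main() {
	// Matching order: string, int, string.
	fmt.Println(renderConfigErr([]any{"ai/example", 4096, "--embeddings"}))
	// Swapped order: fmt emits %!d(string=...) and %!s(int=...) markers.
	fmt.Println(renderConfigErr([]any{"ai/example", "--embeddings", 4096}))
}
```

When the format string and arguments are written inline at the call site, `go vet`'s printf check will normally report this kind of mismatch.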


if err != nil {
	return err
}
opts.RuntimeFlags = args[1:]
Contributor

issue (bug_risk): Runtime flags slice includes the model name and likely the literal "--"

Here opts.RuntimeFlags = args[1:] will still include the literal "--" when present (e.g. docker model configure foo -- --embeddings yields RuntimeFlags = ["--", "--embeddings"]). To avoid depending on positional slicing and to drop "--" explicitly, consider mirroring the Args logic and using cmd.ArgsLenAtDash(): treat args[:argsBeforeDash] as the model (length 1) and args[argsBeforeDash+1:] as runtime flags.
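The splitting the reviewer describes can be sketched without cobra as below. This is a standard-library illustration under stated assumptions: `splitAtDash` is a hypothetical helper operating on a raw argument list, not the PR's code. Whether the literal `--` actually reaches `args` inside `RunE` depends on how the flag parser treats the terminator, so the premise is worth verifying against cobra's `ArgsLenAtDash()` behavior before changing the slicing.

```go
package main

import "fmt"

// splitAtDash is a hypothetical helper: it splits a raw argument list at
// the first literal "--", returning the positional arguments before it and
// the pass-through flags after it. The separator itself is dropped.
func splitAtDash(argv []string) (before, after []string) {
	for i, arg := range argv {
		if arg == "--" {
			return argv[:i], argv[i+1:]
		}
	}
	// No separator: everything is positional.
	return argv, nil
}

func main() {
	model, flags := splitAtDash([]string{"foo", "--", "--embeddings"})
	fmt.Println(model, flags)
}
```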

@ilopezluna force-pushed the get-back-runtime-args branch from c6114f2 to 39188cc December 16, 2025 12:21
@ilopezluna merged commit 41dd3f1 into main Dec 16, 2025
13 checks passed
@ilopezluna deleted the get-back-runtime-args branch December 16, 2025 12:51