This repository was archived by the owner on Oct 6, 2025. It is now read-only.

Conversation

@ilopezluna (Contributor) commented Jul 10, 2025

Add a --backend flag to docker model run and docker model list.
For the run case, we use the backend provided as a prefix.
For the list case, we use OpenAI's endpoint when --backend openai is specified.

In both cases you must specify OPENAI_API_KEY.
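For illustration, the list path with --backend openai conceptually boils down to querying OpenAI's model listing endpoint with the key from OPENAI_API_KEY. The sketch below is not this PR's code; the /v1/models endpoint, struct fields, and error handling are assumptions based on OpenAI's public API.

// Minimal sketch: list model IDs from OpenAI's API using OPENAI_API_KEY.
// Illustrative only; the PR's actual request/response handling may differ.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

type modelList struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

func main() {
	apiKey := os.Getenv("OPENAI_API_KEY")
	if apiKey == "" {
		fmt.Fprintln(os.Stderr, "OPENAI_API_KEY must be set")
		os.Exit(1)
	}

	req, _ := http.NewRequest(http.MethodGet, "https://api.openai.com/v1/models", nil)
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	var models modelList
	if err := json.NewDecoder(resp.Body).Decode(&models); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, m := range models.Data {
		fmt.Println(m.ID)
	}
}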

@ilopezluna ilopezluna requested a review from a team July 10, 2025 13:24
@ilopezluna ilopezluna marked this pull request as draft July 11, 2025 07:22
@ilopezluna ilopezluna marked this pull request as ready for review July 11, 2025 09:00
@xenoscopic (Contributor) left a comment

LGTM, the logic and design certainly look great; just had a few minor nit suggestions.

commands/list.go Outdated
c.Flags().BoolVar(&jsonFormat, "json", false, "List models in a JSON format")
c.Flags().BoolVar(&openai, "openai", false, "List models in an OpenAI format")
c.Flags().BoolVarP(&quiet, "quiet", "q", false, "Only show model IDs")
c.Flags().StringVar(&backend, "backend", "", "Specify the backend to use (llama.cpp, openai)")
A Collaborator commented:

Suggested change
c.Flags().StringVar(&backend, "backend", "", "Specify the backend to use (llama.cpp, openai)")
c.Flags().StringVar(&backend, "backend", "", "Specify the backend to use ("+
strings.Join(slices.Collect(maps.Keys(ValidBackends)), ", ")+")")

You could even add a function that returns the keys, since it would be used in at least three places: validateBackend, the flag for list, and the flag for run.

func ValidBackendsKeys() []string {
	return slices.Collect(maps.Keys(ValidBackends))
}
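To make the suggestion concrete, here is a minimal sketch of how ValidBackends, ValidBackendsKeys, and validateBackend might fit together. The map's value type, the error wording, and the package layout are assumptions, not the merged code.

// Sketch only (Go 1.23+ for maps.Keys/slices.Collect); not the merged implementation.
package commands

import (
	"fmt"
	"maps"
	"slices"
	"strings"
)

// ValidBackends enumerates the supported inference backends.
var ValidBackends = map[string]struct{}{
	"llama.cpp": {},
	"openai":    {},
}

// ValidBackendsKeys returns the supported backend names for reuse in
// validateBackend and in the --backend flag help text for list and run.
func ValidBackendsKeys() []string {
	return slices.Collect(maps.Keys(ValidBackends))
}

// validateBackend rejects any backend name that is not in ValidBackends.
func validateBackend(backend string) error {
	if _, ok := ValidBackends[backend]; !ok {
		return fmt.Errorf("invalid backend %q, valid backends: %s",
			backend, strings.Join(ValidBackendsKeys(), ", "))
	}
	return nil
}

Both flag registrations could then build their help string from strings.Join(ValidBackendsKeys(), ", "), as in the suggested change above.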

@ilopezluna (Contributor, Author) replied:

addressed here

@ilopezluna ilopezluna requested a review from doringeman July 14, 2025 11:17
@doringeman (Collaborator) left a comment

LGTM!

@ilopezluna ilopezluna merged commit ede328c into main Jul 14, 2025
6 checks passed
@ilopezluna ilopezluna deleted the backend-flag-support branch July 14, 2025 11:35
@ericcurtin (Contributor) commented:

Haven't installed this yet, but what's the difference between openai and llama.cpp in this context, given that llama-server from llama.cpp already provides an OpenAI-compatible endpoint?

@ilopezluna (Contributor, Author) replied:

The backend defines the engine responsible for running the inference. When the backend is set to llama.cpp, inference is executed using llama.cpp.
A new openai backend has been added, allowing inference to be performed via OpenAI.
This new flag is meant to define the engine used to do inference.

We can include other backends as long as they support an OpenAI-compatible API.
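Purely as an illustration of that point (not code from this PR): because every backend is expected to expose an OpenAI-compatible API, the client side can stay the same and only the base URL and credentials change per backend. The function name and URLs below, including the llama-server address, are hypothetical defaults.

// Hypothetical sketch: selecting a base URL per OpenAI-compatible backend.
package main

import "fmt"

func baseURLFor(backend string) (string, error) {
	switch backend {
	case "openai":
		return "https://api.openai.com/v1", nil
	case "llama.cpp":
		// Assumed local llama-server address; the real backend wiring differs.
		return "http://localhost:8080/v1", nil
	default:
		return "", fmt.Errorf("unsupported backend: %s", backend)
	}
}

func main() {
	url, err := baseURLFor("openai")
	if err != nil {
		panic(err)
	}
	fmt.Println(url) // the same OpenAI-compatible client code can target either URL
}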

