Backend flag support #126
Conversation
xenoscopic
left a comment
LGTM, certainly the logic and design look great; just had a few minor nit suggestions.
Co-authored-by: Jacob Howard <[email protected]>
commands/list.go
Outdated
c.Flags().BoolVar(&jsonFormat, "json", false, "List models in a JSON format")
c.Flags().BoolVar(&openai, "openai", false, "List models in an OpenAI format")
c.Flags().BoolVarP(&quiet, "quiet", "q", false, "Only show model IDs")
c.Flags().StringVar(&backend, "backend", "", "Specify the backend to use (llama.cpp, openai)")
Suggested change:
- c.Flags().StringVar(&backend, "backend", "", "Specify the backend to use (llama.cpp, openai)")
+ c.Flags().StringVar(&backend, "backend", "", "Specify the backend to use ("+
+ 	strings.Join(slices.Collect(maps.Keys(ValidBackends)), ", ")+")")
You could even add a function to return the keys, as it'd be used in at least three places: validateBackend, the flag for list, and the flag for run.
// ValidBackendsKeys returns the names of all supported backends.
func ValidBackendsKeys() []string {
	return slices.Collect(maps.Keys(ValidBackends))
}
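For illustration, a minimal sketch of how the helper could back the validateBackend check mentioned above; the package name, the ValidBackends type, and the error message are assumptions, not the PR's actual code:

package commands

import (
	"fmt"
	"maps"
	"slices"
	"strings"
)

// ValidBackends is assumed to be a set of supported backend names.
var ValidBackends = map[string]struct{}{
	"llama.cpp": {},
	"openai":    {},
}

// ValidBackendsKeys returns the names of all supported backends
// (same helper as suggested above).
func ValidBackendsKeys() []string {
	return slices.Collect(maps.Keys(ValidBackends))
}

// validateBackend rejects any backend name not present in ValidBackends.
func validateBackend(backend string) error {
	if backend == "" {
		return nil // no backend requested; keep the default behavior
	}
	if _, ok := ValidBackends[backend]; !ok {
		return fmt.Errorf("unsupported backend %q (valid backends: %s)",
			backend, strings.Join(ValidBackendsKeys(), ", "))
	}
	return nil
}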
Addressed here.
Co-authored-by: Dorin-Andrei Geman <[email protected]>
doringeman
left a comment
LGTM!
Haven't installed this yet, but what's the difference between openai and llama.cpp in this context, since llama-server from llama.cpp provides an OpenAI-compatible endpoint?
The backend defines the engine responsible for running the inference. When the backend is set to openai, the inference is handled by OpenAI's API instead of the local llama.cpp engine. We can include other backends as long as they support an OpenAI-compatible API.
Add a --backend flag to docker model run and docker model list. For the run case we use the backend provided as a prefix.
For the list case we use OpenAI's endpoint when --backend openai is specified. In both cases you must specify OPENAI_API_KEY.
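For illustration, a hedged usage sketch based on the description above; the model name and key value are placeholders, and the exact invocation may differ:

# Listing via the OpenAI endpoint requires OPENAI_API_KEY.
OPENAI_API_KEY=sk-... docker model list --backend openai

# Running with the openai backend also requires OPENAI_API_KEY.
OPENAI_API_KEY=sk-... docker model run --backend openai <model>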