Support Local Usage of Google's Gemma Open-Source Models via Terminal #5945
I completely agree: having support for running open-source Gemma models locally in the CLI would be a fantastic addition. The ideas you've outlined (a setup command, hardware capability checks, caching, and GPU/CPU fallback) all make a lot of sense. This would make the CLI far more versatile and research-friendly.
FYI, I maintain a downstream fork of gemini-cli that keeps up with their main branch: https://github.com/acoliver/llxprt-code. It adds support for local models; you would run `llxprt --provider openai --baseurl 127.0.0.1:1234/v1/ --model gemma-3n-it`.

We also offer configurable prompts. The gemini-cli prompt is REALLY long, so we put the default set in `~/.llxprt/prompts/`, and you can edit them or override them per provider/model. You can also save a profile with `/profile save "mysetup"` and then just run `llxprt --profile-load mysetup` with everything preconfigured.

The Gemini Code Assist team developing gemini-cli has responded to this multiple times that they intend to focus exclusively on Gemini models "for now." I think this is the right decision, as it is a massive undertaking and they have a lot to develop. Community forks downstream can give developers more choice and control, including local models. If you come downstream, you can enjoy all of the features of gemini-cli along with your local models, other providers, and features like Claude Code style todo lists!
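For reference, here is that workflow collected into one session, using only the flags mentioned above and assuming an OpenAI-compatible server is already serving the model at `127.0.0.1:1234/v1/`:

```sh
# Point llxprt at a locally hosted, OpenAI-compatible endpoint serving Gemma.
llxprt --provider openai --baseurl 127.0.0.1:1234/v1/ --model gemma-3n-it

# Inside the session, save the current configuration as a reusable profile.
/profile save "mysetup"

# Later, start with everything preconfigured.
llxprt --profile-load mysetup
```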
It would be great if this terminal tool could support running some of Google's open-source AI models like gemma3n:e2b, gemma3n:e4b, and other Gemma variants locally. These models are freely available and would enable users to leverage powerful LLM capabilities directly from their own machine, improving privacy, speed, and flexibility. Integrating local Gemma model support could make the CLI much more versatile for academic and research workflows.
Why this matters:
- Privacy: prompts and outputs stay on the user's own machine, with no remote calls.
- Speed: responses come from local inference rather than a network round-trip.
- Flexibility: the freely available Gemma variants fit academic and research workflows that need local execution.
Requested capabilities (MVP → nice-to-have):
- A setup command (e.g. `cli gemma setup --model gemma3n:e2b`).
- A local mode (`--local` flag or config `mode=local`).

Stretch:
Possible implementation outline:
- A `backends/` abstraction: `RemoteBackend`, `GemmaLocalBackend` (rough sketch below).
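To make the outline concrete, here is a minimal TypeScript sketch of what that abstraction could look like. Everything here is hypothetical: the interface name, method signatures, endpoint shape, and defaults are illustrative assumptions, not existing gemini-cli APIs.

```typescript
// Hypothetical backend abstraction for illustration only; none of these
// types exist in gemini-cli today.

export interface GenerationBackend {
  /** Human-readable name, e.g. "remote-gemini" or "local-gemma". */
  readonly name: string;
  /** Produce a completion for a plain-text prompt. */
  generate(prompt: string): Promise<string>;
}

/** Current behavior: delegate to the hosted Gemini API (details elided here). */
export class RemoteBackend implements GenerationBackend {
  readonly name = "remote-gemini";
  async generate(prompt: string): Promise<string> {
    // Placeholder for the existing remote call path.
    throw new Error("Remote path not shown in this sketch");
  }
}

/** Proposed: call a locally served Gemma model over an OpenAI-compatible HTTP endpoint. */
export class GemmaLocalBackend implements GenerationBackend {
  readonly name = "local-gemma";
  constructor(
    private readonly baseUrl = "http://127.0.0.1:1234/v1", // assumed local server
    private readonly model = "gemma3n:e2b",
  ) {}

  async generate(prompt: string): Promise<string> {
    // Assumes a local server exposing the OpenAI chat-completions shape is
    // already running; no API key or network egress is required.
    const res = await fetch(`${this.baseUrl}/chat/completions`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: this.model,
        messages: [{ role: "user", content: prompt }],
      }),
    });
    if (!res.ok) throw new Error(`Local backend error: ${res.status}`);
    const data = (await res.json()) as {
      choices?: { message?: { content?: string } }[];
    };
    return data.choices?.[0]?.message?.content ?? "";
  }
}

/** Pick a backend based on the proposed --local flag / mode=local config. */
export function selectBackend(localMode: boolean): GenerationBackend {
  return localMode ? new GemmaLocalBackend() : new RemoteBackend();
}
```

The point is only that call sites would depend on `GenerationBackend`, so adding a local path would not have to touch the existing remote code.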
Risks / considerations:
Acceptance (initial):
- `cli --local --model gemma3n:e2b "Hello"` returns coherent text without remote calls (a quick check is sketched below).
- `--local` off reverts to current remote behavior unchanged.

Let me know if you'd like a smaller-scoped first PR (e.g. just the backend abstraction + one Gemma variant).
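If it helps when reviewing a first PR, here is one rough, hypothetical way the first acceptance item could be scripted; `cli` and `--local` are the proposed names from this request, not something that exists today.

```typescript
// Hypothetical acceptance check for the proposed --local flag.
// Run with network access disabled (e.g. in an offline sandbox) so that any
// accidental remote call fails instead of silently succeeding.
import { execSync } from "node:child_process";

const out = execSync('cli --local --model gemma3n:e2b "Hello"', {
  encoding: "utf8",
});

if (out.trim().length === 0) {
  throw new Error("Expected a non-empty local completion");
}
console.log("Local run produced output:", out.trim().slice(0, 80));
```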