Replies: 1 comment
-
If you are on Linux and have an Nvidia GPU, you can use the `gpu` profile when starting the Ollama Docker container. Otherwise, if you are on a Mac, there are instructions in the README on how to connect directly to the Ollama instance running on your Mac. Does that match your expectations?
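For anyone finding this later, with Docker Compose profiles that typically looks like the command below. The `gpu` profile name is taken from the comment above, and the NVIDIA Container Toolkit prerequisite is a general Docker requirement for GPU passthrough, not something confirmed by this project's README:

```sh
# Start the stack with the gpu profile enabled
# (Nvidia GPUs need the NVIDIA Container Toolkit installed on the host).
docker compose --profile gpu up -d
```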
-
I saw in your README that you install Ollama and run the agent against it, but that isn't efficient for the majority of users. Ollama is also quite inefficient, so a lot of people will hit CPU limits running those models. For that reason, I suggest you run a server natively that "mocks" the OpenAI connector for the agent. The magic could be in choosing the correct LLM based on the user's specs: if it's running on an Nvidia GPU better than a 3070 with more than 8 GB of VRAM, pick an 8B GGUF model; if it's running on an Intel Arc with more than 16 GB, run a 20B-parameter OpenVINO model, and so on. A rough sketch of that idea is below.
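To make the idea concrete, here is a minimal hypothetical sketch in Python. The model names, VRAM thresholds, and the `nvidia-smi`-based detection are all illustrative assumptions, not anything from this project:

```python
# Hypothetical sketch of spec-based model selection.
# Model names and thresholds are illustrative assumptions only.
import shutil
import subprocess


def nvidia_vram_gib() -> float | None:
    """Total VRAM of the first Nvidia GPU in GiB, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # nvidia-smi reports MiB; convert to GiB.
    return float(result.stdout.splitlines()[0]) / 1024


def pick_model() -> str:
    """Choose a model id based on detected hardware (illustrative only)."""
    vram = nvidia_vram_gib()
    if vram is None:
        return "small-cpu-model"      # no usable GPU: fall back to a tiny model
    if vram > 8:
        return "some-8b-gguf-model"   # e.g. an 8B GGUF quant for >8 GiB cards
    return "some-3b-gguf-model"       # modest GPUs get a smaller quant


if __name__ == "__main__":
    print(pick_model())
```

The same dispatch could grow extra branches for Arc/OpenVINO or Apple Silicon once those can be detected reliably.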
That part is interesting and could help you improve your project.
God bless you and thanks!