🤦 Gotcha: Spent Hours Debugging… Turns Out It Was the Context Limit #2203
Closed
nehadhirmiz started this conversation in General
Replies: 1 comment 3 replies
Ah yeah, we don't ship model profiles for Ollama yet, which would have set the limit to match the model card (128k tokens -- why are you limiting it to 32k?). Are you setting this up in the config file?
I ran into a subtle but critical issue with local agents using Ollama-based LLMs that might help others.
A simple prompt, "Write a Python script that outputs 'Hello World'", could not be completed by multiple local models. I tried quite a few LLMs ranging from 1B to 9B parameters (with and without thinking enabled). Not a single model could finish this trivial task.
When I asked the agent to list the tools it had access to, it hallucinated tools and schemas that didn't exist. The issue was not with the LLMs; it was Ollama's default context window of 4K, which silently truncated the agent's system prompt and tool definitions.
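One way to avoid the default is to pass `num_ctx` explicitly in the `options` field of Ollama's `/api/generate` request. A minimal sketch, assuming Ollama's standard HTTP API on port 11434; the model name and helper function below are illustrative, not part of the original post:

```python
import json

# Hypothetical helper: builds a request body for Ollama's /api/generate
# endpoint with an explicit context window, instead of relying on the
# small default.
def build_generate_request(model, prompt, num_ctx=32768):
    return {
        "model": model,                    # e.g. "llama3.2:3b" -- whatever you pulled
        "prompt": prompt,
        "options": {"num_ctx": num_ctx},   # override the default context window
        "stream": False,
    }

payload = build_generate_request(
    "llama3.2:3b",  # placeholder model name
    'Write a Python script that outputs "Hello World"',
)
body = json.dumps(payload)
# POST `body` to http://localhost:11434/api/generate
```

For a persistent fix, a Modelfile can bake the setting into a derived model (`PARAMETER num_ctx 32768`, then `ollama create my-model-32k -f Modelfile`), so every client gets the larger window without per-request options.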
As a seasoned ML developer, I feel embarrassed sharing that this small detail slipped my mind. I just wanted to share this so someone else doesn't spend hours trying to figure out why their local agents can't even perform the simplest task.
The lesson: it’s always good to have a simple check. I don’t think this will be the last time someone runs into this issue.
It may be worth having a checking mechanism that runs a few simple tests on local models and generates a performance report—some sort of test run with metrics captured.
If the agent can’t even complete simple tasks, that’s an immediate signal that something fundamental is wrong.
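A minimal sketch of such a check, assuming you already have the model outputs in hand; the task names, outputs, and expected substrings below are illustrative:

```python
def sanity_report(results):
    """Score simple smoke tests: `results` maps a task name to a
    (model_output, expected_substring) pair. A task passes when the
    expected substring appears in the output (case-insensitive)."""
    passed = {
        task: expected.lower() in output.lower()
        for task, (output, expected) in results.items()
    }
    passed["all_passed"] = all(passed.values())
    return passed

report = sanity_report({
    "hello_world": ('print("Hello World")', "hello world"),
    "arithmetic": ("2 + 2 = 4", "4"),
})
# If report["all_passed"] is False, check num_ctx before blaming the model.
```

Substring matching is crude, but for tasks this simple a failure is a strong signal that the configuration, not the model, is broken.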
[Screenshot: agent run with context window 4K (default Ollama setting)]
[Screenshot: agent run with context window 32K]