
Input truncation is automatic on Ollama; you need to add a single flag to fix it #1372

@devlux76

Description


Which version of the app are you using?

latest head

Which API Provider are you using?

Ollama

Which Model are you using?

All of them

What happened?

Thanks for an excellent product. I love that it integrates with so many providers, but the Ollama integration is broken due to an incredibly short-sighted design decision by the Ollama team that I don't think you're aware of.

time=2025-03-04T02:30:04.818-07:00 level=WARN source=runner.go:129 msg="truncating input prompt" limit=2048 prompt=16101 keep=4 new=2048

Ollama truncates input to 2048 tokens by default, which is really quite tiny.

To override this, all you need to do is set the limit in the request options to something reasonable. There is no reason not to use the model's context length for this.

"options": {
  "num_ctx": 128000
}
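As a sketch of what the fix looks like on the wire, here is a minimal request body for Ollama's /api/generate endpoint with num_ctx set. The model name, prompt, and num_ctx value are illustrative, not taken from this issue:

```python
import json

# Minimal Ollama /api/generate request body with the context window raised.
# Model name, prompt, and num_ctx are illustrative; pick values for your setup.
payload = {
    "model": "llama3.1",      # hypothetical model name
    "prompt": "Summarize this file ...",
    "stream": False,
    "options": {
        "num_ctx": 128000,    # override the 2048-token default
    },
}

# POST this to http://localhost:11434/api/generate with your HTTP client of choice.
print(json.dumps(payload, indent=2))
```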

This can of course cause resource issues, so best practice is to measure what you currently have, add the size of the expected generation, and then cap the total at the model's maximum context size. This way you don't waste a bunch of resources when you only need a much smaller context at the moment.
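The capping strategy above can be sketched as a small helper. The function name and the sample token counts are mine (the 16101-token prompt comes from the log line earlier in this report; the other numbers are assumptions):

```python
def choose_num_ctx(prompt_tokens: int, expected_gen_tokens: int, model_max_ctx: int) -> int:
    """Pick a num_ctx value: current prompt plus expected output, capped at the model max."""
    needed = prompt_tokens + expected_gen_tokens
    return min(needed, model_max_ctx)

# From the log above: a 16101-token prompt, plus (say) 1024 tokens of expected
# output, on a model with a hypothetical 128000-token maximum context.
print(choose_num_ctx(16101, 1024, 128000))   # → 17125
```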

Source: https://github.com/ollama/ollama/blob/main/docs/faq.md

Steps to reproduce

Relevant API REQUEST output

Additional context

No response

Metadata

Assignees

No one assigned

    Labels

    Issue - Needs Info: Missing details or unclear. Waiting on author to provide more context.
    bug: Something isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
