Stability: Prevent Backend Crash on Long Conversations (Rolling Window Context) #126

@sharma-sugurthi

What feature do you want to see added?

I was looking through the conversation handling logic, and I noticed that we currently just append every new message to the history list indefinitely.

Since local LLMs (like Mistral/Llama) have a hard context limit (e.g., 4096 tokens), this becomes a problem in long conversations.

If a user talks for long enough, the total prompt size (System Instruction + History + New Question) will eventually hit the limit. When that happens, the backend will crash with a ValueError: Context dimension mismatch or an OOM error.

I therefore propose that we implement a "Sliding Window" strategy to keep the context size safe.

The idea is:

  1. Check Size: Before sending history to the LLM, check if it's too big.
  2. Trim: If it exceeds a limit (e.g., ~2000 tokens), we remove the oldest messages to make room for the new ones.
  3. Preserve Persona: We must always keep the System Prompt (Index 0) so the bot doesn't forget who it is.
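
The three steps above could be sketched roughly as follows. This is only a minimal sketch, not a final implementation: it assumes the history is a list of dicts with "role" and "content" keys and that the system prompt sits at index 0. The signature of enforce_context_limit() and the count_tokens parameter are assumptions for illustration; any token-counting function can be plugged in.

```python
from typing import Callable, Dict, List

# Assumed message shape: {"role": "...", "content": "..."} (hypothetical).
Message = Dict[str, str]


def enforce_context_limit(
    history: List[Message],
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[Message]:
    """Return a trimmed copy of history that fits within max_tokens.

    The system prompt (index 0) is always kept. The remaining messages
    are kept from newest to oldest until the budget runs out.
    """
    if not history:
        return []

    system_prompt, rest = history[0], history[1:]
    # Budget left for conversation turns after reserving the system prompt.
    budget = max_tokens - count_tokens(system_prompt["content"])

    kept: List[Message] = []
    for message in reversed(rest):  # newest first
        cost = count_tokens(message["content"])
        if cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        budget -= cost

    kept.reverse()  # restore chronological order
    return [system_prompt] + kept
```

The function returns a new list instead of mutating the input, so the full history can still be stored elsewhere while only the trimmed view is sent to the LLM. Dropping whole messages (rather than cutting one in half) keeps each remaining turn intact.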

Technical approach

To keep things fast and lightweight, I suggest we avoid installing heavy tokenizers (like transformers).
Instead, we can use a reliable heuristic (e.g., 1 token ≈ 4 characters) to estimate the size.
But I'm also open to changing this if it doesn't seem like a good fit!

I can write a simple utility function, enforce_context_limit(), in chatbot-core to handle this.
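
For illustration, the character-based heuristic could look something like this. The names estimate_tokens and estimate_history_tokens are hypothetical, and the 4-characters-per-token ratio is only a rough rule of thumb: the real ratio depends on the model's tokenizer and on the language of the text, so it is probably wise to leave a safety margin below the model's actual limit.

```python
import math
from typing import Dict, List

# Rough rule of thumb from the proposal: 1 token is about 4 characters.
# This is an approximation, not the model's real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count of text from its character length."""
    # Round up so short, non-empty strings never count as zero tokens.
    return math.ceil(len(text) / chars_per_token)


def estimate_history_tokens(history: List[Dict[str, str]]) -> int:
    """Sum the estimated tokens over every message's "content" field.

    Assumes each message is a dict with a "content" key (hypothetical shape).
    """
    return sum(estimate_tokens(message["content"]) for message in history)
```

Because this only uses the standard library, it adds no dependencies. If the estimate turns out to be too inaccurate in practice, the same function could later be swapped for a real tokenizer without changing the callers.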

This matters because it is actually a prerequisite for the Session Persistence work. If we successfully save a long conversation to the database but don't trim it when we reload it, the app will crash immediately when the user tries to continue that chat.

Upstream changes

No response

Are you interested in contributing this feature?

Yes, I am happy to work on this implementation.
