Fix critical KV cache crash bug #771 #777

anivar · 2025-07-20T20:52:26Z

Problem

Issue #771: Production servers crash with std::length_error: vector during KV cache management.

Root Cause

Integer underflow in update_slots() at line 1714:

slot.cache_tokens.resize(slot.cache_tokens.size() - n_discard);

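When n_discard > cache_tokens.size(), the unsigned subtraction wraps around to a value near SIZE_MAX, so resize() requests an impossibly large allocation and throws std::length_error. (When n_discard == cache_tokens.size() the result is 0, which is harmless.)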
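The wraparound can be reproduced in isolation. The following is a minimal sketch, not code from server.cpp: the helper names wrapped_size and unchecked_discard_throws are hypothetical, and std::vector<int> stands in for the real token container.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: computes the size that the unchecked expression
// `size - n_discard` would pass to resize(). n_discard is converted to
// size_t first, so the subtraction is performed modulo 2^N.
inline std::size_t wrapped_size(std::size_t size, int n_discard) {
    return size - static_cast<std::size_t>(n_discard);
}

// Hypothetical helper: performs the unchecked resize and reports whether
// it threw std::length_error. The vector is left untouched if it throws.
inline bool unchecked_discard_throws(std::vector<int>& cache_tokens, int n_discard) {
    try {
        cache_tokens.resize(wrapped_size(cache_tokens.size(), n_discard));
        return false;
    } catch (const std::length_error&) {
        return true;
    }
}
```

With 4 cached tokens and n_discard = 5, wrapped_size yields the largest size_t value, which exceeds the vector's max_size(), so the resize throws instead of shrinking the vector.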

Fix

Added bounds checking before resize:

if (n_discard >= 0 && (size_t)n_discard < slot.cache_tokens.size()) {
    slot.cache_tokens.resize(slot.cache_tokens.size() - n_discard);
} else {
    slot.cache_tokens.clear();
}
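The same check can be factored into a small function so the edge cases are easy to exercise on their own. This is a sketch under stated assumptions: discard_tail and the token alias are hypothetical names introduced for illustration and are not part of the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the server's token type.
using token = int;

// Hypothetical helper mirroring the bounds check above: drops the last
// n_discard entries, and clears the vector when n_discard is negative or
// not smaller than the current size.
inline void discard_tail(std::vector<token>& cache_tokens, int n_discard) {
    if (n_discard >= 0 && static_cast<std::size_t>(n_discard) < cache_tokens.size()) {
        cache_tokens.resize(cache_tokens.size() - static_cast<std::size_t>(n_discard));
    } else {
        cache_tokens.clear();
    }
}
```

Note that a negative n_discard also falls into the clear() branch. That is the conservative choice made by the patch: the cache is emptied rather than left in a state derived from an invalid count.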

Testing

Builds successfully
Unit test verifies fix handles edge cases
No crashes with various n_discard values

Fixes production crashes during high memory pressure scenarios.

Resolves issue mozilla-ai#771, where the server crashes with std::length_error when KV cache context shifting resizes the cache_tokens vector with a wrapped-around size. The bug occurs in update_slots() when n_discard > cache_tokens.size(): cache_tokens.resize(size - n_discard) wraps around and requests a massive allocation, which throws std::length_error.

Changes:

- Add bounds checking before cache_tokens.resize() in server.cpp:1714
- Clear cache_tokens when n_discard would cause underflow
- Prevent negative n_discard values from causing issues

This fix prevents the production server crashes reported with Chinese text translation workloads and high memory pressure scenarios.

github-actions bot added the llama.cpp label Jul 20, 2025

mofosyne approved these changes Aug 1, 2025
