RatingCapture: .slice(0, 500) can split UTF-16 surrogate pairs, breaking statusline LEARNING section #874
Description
What happens
The LEARNING section in the statusline shows "No ratings yet" even when ratings.jsonl has plenty of entries. The sparklines, averages, and trend — all gone.
Why
RatingCapture.hook.ts truncates response_preview with .slice(0, 500) (lines 390, 466, 506). JavaScript's .slice() operates on UTF-16 code units, not characters. If the cut lands between the two halves of a surrogate pair (which is how most emoji are encoded), you get an orphaned high surrogate such as \ud83d in the JSON string.
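A minimal sketch of the mechanism, using a short made-up string rather than a real response. The variable names are illustrative and not taken from RatingCapture.hook.ts:

```typescript
// "🗣" is U+1F5E3, stored in UTF-16 as the surrogate pair \ud83d \udde3.
const text: string = "ab🗣";

console.log(text.length); // 4 code units: "a", "b", high surrogate, low surrogate

// Cutting at 3 code units keeps the high surrogate and drops its partner.
const cut: string = text.slice(0, 3);
const lastUnit: number = cut.charCodeAt(cut.length - 1);
const endsWithHighSurrogate: boolean = lastUnit >= 0xd800 && lastUnit <= 0xdbff;

console.log(endsWithHighSurrogate); // true
// JSON.stringify writes the lone surrogate as an escape sequence.
console.log(JSON.stringify(cut)); // "ab\ud83d"
```

The escaped form in the last line is the same shape as the broken entry shown under Reproduction.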
The statusline's jq computation uses -rs (slurp mode), which parses the entire ratings.jsonl as one array. One bad entry kills the whole thing — jq bails with Invalid \uXXXX\uXXXX surrogate pair escape, the eval gets empty output, and learning-cache.sh gets written with all blank values. From that point on, every statusline refresh hits the cached blanks until the TTL expires and jq fails again. Loop.
Reproduction
Any response containing emoji near the 500-char boundary will trigger it. In my case it was a 🗣 (HAL summary line) that got sliced right between \ud83d and \udde3.
The broken entry looked like this:
{"timestamp":"...","rating":9,"response_preview":"...Hook rodou corretamente nos dois Edit's\n\n\ud83d"}
Fix
Check if the last character after slicing is a high surrogate (0xD800–0xDBFF) and drop it:
function safeSlice(str: string, maxLen: number): string {
  const sliced = str.slice(0, maxLen);
  // charCodeAt returns NaN for an empty string, so both comparisons below are false.
  const last = sliced.charCodeAt(sliced.length - 1);
  // A trailing high surrogate means the cut split a surrogate pair: drop it.
  if (last >= 0xD800 && last <= 0xDBFF) {
    return sliced.slice(0, -1);
  }
  return sliced;
}
Apply to all three truncation sites in RatingCapture.hook.ts:
- Line 390: cachedResponse.slice(0, 500) → safeSlice(cachedResponse, 500)
- Line 466: same
- Line 506: implicitCachedResponse.slice(0, 500) → safeSlice(implicitCachedResponse, 500)
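To sanity-check the replacement at the 500 boundary, something like the following could be run. The helper is repeated so the snippet runs on its own, and the response string is a made-up stand-in for a real cached response:

```typescript
// Copy of the safeSlice helper proposed in this issue.
function safeSlice(str: string, maxLen: number): string {
  const sliced = str.slice(0, maxLen);
  const last = sliced.charCodeAt(sliced.length - 1);
  if (last >= 0xd800 && last <= 0xdbff) {
    return sliced.slice(0, -1);
  }
  return sliced;
}

// Hypothetical response: 499 ASCII characters, then an emoji straddling the 500 mark.
const response: string = "x".repeat(499) + "🗣 summary";

const naive: string = response.slice(0, 500);
const safe: string = safeSlice(response, 500);

console.log(naive.length); // 500, ends with the orphaned high surrogate
console.log(safe.length); // 499, the half-emoji is dropped
console.log(JSON.stringify(safe).includes("\\ud83d")); // false
```

The safe result can be one code unit shorter than the limit, which seems an acceptable trade for a preview field.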
Also worth noting: if someone already has a corrupted ratings.jsonl, deleting MEMORY/STATE/learning-cache.sh alone won't help — you need to find and fix the broken entry too. Searching for \ud83d" (orphaned high surrogate right before closing quote) in the file will find it.
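A text search for \ud83d" only catches that one high surrogate. As a rough sketch of a more general check, the function below reports which lines of a JSONL string carry an unpaired surrogate in response_preview. The names hasLoneSurrogate and findBrokenLines are hypothetical, and the sample data is made up; in practice the input would be the contents of ratings.jsonl:

```typescript
// True if the string contains a surrogate code unit without its partner.
function hasLoneSurrogate(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      const next = s.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // valid pair, skip the low surrogate
        continue;
      }
      return true; // high surrogate with no low surrogate after it
    }
    if (c >= 0xdc00 && c <= 0xdfff) return true; // stray low surrogate
  }
  return false;
}

// Returns 1-based line numbers whose response_preview is damaged.
// Lines that fail to parse at all are reported too (an assumption of this sketch).
function findBrokenLines(jsonl: string): number[] {
  const broken: number[] = [];
  jsonl.split("\n").forEach((line, index) => {
    if (line.trim() === "") return;
    try {
      // JavaScript's JSON.parse accepts a lone \uXXXX surrogate escape, unlike jq.
      const entry = JSON.parse(line) as { response_preview?: unknown };
      const preview = entry.response_preview;
      if (typeof preview === "string" && hasLoneSurrogate(preview)) {
        broken.push(index + 1);
      }
    } catch {
      broken.push(index + 1);
    }
  });
  return broken;
}

const sample: string = [
  '{"rating":9,"response_preview":"fine 🗣"}',
  '{"rating":8,"response_preview":"cut here \\ud83d"}',
].join("\n");

console.log(findBrokenLines(sample)); // [ 2 ]
```

Once the affected lines are known, the trailing escape can be removed by hand before deleting the cache file.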
This issue was identified and filed by Claude Code.