refactor(chat): extract SkillTagBuffer from chat_completion stream (A-6, PR-3D-c) by AVADSA25 · Pull Request #74 · AVADSA25/codec

AVADSA25 · 2026-05-22T15:37:37Z

Summary

Final piece of the PR-3D split — extracts the streaming <think> + [SKILL:...] tag-machine out of codec_dashboard.chat_completion._stream_gen into a new, tested module, exactly as the audit recommended.

New `codec_chat_stream.py`

- SkillTagBuffer: the stateful token processor. feed(token) / finish() are generators that yield clean text fragments to emit; it strips <think>…</think> across chunks and buffers [SKILL:name:query] tags char-by-char so a raw tag never leaks, resolving complete tags via an injected resolve_skill_tag(raw) callback (skill execution is I/O, hence injected). visible_chars lets the caller detect the all-tags-dropped blank-bubble case.
- SKILL_TAG_RE: the shared tag pattern (buffer detection + the dashboard resolver).
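For readers who have not opened the diff, here is a minimal sketch of the buffering idea described above. It is not the code in codec_chat_stream.py: the class name `SketchTagBuffer`, the `"[SKILL:"` prefix check, and the way `visible_chars` is counted are assumptions made for illustration. The sketch also leaves out `<think>` handling and the empty-frame quirk, and it is only a guess that `finish()` and the 5000-char cap flush held-back text unchanged.

```python
from typing import Callable, Iterator

_PREFIX = "[SKILL:"        # assumed literal prefix of a skill tag
_MAX_PENDING = 5000        # mirrors the 5000-char cap named in the PR


class SketchTagBuffer:
    """Illustrative stand-in for SkillTagBuffer (hypothetical, simplified)."""

    def __init__(self, resolve_skill_tag: Callable[[str], str]) -> None:
        self._resolve = resolve_skill_tag
        self._pending = ""       # chars held back while a tag may be forming
        self.visible_chars = 0   # non-whitespace chars emitted so far

    def feed(self, token: str) -> Iterator[str]:
        for ch in token:
            if not self._pending:
                if ch == "[":
                    self._pending = ch          # possible tag start: hold it
                else:
                    yield from self._emit(ch)
                continue
            self._pending += ch
            if len(self._pending) <= len(_PREFIX):
                if not _PREFIX.startswith(self._pending):
                    # Diverged from "[SKILL:", so release the text.
                    held, self._pending = self._pending, ""
                    if ch == "[":               # this "[" may start a new tag
                        held, self._pending = held[:-1], "["
                    yield from self._emit(held)
            elif ch == "]":
                raw, self._pending = self._pending, ""
                yield from self._emit(self._resolve(raw))
            elif len(self._pending) > _MAX_PENDING:
                # Assumed cap behaviour: give up and release the raw text.
                yield from self._flush()

    def finish(self) -> Iterator[str]:
        # Assumed: an unterminated tag is released as plain text.
        yield from self._flush()

    def _flush(self) -> Iterator[str]:
        held, self._pending = self._pending, ""
        yield from self._emit(held)

    def _emit(self, text: str) -> Iterator[str]:
        if text:
            self.visible_chars += len(text.strip())
            yield text
```

Because the resolver is passed in, a test can substitute a pure function for real skill execution, which is presumably why the PR injects it. A resolver that returns an empty string leaves `visible_chars` at 0, the signal the caller uses for the blank-bubble fallback.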

_stream_gen now keeps only the SSE/HTTP plumbing (POST, iter_lines, data:/[DONE] framing, keepalive, blank-bubble fallback) plus the injected _resolve_skill_tag (budget + allowlist + dispatch). chat_completion shrinks from 466 to 379 LOC.
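The "budget + allowlist + dispatch" responsibilities of the injected resolver could be sketched roughly as follows. Everything here is hypothetical: the regex is a guess at what SKILL_TAG_RE might look like, `make_resolver` is an invented helper name, and returning an empty string for malformed, non-allowlisted, or over-budget tags is an assumption based on the PR's "resolved-to-empty" wording.

```python
import re
from typing import Callable, Dict

# Assumed pattern; the PR names SKILL_TAG_RE but does not show it.
SKILL_TAG_RE = re.compile(r"\[SKILL:([^:\]]+):([^\]]*)\]")


def make_resolver(
    skills: Dict[str, Callable[[str], str]],  # allowlist: skill name -> handler
    budget: int,                              # max skill calls per response
) -> Callable[[str], str]:
    """Build a resolve_skill_tag(raw) callback (illustrative only)."""
    remaining = budget

    def resolve_skill_tag(raw: str) -> str:
        nonlocal remaining
        match = SKILL_TAG_RE.fullmatch(raw)
        if match is None:
            return ""                 # malformed tag: drop it
        name, query = match.group(1), match.group(2)
        handler = skills.get(name)
        if handler is None or remaining <= 0:
            return ""                 # not allowlisted, or budget spent
        remaining -= 1
        return handler(query)         # dispatch to the skill

    return resolve_skill_tag
```

Keeping the budget counter inside the closure means each response gets a fresh allowance simply by building a new resolver per request.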

Behavior preserved exactly

Including the subtle quirks: a same-chunk </think> is dropped (token zeroed after <think>); think-adjacent text is emitted but not counted toward visible_chars; dropped (resolved-to-empty) tags still emit their empty frame; the 5000-char safety cap; cross-chunk tag assembly. The non-streaming post-LLM [SKILL:] path is untouched.
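As a point of comparison for the `<think>` quirk, a straightforward cross-chunk stripper might look like the sketch below. It is hypothetical (`ThinkStripper` is not a name from the PR) and deliberately does not reproduce the preserved quirk: it handles a same-chunk `</think>` cleanly instead of zeroing the rest of the token. It also assumes each marker arrives whole within a single token.

```python
from typing import Iterator

OPEN, CLOSE = "<think>", "</think>"


class ThinkStripper:
    """Drops <think>...</think> spans, even when they cross token boundaries."""

    def __init__(self) -> None:
        self.in_think = False

    def feed(self, token: str) -> Iterator[str]:
        rest = token
        while rest:
            if self.in_think:
                _, found, rest = rest.partition(CLOSE)
                if not found:
                    return            # still inside the block: drop everything
                self.in_think = False
            else:
                before, found, rest = rest.partition(OPEN)
                if before:
                    yield before      # visible text ahead of any <think>
                if found:
                    self.in_think = True
```

Feeding `"a<think>hid"`, `"den</think>b"`, and `"<think>x</think>c"` in sequence yields `a`, `b`, and `c`. The production buffer, by the PR's description, would instead drop the remainder of a token once it sees `<think>`.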

Bonus: SkillTagBuffer is the tested unit the deferred A-12 dashboard-stream migration needs — the dashboard stream can now consume codec_llm.stream()'s raw tokens through it.
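To make the intended wiring concrete, here is a rough sketch of pushing a raw token iterator through a buffer-like object and framing the result as SSE. The `Passthrough` stub, the `{"content": ...}` payload shape, and `FALLBACK_TEXT` are all assumptions for illustration; the real payload format and fallback wording are not given in this PR, and `codec_llm.stream()` is represented here only by a plain iterable of strings.

```python
import json
from typing import Iterable, Iterator, Protocol


class TokenBuffer(Protocol):
    """Shape the sketch relies on: feed/finish generators plus visible_chars."""
    visible_chars: int

    def feed(self, token: str) -> Iterator[str]: ...
    def finish(self) -> Iterator[str]: ...


class Passthrough:
    """Trivial stand-in so the sketch runs without the real module."""

    def __init__(self) -> None:
        self.visible_chars = 0

    def feed(self, token: str) -> Iterator[str]:
        self.visible_chars += len(token.strip())
        yield token

    def finish(self) -> Iterator[str]:
        return iter(())


FALLBACK_TEXT = "(no response)"  # hypothetical fallback bubble text


def sse_frames(tokens: Iterable[str], buf: TokenBuffer) -> Iterator[str]:
    """Frame cleaned fragments as SSE data lines, ending with [DONE]."""

    def frame(text: str) -> str:
        return "data: " + json.dumps({"content": text}) + "\n\n"

    for token in tokens:
        for fragment in buf.feed(token):
            yield frame(fragment)
    for fragment in buf.finish():
        yield frame(fragment)
    if buf.visible_chars == 0:
        yield frame(FALLBACK_TEXT)   # blank-bubble case: nothing visible sent
    yield "data: [DONE]\n\n"
```

The framing function only depends on the feed/finish/visible_chars surface, so the same loop could sit behind either the chat endpoint or the dashboard stream.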

Test plan

- tests/test_chat_stream.py: 13 tests covering passthrough; <think> cross-chunk + same-chunk-drop quirk; tag resolved / dropped (no leak) / assembled-across-tokens; non-tag bracket passthrough; prefix divergence; 5000-char cap; finish() flush; regex.
- Full suite: 1464 passed, 23 known-baseline failures, zero new, 74 skipped (the chat-path suites, step_budget and agents, are all green).
- Ruff: new module + test clean; codec_dashboard.py F-delta vs origin/main = 0.
- No skills/ touched, so no manifest regen.
- Manual (Mac Studio): a streaming chat reply renders; a [SKILL:...] the LLM emits resolves inline (no raw tag leak); an all-dropped-tags response shows the fallback bubble.

PR-3D complete

All three monoliths decomposed — A-7 Agent.run (#72), A-5 _dispatch_inner (#73), A-6 chat_completion (this PR).

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

AVADSA25 merged commit 8a64b9c into main May 22, 2026
1 check passed

AVADSA25 merged 1 commit into main from fix/pr3d-c-chat-stream
