Fix parsing failure for large JSON outputs split across buffers #50

reshab48 · 2025-06-28T09:50:51Z

This PR fixes parsing failures that occur when a large JSON response from the Claude Code CLI is split across multiple lines by stream buffering. It buffers the incomplete JSON, reconstructs it when the next line arrives, and defers yielding until the whole line parses, so that messages are neither dropped nor emitted twice.

Implementation Details

Core Problem

When JSON objects exceed buffer boundaries, they arrive split across multiple lines:
Line 1: {"type": "message", "data": "very long content that gets cut off mid-
Line 2: way through the JSON object"}
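
To make the failure concrete, here is a minimal sketch using only the standard json module. The two fragments mirror the example above; try_parse is a hypothetical helper, not part of the SDK. Neither fragment parses on its own, but their concatenation does.

```python
import json

# Two fragments of one JSON object, split mid-string as in the example above.
part1 = '{"type": "message", "data": "very long content that gets cut off mid-'
part2 = 'way through the JSON object"}'


def try_parse(text):
    """Hypothetical helper: return the parsed value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


first = try_parse(part1)            # fails: unterminated string
second = try_parse(part2)           # fails: does not start with a JSON value
joined = try_parse(part1 + part2)   # succeeds once the fragments are rejoined
```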

Solution Architecture

Incomplete JSON Detection

Catch json.JSONDecodeError exceptions raised while parsing each line
Distinguish incomplete JSON (starts with { or [) from genuinely malformed JSON
Treat input as incomplete only if the JSON appears to be cut off mid-stream
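
One way to express this heuristic is sketched below. classify_json_line and its three labels are hypothetical names introduced for illustration. The check is only a heuristic: a fragment that happens to be cut right after a nested } or ] would be classified as malformed rather than incomplete.

```python
import json


def classify_json_line(line):
    """Hypothetical helper: return 'complete', 'incomplete', or 'malformed'."""
    text = line.strip()
    try:
        json.loads(text)
        return "complete"
    except json.JSONDecodeError:
        pass
    opens = text.startswith(("{", "["))    # looks like the start of an object or array
    closes = text.endswith(("}", "]"))     # looks like it was terminated
    if opens and not closes:
        return "incomplete"                # probably cut off mid-stream
    return "malformed"                     # complete-looking but invalid, or not JSON at all
```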

Buffer Reconstruction System

incomplete_json_line_str = None  # Persistent buffer across iterations

if incomplete_json_line_str:
    # Reconstruct: previous incomplete + current line
    line_str = incomplete_json_line_str + line_str

Deferred Yielding Pattern

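Because reconstruction re-parses the entire buffered line, objects that already parsed successfully on an earlier pass are parsed again. Yielding must therefore be deferred until the whole line parses: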

parsed_json_outputs = []  # Collection phase

# Parse all JSON objects in the line
for json_line in line_str.split("\n"):
    try:
        data = json.loads(json_line)
        incomplete_json_line_str = None  # Clear buffer on success
        parsed_json_outputs.append(data)  # Collect, don't yield
    except json.JSONDecodeError:
        if json_line.startswith("{"):
            incomplete_json_line_str = line_str  # Buffer entire line
            break  # Stop processing this line

# Yielding phase: only when no incomplete JSON is pending

if not incomplete_json_line_str:
    for json_output_data in parsed_json_outputs:
        yield json_output_data
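
Putting the fragments above together, the pattern might look like the following self-contained generator. This is a sketch under stated assumptions, not the code from this PR: iter_json_messages, the chunks argument, and the ValueError raised at end of stream are all hypothetical, and the cut-off test is the startswith/endswith heuristic described earlier.

```python
import json


def iter_json_messages(chunks):
    """Yield parsed JSON objects from an iterable of text chunks.

    Hypothetical sketch of the buffer-and-defer pattern, not the PR's code.
    """
    pending = None  # whole line held back while its last object is incomplete
    for chunk in chunks:
        line_str = chunk.rstrip("\n")
        if not line_str:
            continue
        if pending is not None:
            line_str = pending + line_str  # reconstruct: buffered text + new chunk
            pending = None

        parsed = []  # collection phase: nothing is yielded yet
        for json_line in line_str.split("\n"):
            if not json_line.strip():
                continue
            try:
                parsed.append(json.loads(json_line))
            except json.JSONDecodeError:
                cut_off = json_line.startswith(("{", "[")) and not json_line.endswith(("}", "]"))
                if not cut_off:
                    raise  # malformed but complete: keep the error visible
                pending = line_str  # buffer the entire line
                break

        if pending is None:  # yielding phase
            yield from parsed

    if pending is not None:
        # Assumption: a stream that ends mid-object is an error, not silent loss.
        raise ValueError("stream ended with incomplete JSON in the buffer")


chunks = ['{"id": 1}\n{"id": 2}\n{"id": 3, "text": "lar', 'ge"}']
messages = list(iter_json_messages(chunks))
```

Raising at end of stream is a design choice made for this sketch; without it, a final object that never completes would disappear without any signal.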

Multi-Iteration Processing Flow

Iteration 1 (Line: valid_json1\nvalid_json2\nincomplete_large_json_part1):

Parse valid_json1 ✅ → collect
Parse valid_json2 ✅ → collect
Parse incomplete_large_json_part1 ❌ → buffer entire line
Skip yielding phase (incomplete JSON pending)

Iteration 2 (Line: large_json_part2):

Reconstruct: valid_json1\nvalid_json2\nincomplete_large_json_part1large_json_part2
Parse valid_json1 ✅ → collect
Parse valid_json2 ✅ → collect
Parse complete_large_json ✅ → collect, clear buffer
Yield all three messages
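
The flow above also shows why yielding has to be deferred. The sketch below runs the same two iterations twice, once deferred and once eagerly, and the eager run emits the first two messages twice. parse_chunks and its defer flag are hypothetical, and for brevity every decode error is treated as incomplete input.

```python
import json


def parse_chunks(chunks, defer=True):
    """Hypothetical parser; defer=False emits eagerly to expose the duplication."""
    out = []        # stands in for the yielded messages
    pending = None  # entire line buffered while its last object is incomplete
    for chunk in chunks:
        line_str = (pending or "") + chunk
        pending = None
        parsed = []
        for piece in line_str.split("\n"):
            try:
                data = json.loads(piece)
            except json.JSONDecodeError:
                # Simplification: any decode error is treated as incomplete input.
                pending = line_str
                break
            parsed.append(data)
            if not defer:
                out.append(data)  # eager: emitted even though the line is re-parsed later
        if defer and pending is None:
            out.extend(parsed)    # deferred: emitted once, after the whole line parses
    return out


chunks = ['{"id": 1}\n{"id": 2}\n{"id": 3, "text": "lar', 'ge"}']
deferred = parse_chunks(chunks)
eager = parse_chunks(chunks, defer=False)
```

With these inputs, deferred contains three messages, while eager contains five because ids 1 and 2 are emitted in both iterations.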

Key Design Decisions

Line-level buffering: Buffer the entire line (not just the incomplete JSON) to handle mixed
valid/invalid content
Break on incomplete: Stop processing the current line when incomplete JSON is detected
Conditional yielding: Only yield when no incomplete JSON is pending reconstruction
Buffer reset: Clear the incomplete buffer only after successful parsing

Error Handling

Malformed but complete JSON (ends with } or ]) still raises CLIJSONDecodeError
Only buffer JSON that appears genuinely incomplete due to stream splitting
Preserve existing error semantics for debugging
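
A rough sketch of how these rules could fit together follows. The CLIJSONDecodeError class here is a local stand-in defined only so the example runs; the SDK's real class and its constructor may differ. parse_or_buffer is likewise a hypothetical name.

```python
import json


class CLIJSONDecodeError(Exception):
    """Stand-in for the SDK's error type; the real class may differ."""

    def __init__(self, line, original_error):
        super().__init__(f"Failed to decode JSON: {line[:100]}")
        self.line = line
        self.original_error = original_error


def parse_or_buffer(json_line):
    """Return (data, None) on success, or (None, fragment) when input looks cut off."""
    try:
        return json.loads(json_line), None
    except json.JSONDecodeError as exc:
        looks_cut_off = json_line.startswith(("{", "[")) and not json_line.endswith(("}", "]"))
        if looks_cut_off:
            return None, json_line  # caller buffers the fragment for the next line
        # Malformed but complete: surface the error and keep the original cause.
        raise CLIJSONDecodeError(json_line, exc) from exc
```

Chaining with `from exc` keeps the original json.JSONDecodeError attached as `__cause__`, which preserves the position information useful for debugging.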

Impact

This implementation ensures that large JSON responses are never lost due to buffer boundaries
while maintaining the integrity of the message stream and preventing duplicate outputs during
the reconstruction process.

ltawfik · 2025-07-01T05:25:36Z

Thanks, closing in favor of #53 which provides a cleaner solution without the critical bugs (infinite loop, silent data loss) found in this implementation.

Commit a05469f: Fix parsing failure for large JSON outputs split across buffer boundaries

ltawfik closed this Jul 1, 2025
