
Conversation

@grytrn commented Jun 21, 2025

Summary

Fixes #32 - JSON parsing fails at position 130 for large tool results

This PR replaces anyio.TextReceiveStream with raw byte stream reading to fix a critical bug where the SDK fails to parse JSON responses larger than ~10KB that contain multibyte UTF-8 characters.

The Problem

When Claude reads files or generates large tool results, the SDK consistently fails with:

json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 131 (char 130)

The issue occurs because TextReceiveStream appears to corrupt large lines containing multibyte UTF-8 characters, such as the arrow symbol (→) used in line numbers.

The Solution

This fix:

  • Removes the dependency on TextReceiveStream
  • Reads stdout/stderr directly as byte streams
  • Implements manual line buffering with proper UTF-8 decoding
  • Handles remaining buffer content after stream closure (see the sketch below)
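
A minimal sketch of this approach, assuming the transport exposes stdout as an async iterable of byte chunks (the helper name read_json_lines is illustrative, not the SDK's actual internals):

async def read_json_lines(stream):
    """Yield complete UTF-8 lines from an async iterable of byte chunks.

    Illustrative only; not the SDK's actual implementation.
    """
    buffer = b""
    async for chunk in stream:
        buffer += chunk
        # A single chunk may carry zero, one, or many newlines.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield line.decode("utf-8")
    # Handle whatever remains in the buffer after the stream closes.
    if buffer.strip():
        yield buffer.decode("utf-8")

Decoding only complete lines also sidesteps the multibyte-split hazard: a UTF-8 multibyte sequence never contains the newline byte 0x0A, so splitting the buffer on b"\n" can never cut a character in half.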

Testing

Tested with:

  • ✅ Large file reads (>30KB)
  • ✅ Files containing UTF-8 characters and emojis
  • ✅ Multiple concurrent tool uses
  • ✅ Various line ending formats
  • ✅ Incomplete JSON at stream end (exercised in the check below)
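
The trailing-buffer case, for example, can be exercised with a standalone check along these lines (hypothetical test code, reusing the illustrative read_json_lines helper sketched above):

import asyncio

class FakeStream:
    """Async-iterable stand-in for a subprocess stdout that closes mid-line."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return next(self._chunks)
        except StopIteration:
            raise StopAsyncIteration

async def check_tail_flush():
    # The second JSON object arrives without a trailing newline.
    stream = FakeStream([b'{"a": 1}\n{"b"', b': 2}'])
    lines = [line async for line in read_json_lines(stream)]
    assert lines == ['{"a": 1}', '{"b": 2}']

asyncio.run(check_tail_flush())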

Reproduction

The bug can be reproduced by asking Claude to read any Python file larger than 10KB:

import asyncio

from claude_code_sdk import query, ClaudeCodeOptions

async def main():
    async for msg in query(
        prompt="Read the file whatsapp_claude_chat.py",
        options=ClaudeCodeOptions(allowed_tools=["*"]),
    ):
        print(msg)  # Fails with JSONDecodeError on the large result

asyncio.run(main())

Technical Details

The root cause appears to be in how anyio.TextReceiveStream handles buffering for large lines. By reading raw bytes and manually handling line splitting and UTF-8 decoding, we avoid this issue entirely.
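
To illustrate the suspected failure mode (an illustration only; the PR does not pin down TextReceiveStream's internals): if a fixed-size read splits the three bytes of "→" across two chunks, decoding each chunk independently fails, while an incremental decoder, or the line-complete decoding used here, carries the partial bytes forward:

import codecs

data = "42→ some code\n".encode("utf-8")
first, second = data[:4], data[4:]  # cuts through "→" (bytes e2 86 92)

# Naive per-chunk decoding fails on the partial character:
#   first.decode("utf-8")  ->  UnicodeDecodeError: unexpected end of data

# An incremental decoder holds the partial bytes until the next chunk:
decoder = codecs.getincrementaldecoder("utf-8")()
text = decoder.decode(first) + decoder.decode(second)
assert text == "42→ some code\n"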

@tim-watcha

This does work for me! Thank you

@ltawfik (Collaborator) commented Jun 27, 2025

Thanks for the PR! This issue was already fixed in commit 97c651b, which removed TextReceiveStream and implemented raw byte stream reading for the same problem (issue #32).

@ltawfik (Collaborator) commented Jun 28, 2025

Thanks @grytrn. This issue was already fixed in PR #5; the current implementation correctly handles multiple JSON objects on one line.
