Bug Description
The Query.close() method in _internal/query.py can hang indefinitely
if tasks in the anyio task group don't properly respond to cancellation.
This causes anyio's _deliver_cancellation() to spin at 100% CPU.
Affected Code
src/claude_code_sdk/_internal/query.py lines 550-558 (v0.1.10):
async def close(self) -> None:
    """Close the query and transport."""
    self._closed = True
    if self._tg:
        self._tg.cancel_scope.cancel()
        # Wait for task group to complete cancellation
        with suppress(anyio.get_cancelled_exc_class()):
            await self._tg.__aexit__(None, None, None)  # ⚠️ NO TIMEOUT
    await self.transport.close()
Root Cause
The call on line 557, await self._tg.__aexit__(None, None, None), has no
timeout. If any task in the task group does not respond to cancellation
(e.g., it is stuck in I/O or waiting on an external resource), this call
hangs indefinitely.
When this happens:
1. anyio's _deliver_cancellation() runs in a busy loop trying to deliver
the cancellation
2. This consumes 100%+ CPU indefinitely
3. The caller's event loop becomes unresponsive
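The retry behaviour described above can be illustrated with a small stdlib-only sketch. It is not anyio's implementation: stubborn_worker, max_ignores, and demo are hypothetical names, and the worker is written to give up after a few swallowed cancellations so that the example terminates. A task that never gives up would keep the retry loop spinning.

```python
import asyncio


async def stubborn_worker(ignored: list[int], max_ignores: int = 3) -> None:
    """Hypothetical task that swallows cancellation a few times before exiting."""
    while True:
        try:
            await asyncio.sleep(3600)
        except asyncio.CancelledError:
            ignored.append(1)
            if len(ignored) >= max_ignores:
                raise  # finally honour the cancellation
            # Otherwise swallow it and keep running: this is the misbehaviour.


async def demo() -> tuple[int, int]:
    """Re-deliver cancellation until the worker exits; return the counts."""
    ignored: list[int] = []
    task = asyncio.create_task(stubborn_worker(ignored))
    await asyncio.sleep(0)  # let the worker start and block in sleep()

    attempts = 0
    while not task.done():
        task.cancel()           # deliver (or re-deliver) the cancellation
        attempts += 1
        await asyncio.sleep(0)  # yield so the worker can react

    return attempts, len(ignored)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Each pass through the loop costs almost nothing, but it never blocks, so a task that keeps swallowing CancelledError turns the loop into a CPU-bound spin. That is consistent with the profile reported in this issue.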
How We Discovered This
We're using ClaudeSDKClient in a JupyterLab extension. After calling
client.disconnect() (which calls Query.close()), the Python process would
sometimes spike to 100-150% CPU usage.
Using py-spy profiling, we found:
- _deliver_cancellation from anyio/_backends/_asyncio.py consuming 66% of
CPU
- current_task from asyncio/tasks.py consuming 35% of CPU
Suggested Fix
Add a timeout to the task group cleanup:
async def close(self) -> None:
    """Close the query and transport."""
    self._closed = True
    if self._tg:
        self._tg.cancel_scope.cancel()
        with suppress(anyio.get_cancelled_exc_class()):
            try:
                # Add timeout to prevent indefinite hang
                with anyio.fail_after(5.0):
                    await self._tg.__aexit__(None, None, None)
            except TimeoutError:
                logger.warning("Task group cleanup timed out after 5s")
    await self.transport.close()
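For comparison, here is a stdlib-only sketch of the same idea: cancel, wait a bounded time, then report whatever is still running. It uses asyncio.wait rather than anyio.fail_after, so it is a different technique from the patch above and is not SDK code. The names cancel_and_wait, polite, stubborn, and demo are hypothetical.

```python
import asyncio


async def cancel_and_wait(
    tasks: set[asyncio.Task], timeout: float = 5.0
) -> set[asyncio.Task]:
    """Cancel every task, wait up to `timeout` seconds, return the stragglers."""
    if not tasks:
        return set()
    for task in tasks:
        task.cancel()
    # asyncio.wait() does not cancel or await the tasks that are still
    # pending when the timeout expires; it simply hands them back.
    _done, pending = await asyncio.wait(tasks, timeout=timeout)
    return pending


async def polite() -> None:
    await asyncio.sleep(3600)  # exits as soon as it is cancelled


async def stubborn() -> None:
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        pass  # swallow the first cancellation (the misbehaviour)
    await asyncio.sleep(3600)  # a second cancellation would end the task here


async def demo() -> list[str]:
    tasks = {
        asyncio.create_task(polite(), name="polite"),
        asyncio.create_task(stubborn(), name="stubborn"),
    }
    await asyncio.sleep(0)  # let both tasks start before cancelling
    pending = await cancel_and_wait(tasks, timeout=0.1)
    return sorted(task.get_name() for task in pending)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning the stragglers lets the caller log which tasks ignored cancellation instead of blocking on them. Whether abandoning them is acceptable depends on what they hold open.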
Workaround
Callers can wrap disconnect() with a timeout:
try:
    await asyncio.wait_for(client.disconnect(), timeout=5.0)
except asyncio.TimeoutError:
    logger.warning("Client disconnect timed out")
finally:
    client = None
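The workaround can be packaged as a small helper, sketched below. FakeClient is a hypothetical stand-in that models only disconnect(), and disconnect_with_timeout is an illustrative name, not part of the SDK.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class FakeClient:
    """Hypothetical stand-in for ClaudeSDKClient; only disconnect() is modelled."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    async def disconnect(self) -> None:
        # Simulates cleanup that takes `delay` seconds and honours cancellation.
        await asyncio.sleep(self.delay)


async def disconnect_with_timeout(client, timeout: float = 5.0) -> bool:
    """Return True if disconnect() finished in time, False if it timed out."""
    try:
        await asyncio.wait_for(client.disconnect(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        logger.warning("Client disconnect timed out after %.1fs", timeout)
        return False


if __name__ == "__main__":
    print(asyncio.run(disconnect_with_timeout(FakeClient(0.0), timeout=0.5)))
    print(asyncio.run(disconnect_with_timeout(FakeClient(10.0), timeout=0.05)))
```

One caveat: on timeout, asyncio.wait_for cancels the wrapped call and waits for that cancellation to finish, so the total wait can exceed the timeout if disconnect() itself is slow to respond to cancellation.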
Environment
- claude-agent-sdk version: 0.1.10
- anyio version: 4.11.0
- Python: 3.11.4
- Platform: Linux (Amazon ECS/Fargate)