
perf: use deque for async response chunk iteration #1362

Open
giulio-leone wants to merge 1 commit into simonw:main from giulio-leone:fix/async-iter-deque-performance

@giulio-leone

Problem

When the response is already done, AsyncResponse.__anext__() replays cached chunks using list.pop(0), which is O(n) per removal because every remaining element has to shift left. Replaying a long response with n chunks therefore costs O(n²) overall.

Solution

Switch _iter_chunks from list to collections.deque, replacing .pop(0) with .popleft() for O(1) front removal.
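For illustration, here is a minimal sketch of the replay pattern with a deque. `ReplayingResponse` and `collect` are hypothetical stand-ins, not the actual code in llm/models.py; only the `_chunks` / `_iter_chunks` names come from this PR, and the sketch assumes the response is already done.

```python
import asyncio
from collections import deque


class ReplayingResponse:
    """Hypothetical stand-in for AsyncResponse; models only the replay path."""

    def __init__(self, chunks):
        # Cached chunks of a response that has already finished.
        self._chunks = list(chunks)

    def __aiter__(self):
        # Copy the cache into a deque so front removal is O(1).
        self._iter_chunks = deque(self._chunks)
        return self

    async def __anext__(self):
        if self._iter_chunks:
            return self._iter_chunks.popleft()
        raise StopAsyncIteration


async def collect(response):
    # Drain the async iterator into a list.
    return [chunk async for chunk in response]


chunks = [f"chunk-{i}" for i in range(5)]
replayed = asyncio.run(collect(ReplayingResponse(chunks)))
```

Because the deque is a copy of the cached chunks, draining it leaves the cache intact, so the response can be replayed again.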

Changes

  • llm/models.py:
    • Import deque from collections
    • _iter_chunks = deque(self._chunks) instead of list(...)
    • .pop(0) → .popleft()

Testing

  • Syntax verified via ast.parse()
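A behavioural check could complement the syntax check. The sketch below is not part of this PR's test suite, and `drain_list` and `drain_deque` are hypothetical helpers. It only confirms that draining from the front yields the same order with either container.

```python
from collections import deque


def drain_list(chunks):
    # Old approach: pop(0) shifts every remaining element, O(n) per call.
    pending = list(chunks)
    out = []
    while pending:
        out.append(pending.pop(0))
    return out


def drain_deque(chunks):
    # New approach: popleft() removes from the front in O(1).
    pending = deque(chunks)
    out = []
    while pending:
        out.append(pending.popleft())
    return out


sample = [f"chunk-{i}" for i in range(1000)]
list_order = drain_list(sample)
deque_order = drain_deque(sample)
```

Both helpers should return the chunks in their original order, so the change affects only the cost of replay, not its result.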

AsyncResponse.__anext__() replays cached chunks via list.pop(0) when
the response is already done, which is O(n) per removal.  Switch to
collections.deque with popleft() for O(1).
@giulio-leone giulio-leone force-pushed the fix/async-iter-deque-performance branch from 553ccca to 3f80a2a Compare February 28, 2026 14:41
@giulio-leone
Author

Friendly ping — CI is green and this is ready for review. Happy to address any feedback. Thanks!
