fix(cli): stream_reasoning_engine raises StopIteration RuntimeError on sync generators #6110
Closed
surajit-1306 wants to merge 1 commit into
The sync-to-async adapter _aiter_from_iter relied on catching StopIteration from next(iterator). Because next() runs in a threadpool via run_in_threadpool, the StopIteration propagates out of the coroutine frame and Python (PEP 479) converts it to RuntimeError before the except clause runs, making it dead code. Every synchronous streaming class_method (e.g. stream_query) ended with a RuntimeError after the final chunk. Use a sentinel default with next(iterator, sentinel) so iterator exhaustion never raises across the await boundary. Adds a regression test exercising the sync-generator path of /api/stream_reasoning_engine.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Link to Issue or Description of Change
Closes: #6093
Problem:
On Agent Engine deployments served by the ADK API server, every call to the
`/api/stream_reasoning_engine` route with a synchronous streaming `class_method` (e.g.
`stream_query`) ends with `RuntimeError: coroutine raised StopIteration` after the last chunk is streamed.
The cause is the sync-to-async adapter
`_aiter_from_iter` in `src/google/adk/cli/fast_api.py` (lines 916–922 in v2.2.0). Its
`except StopIteration` is unreachable. When the iterator is exhausted, `next()` raises `StopIteration` inside the worker thread, anyio sets it on a future, and it propagates out of the
`run_in_threadpool` coroutine frame. Python (PEP 479) forbids
`StopIteration` escaping a coroutine and converts it to
`RuntimeError("coroutine raised StopIteration")` before the `except` clause ever sees it.
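The conversion rule can be observed in isolation, without threads or the server. The following is a minimal sketch only: `call_next` and `probe` are hypothetical names, and `call_next` merely stands in for an awaited call whose coroutine frame lets `StopIteration` escape. It is not the code in `fast_api.py`.

```python
import asyncio


async def call_next(iterator):
    # Hypothetical stand-in for the awaited call: next() raises StopIteration
    # on an exhausted iterator, and the exception escapes this coroutine frame.
    return next(iterator)


async def probe():
    try:
        await call_next(iter([]))  # the iterator is already exhausted
    except StopIteration:
        # Never reached: the interpreter has already replaced the exception.
        return "caught StopIteration"
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


result = asyncio.run(probe())
print(result)  # RuntimeError: coroutine raised StopIteration
```

The `except StopIteration` branch is skipped because the exception that reaches the caller is already a `RuntimeError`.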
Affected versions: Regression introduced in v2.2.0 — the route and the
buggy adapter were added in the same commit. Not present in the v1.x line
(verified absent at v1.35.0).
Solution:
Stop relying on `StopIteration` crossing the await boundary; use a sentinel default,
`next(iterator, sentinel)`, so iterator exhaustion never raises across it.
This is the minimal, idiomatic fix; the stream now terminates cleanly when the
sync generator is exhausted.
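A sentinel-based adapter might look like the sketch below. It is an illustration under stated assumptions rather than the patch itself: `aiter_from_iter_sentinel`, `_SENTINEL`, and the two-chunk `stream_query` generator are hypothetical, and `asyncio.to_thread` (Python 3.9+) stands in for `run_in_threadpool` so the example needs only the standard library.

```python
import asyncio

_SENTINEL = object()  # hypothetical name; a unique marker for exhaustion


async def aiter_from_iter_sentinel(iterator):
    """Adapt a sync iterator to an async generator without StopIteration."""
    while True:
        # next() returns the sentinel instead of raising when the iterator
        # is exhausted, so no StopIteration ever reaches the await.
        item = await asyncio.to_thread(next, iterator, _SENTINEL)
        if item is _SENTINEL:
            return
        yield item


def stream_query():
    # Hypothetical synchronous streaming method yielding two chunks.
    yield "chunk-1"
    yield "chunk-2"


async def collect():
    return [chunk async for chunk in aiter_from_iter_sentinel(stream_query())]


chunks = asyncio.run(collect())
print(chunks)  # ['chunk-1', 'chunk-2']
```

A fresh `object()` is used as the sentinel so that no legitimate chunk, including `None`, can be mistaken for the end of the stream.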
Testing Plan
Unit Tests:
Added
`test_gemini_stream_reasoning_engine_sync_generator` plus a `test_app_with_gemini_enterprise_sync_stream` fixture in `tests/unittests/cli/test_fast_api.py`. The pre-existing stream test used an async generator (the
`isasyncgenfunction` branch) and never exercised the buggy sync-generator path. The new test fails on the unpatched code with
`RuntimeError` and passes with the fix.
pytest summary:
Manual End-to-End (E2E) Tests:
The failure and the fix reproduce standalone in ~15 lines, independent of any
model or deployment:
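A sketch of what such a reproduction might look like follows; it is not the author's script. `run_in_thread_stand_in` is a hypothetical substitute for `run_in_threadpool` that lets `StopIteration` escape an awaited coroutine frame without requiring Starlette, and `buggy_aiter_from_iter` only mirrors the adapter pattern described under Problem.

```python
import asyncio


async def run_in_thread_stand_in(func, *args):
    # Hypothetical stand-in for run_in_threadpool: calls func inline, so a
    # StopIteration from next() escapes this coroutine frame.
    return func(*args)


async def buggy_aiter_from_iter(iterator):
    while True:
        try:
            yield await run_in_thread_stand_in(next, iterator)
        except StopIteration:
            # Dead code: a RuntimeError arrives here instead.
            break


def stream_query():
    # Hypothetical synchronous streaming method yielding two chunks.
    yield "chunk-1"
    yield "chunk-2"


async def main():
    seen = []
    try:
        async for chunk in buggy_aiter_from_iter(stream_query()):
            seen.append(chunk)
    except RuntimeError as exc:
        return seen, str(exc)
    return seen, None


seen, error = asyncio.run(main())
print(seen)   # ['chunk-1', 'chunk-2']
print(error)  # coroutine raised StopIteration
```

Both chunks are delivered before the failure, which matches the observation that the error appears only after the final chunk.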
With the sentinel version above, the same script prints the two chunks and
exits cleanly with no exception. Originally observed on a live Vertex AI Agent
Engine deployment (google-adk==2.2.0, Python 3.11) where every
`stream_query` call logged the `RuntimeError` after the final chunk.
Checklist
Additional context
Original server traceback:
Occurs 100% of the time on every sync streaming request once the generator is
exhausted. The bug is model-agnostic (purely in the FastAPI streaming adapter).