Commit 3b997a0

hangfei authored and copybara-github committed
fix: change LlmResponse to use Content for transcriptions
The transcription change breaks multi-agent transfer during live/bidi, and the transcription feature is not fully ready to be used yet, so this rolls that change back. Updates `GeminiLlmConnection` to populate the `content` field of `LlmResponse` with `types.Content` and `types.Part` objects for both input and output transcriptions, instead of using the dedicated transcription fields. Also removes a debug print from `audio_cache_manager.py`.

PiperOrigin-RevId: 799851950
1 parent bcf0dda · commit 3b997a0
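In effect, an input transcription now arrives as a user-role `Content` on `llm_response.content`, and an output transcription arrives as partial model-role `Content`, instead of through dedicated `input_transcription`/`output_transcription` fields. A minimal consumer sketch under that assumption (the on_* handlers are hypothetical placeholders, not part of the ADK API):

# Hypothetical consumer of GeminiLlmConnection.receive(); the two
# on_* handlers are placeholders, not ADK APIs.
async def consume(connection):
  async for llm_response in connection.receive():
    content = llm_response.content
    if not content or not content.parts or not content.parts[0].text:
      continue
    if content.role == 'user':
      # Transcription of the user's input audio.
      on_user_transcript(content.parts[0].text)
    elif content.role == 'model' and llm_response.partial:
      # One streamed chunk of the model's output transcription.
      on_model_transcript_chunk(content.parts[0].text)

Note that ordinary partial model text can arrive through the same field, so a caller that must tell the two apart has to rely on surrounding control signals rather than the response shape alone.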

File tree

2 files changed: +13 -3 lines

src/google/adk/flows/llm_flows/audio_cache_manager.py
Lines changed: 0 additions & 1 deletion

@@ -141,7 +141,6 @@ async def _flush_cache_to_services(
     Returns:
       True if the cache was successfully flushed, False otherwise.
     """
-    print('flush cache')
     if not invocation_context.artifact_service or not audio_cache:
       logger.debug('Skipping cache flush: no artifact service or empty cache')
       return False

src/google/adk/models/gemini_llm_connection.py
Lines changed: 13 additions & 2 deletions

@@ -164,8 +164,14 @@ async def receive(self) -> AsyncGenerator[LlmResponse, None]:
             message.server_content.input_transcription
             and message.server_content.input_transcription.text
         ):
+          user_text = message.server_content.input_transcription.text
+          parts = [
+              types.Part.from_text(
+                  text=user_text,
+              )
+          ]
           llm_response = LlmResponse(
-              input_transcription=message.server_content.input_transcription,
+              content=types.Content(role='user', parts=parts)
           )
           yield llm_response
         if (
@@ -180,8 +186,13 @@ async def receive(self) -> AsyncGenerator[LlmResponse, None]:
           # We rely on other control signals to determine when to yield the
           # full text response(turn_complete, interrupted, or tool_call).
           text += message.server_content.output_transcription.text
+          parts = [
+              types.Part.from_text(
+                  text=message.server_content.output_transcription.text
+              )
+          ]
           llm_response = LlmResponse(
-              output_transcription=message.server_content.output_transcription
+              content=types.Content(role='model', parts=parts), partial=True
           )
           yield llm_response
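The `partial=True` flag matters on the output path: transcription text streams in chunks while the local `text` variable accumulates the running total, and control signals (turn_complete, interrupted, or tool_call) mark the end of the turn. A sketch of stitching those chunks back together on the consumer side, again with hypothetical names:

# Sketch: accumulate partial model-role transcription chunks into one
# string; `responses` stands in for the stream yielded by receive().
async def collect_model_transcript(responses) -> str:
  transcript = ''
  async for r in responses:
    if r.partial and r.content and r.content.role == 'model':
      for part in r.content.parts or []:
        if part.text:
          transcript += part.text
  return transcript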
