Commit c15c885

Python: Emit partial result for magentic pattern when retrieving final result, if available (#12656)
### Motivation and Context

When using the magentic orchestration pattern, if the `max_round_count` is hit, the final result shows as:

```
Final result: Max round count reached.
```

The interim messages are still delivered through the `agent_response_callback`; however, not everyone has that callback configured. We should return more meaningful results, even if partial. This PR attempts to retrieve a partial result, if one exists.

When one calls:

```python
value = await orchestration_result.get()
```

for the `step5_magentic.py` sample with `max_round_count=1`, they should receive the partial result:

```
Final result: Based on the available data, here is a comparison of the estimated training and inference energy consumption for ResNet-50, BERT-base, and GPT-2, along with the associated CO₂ emissions when training on an Azure Standard_NC6s_v3 VM for 24 hours.

**Model Architectures and Datasets:**
- **ResNet-50**: Image classification model trained on ImageNet.
- **BERT-base**: Text classification model fine-tuned on the GLUE benchmark.
- **GPT-2**: Text generation model trained on WebText.

... <rest omitted for brevity> ...
```

### Description

Return partial results for magentic orchestration if they exist.

- Closes #12625

### Contribution Checklist

- [X] The code builds clean without any errors or warnings
- [X] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [X] All unit tests pass, and I have added new tests where possible
- [X] I didn't break anyone 😄

1 parent ba000c3 commit c15c885

3 files changed: +60 -44 lines changed

python/semantic_kernel/agents/orchestration/magentic.py

Lines changed: 23 additions & 10 deletions
@@ -647,19 +647,32 @@ async def _check_within_limits(self) -> bool:
         if self._context is None:
             raise RuntimeError("The Magentic manager is not started yet. Make sure to send a start message first.")

-        if (
+        hit_round_limit = (
             self._manager.max_round_count is not None and self._context.round_count >= self._manager.max_round_count
-        ) or (self._manager.max_reset_count is not None and self._context.reset_count > self._manager.max_reset_count):
-            message = (
-                "Max round count reached."
-                if self._manager.max_round_count and self._context.round_count >= self._manager.max_round_count
-                else "Max reset count reached."
+        )
+        hit_reset_limit = (
+            self._manager.max_reset_count is not None and self._context.reset_count > self._manager.max_reset_count
+        )
+
+        if hit_round_limit or hit_reset_limit:
+            limit_type = "round" if hit_round_limit else "reset"
+            logger.debug(f"Max {limit_type} count reached.")
+
+            # Retrieve the latest assistant content produced so far
+            partial_result = next(
+                (m for m in reversed(self._context.chat_history.messages) if m.role == AuthorRole.ASSISTANT),
+                None,
             )
-            logger.debug(message)
-            if self._result_callback:
-                await self._result_callback(
-                    ChatMessageContent(role=AuthorRole.ASSISTANT, content=message, name=self.__class__.__name__)
+            if partial_result is None:
+                partial_result = ChatMessageContent(
+                    role=AuthorRole.ASSISTANT,
+                    content=f"Stopped because the maximum {limit_type} limit was reached. No partial result available.",
+                    name=self.__class__.__name__,
                 )
+
+            if self._result_callback:
+                await self._result_callback(partial_result)
+
             return False

         return True
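
The selection logic in this change can be tried in isolation. The following is a minimal sketch, not the library's implementation: `Role`, `Message`, `hit_limit`, and `pick_result` are hypothetical stand-ins for `AuthorRole`, `ChatMessageContent`, and the body of `_check_within_limits`, written so the example runs without `semantic_kernel` installed. It mirrors the two comparisons from the diff (`>=` for rounds, `>` for resets) and the "latest assistant message, else fallback notice" rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    # Hypothetical stand-in for semantic_kernel's AuthorRole.
    USER = "user"
    ASSISTANT = "assistant"


@dataclass
class Message:
    # Hypothetical stand-in for ChatMessageContent.
    role: Role
    content: str


def hit_limit(
    round_count: int,
    reset_count: int,
    max_rounds: Optional[int],
    max_resets: Optional[int],
) -> Optional[str]:
    """Return "round", "reset", or None, using the same comparisons as the diff."""
    # The round limit trips when the count reaches the maximum (>=).
    if max_rounds is not None and round_count >= max_rounds:
        return "round"
    # The reset limit trips only once the count exceeds the maximum (>).
    if max_resets is not None and reset_count > max_resets:
        return "reset"
    return None


def pick_result(history: list[Message], limit_type: str) -> Message:
    """Return the newest assistant message, or a fallback notice if none exists."""
    latest = next((m for m in reversed(history) if m.role is Role.ASSISTANT), None)
    if latest is not None:
        return latest
    return Message(
        Role.ASSISTANT,
        f"Stopped because the maximum {limit_type} limit was reached. No partial result available.",
    )
```

With a history that ends in a user message, `pick_result` still walks backwards to the most recent assistant message, so a caller gets the last draft produced before the limit was hit rather than a bare status string.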

python/tests/unit/agents/orchestration/test_magentic.py

Lines changed: 5 additions & 2 deletions
@@ -425,7 +425,8 @@ async def test_invoke_with_max_round_count_exceeded():
         finally:
             await runtime.stop_when_idle()

-        assert result.content == "Max round count reached."
+        # Partial result will be returned when max round count is exceeded.
+        assert result.content == mock_get_chat_message_content.return_value.content
         assert mock_invoke_stream.call_count == 1
         # Planning will be called once, so the facts and plan will be created once.
         assert mock_get_chat_message_content.call_count == 2
@@ -472,7 +473,9 @@ async def test_invoke_with_max_reset_count_exceeded():
         finally:
             await runtime.stop_when_idle()

-        assert result.content == "Max reset count reached."
+        # Partial result will be returned when max reset count is exceeded. The test emits content based on the prompt
+        # so check that the content is not None and not an exact match to a mock response.
+        assert result.content is not None
         assert mock_invoke_stream.call_count == 1
         # Planning and replanning will be each called once, so the facts and plan will be created twice.
         assert mock_get_chat_message_content.call_count == 4
