[Bug]: Gen AI Eval SDK run_inference() fails when evaluating Multi-Agent

### File Name

https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/evaluation/create_agent_and_run_evaluation.ipynb

### What happened?

## Summary

The `client.evals.run_inference()` method in the Gen AI Evaluation SDK fails to evaluate 
agents that use Multi-Agent architecture.

## Environment

- **SDK Version**: google-cloud-aiplatform[evaluation] (latest)
- **Agent Framework**: Google ADK (Agent Development Kit)
- **Agent Structure**: Multi-Agent with sub_agents (root_agent → search_agent, reservation_agent)
- **Tools**: MCP Toolbox via `toolbox_core` library
- **Deployment**: Vertex AI Agent Engine (Reasoning Engine)
- **Region**: us-central1

## Steps to Reproduce

1. Deploy a Multi-Agent using ADK with the following structure:
   mcp_toolset = McpToolset(toolbox_server_url=TOOLBOX_SERVER_URL)
   
   search_agent = Agent(
       name="search_agent",
       tools=[analyze_text, get_current_datetime, mcp_toolset],
       ...
   )
   
   root_agent = Agent(
       name="test",
       sub_agents=[reservation_agent, search_agent],
       ...
   )
   2. Run evaluation using Gen AI Eval SDK:
   agent_dataset_with_inference = client.evals.run_inference(
       agent=AGENT_RESOURCE,
       src=agent_dataset,
   )
   3. The evaluation fails with a parsing error.

## Expected Behavior

The SDK should successfully run inference on the deployed Multi-Agent and collect 
`intermediate_events` and `response` for evaluation.

## Actual Behavior

The SDK returns an error:
{
  "error": "Failed to parse agent run response [...] to intermediate events and final response: 'text'"
}## Root Cause (from Agent Engine logs)

The actual error in Agent Engine logs shows an asyncio context issue:


### Relevant log output

```shell
{"error": "Failed to parse agent run response [{'model_version': 'gemini-2.5-flash', 'content': {'parts': [{'function_call': {'id': 'adk-946f20ca-e878-4870-8b11-216eeadb4fec', 'args': {'agent_name': 'search_agent'}, 'name': 'transfer_to_agent'}, 'thought_signature': 'Cp4HAY89a1_kpY2YcICNaaeVDSipHZ5pI2OCglzkjVDUcO-pvOVTjIAHXZooc99UnWWf8DQG_qRgh5BOePI8u08PT8y54U5uPLUhJUbrWt4zVxsOJCUW9_XgIJCONckbqj09olhLwQzX_lGATHRjmFqY9r2IAdI6HgzuUvNGR_Ol7mVC3rgHO83WGkDYhhF85_fl_zjH1wt9ju37r2OMEr-wJOOOWs0TZk8ELgxLqtdZi4tubuCOw2gbBKq3x-Pxp5jONSiSKKuvDR-SR1Dsaogvt-yr2Lqu2X3a1YV-4CnsCbfDVh40OTzCx-mrSURYvuQNdpNO08iIiUXoBxqc5NG5QBjzvXhYJrixVSly4aeI1HnYWnXXGr0MNx9fSL7h2uakRwsd1mCffYWAt3zSWP3bw1W44c7zb4AlwnBb27s5yeEiHYFgyjG0PT9S4JRBgseaQbewd9wktHEtGY461kYuhgNUEYF9-MvU0c2EsJIX7rd4cxT1KnOsRk-coBAEbwX450CWkKzyfE3t0_zoArG3uJm4WU4nL7Xfj2d_tmXvX-18xS4O6P5FtDmhpwxwPB1AW1vbi6xHx6XSi9hKcTHFki_R78ABIvBfXjox4rkGn12BI6sbPRMMtlUQoNNo82UinlLlCKxhwI46sQhCNb3T_i4BeXb43rdxp7HTg6iDY7GfwTcqXSrx6gXTvbl6jR9qAFlHvvmdgU9g26GSMltA_PfouuAMh0ku7jVdmxyE9e_1ToTp3ByGVhvJQhOJtuupDkmtBzbxlE5cffdh_TOwOOItxIMAtN3h2gKdvUlUT33VXiZnxDO89ezOGqMD4ILljDtUg6ikdMb4AOSDVyl_iAFG3Z98yO7A3dq9G5_GjyBFFRh9xmSjlNGGt1oBCYbuKYDdYC7gE9KEOA5f3rJ6hs0V1k8aqo9FPtYresNVZX5jxLa_K_u6TLgsjBptVw7BC72DcFR7f0J78YwuwjKGn2ZynfwnNYq099pzWSyq9GurMmA7gxl2MunxlBnRO4nODjFAi30Mp3IwKBdgeeqm0vmmduky5qEw8iBgFPIWM4hBqc_k380UOcUXoRSEK4Oi5Aqlcxfkyaz0xD8KxS9iPaiu0Dom0pkIH7Jh4vywIuA7KWU9_hNP94RvaM19CzcwefXscKFi17O9nkCHi5_QCeWnqXWdW5YPhvbynUx3KCGDq8KSgSLWg5E5xoH62aiTIshKALoNx_QHHcAgWZI='}], 'role': 'model'}, 'finish_reason': 'STOP', 'usage_metadata': {'candidates_token_count': 11, 'candidates_tokens_details': [{'modality': 'TEXT', 'token_count': 11}], 'prompt_token_count': 1215, 'prompt_tokens_details': [{'modality': 'TEXT', 'token_count': 1215}], 'thoughts_token_count': 215, 'total_token_count': 1441, 'traffic_type': 'ON_DEMAND'}, 'avg_logprobs': -4.73370985551314, 'invocation_id': 'e-5814a0db-4e23-4ab3-85db-74fa2f0224f5', 'author': 'catchtable', 'actions': {'state_delta': {}, 'artifact_delta': {}, 'requested_auth_configs': {}, 'requested_tool_confirmations': {}}, 'long_running_tool_ids': [], 'id': 'ea958c9f-339d-47e0-a5a3-0ca83f666a3f', 'timestamp': 1767945596.923423}, {'content': {'parts': [{'function_response': {'id': 'adk-946f20ca-e878-4870-8b11-216eeadb4fec', 'name': 'transfer_to_agent', 'response': {'result': None}}}], 'role': 'user'}, 'invocation_id': 'e-5814a0db-4e23-4ab3-85db-74fa2f0224f5', 'author': 'catchtable', 'actions': {'state_delta': {}, 'artifact_delta': {}, 'transfer_to_agent': 'search_agent', 'requested_auth_configs': {}, 'requested_tool_confirmations': {}}, 'id': '806afc44-3b0b-431f-be3d-4a1bfc0ac8e0', 'timestamp': 1767945598.723608}] to intermediate events and final response: 'text'"}
```

### Code of Conduct

- [x] I agree to follow this project's Code of Conduct

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: Gen AI Eval SDK run_inference() fails when evaluating Multi-Agent #2578

File Name

What happened?

Summary

Environment

Steps to Reproduce

Expected Behavior

Actual Behavior

Relevant log output

Code of Conduct

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Bug]: Gen AI Eval SDK run_inference() fails when evaluating Multi-Agent #2578

Description

File Name

What happened?

Summary

Environment

Steps to Reproduce

Expected Behavior

Actual Behavior

Relevant log output

Code of Conduct

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions