Unexpected Behavior in Multi-Agent System using OpenAI Agent SDK #1430

@dauvannam1804

Description


Please read this first

  • Have you read the docs? Agents SDK docs
  • Have you searched for related issues? Yes, I have checked related discussions, including Issue #256.

Summary

I’m building a multi-agent system using the OpenAI Agent SDK based on the multi-agent portfolio collaboration example.

My setup includes:

  • Orchestrator Agent: routes user questions to the appropriate agent.
  • Sales Agent: searches products or compares two products.
  • Order Agent: creates new orders or checks order status.

The Sales Agent and Order Agent use MCP tools exposed via @mcp.tool():

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("store")  # server name is illustrative

@mcp.tool()
def search_products(query: str) -> list:
    ...

@mcp.tool()
def create_order(user_name: str, address: str, product_id: int, amount: int) -> dict:
    ...

@mcp.tool()
def check_order_status(order_id: int) -> dict:
    ...
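
For context, this is roughly how the orchestrator and the two sub-agents are wired in each approach (a simplified sketch: the prompt constants, the MCP server path, and the Gemini model setup are placeholders, and my actual code follows the portfolio collaboration example):

from agents import Agent
from agents.mcp import MCPServerStdio

# MCP server exposing search_products / create_order / check_order_status
store_mcp = MCPServerStdio(params={"command": "python", "args": ["mcp_server.py"]})

sales_agent = Agent(
    name="sales_agent",
    instructions=SALES_PROMPT,   # sales prompt shown further below
    mcp_servers=[store_mcp],
)
order_agent = Agent(
    name="order_agent",
    instructions=ORDER_PROMPT,   # order prompt shown further below
    mcp_servers=[store_mcp],
)

# Approach 1: "agent as tool" -- the orchestrator calls the sub-agents as tools
orchestrator = Agent(
    name="orchestrator",
    instructions=ORCHESTRATOR_PROMPT,
    tools=[
        sales_agent.as_tool(
            tool_name="sales_agent",
            tool_description="Find a product or compare products",
        ),
        order_agent.as_tool(
            tool_name="order_agent",
            tool_description="Place a new order or check an existing one",
        ),
    ],
)

# Approach 2: "handoff" -- the orchestrator transfers the conversation instead
orchestrator_with_handoffs = Agent(
    name="orchestrator",
    instructions=ORCHESTRATOR_PROMPT,
    handoffs=[sales_agent, order_agent],
)

(The MCP server is connected with `async with store_mcp:` before running, and each Agent also receives `model=...` pointing at Gemini; both are omitted here for brevity.)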

Problem Description

When testing my system:

  1. Using the "agent as tool" approach → I encounter the following issue:
     Result: (error screenshot attached)
  2. Using "handoff" between agents → I get a different error:
     Result: (error screenshot attached)

I’m trying to determine whether the issue is caused by:

  • My prompt design
    • orchestrator

      **Role:** E-commerce routing agent.
      
      **Task:** Route user queries to `sales_agent` or `order_agent`.
      
      **Routing Logic:**
      - **Sales queries (products, search):** -> `sales_agent`: Find a product or compare products
      - **Order queries (placing, status):** -> `order_agent`: Place a new order or check an existing one.
      
      **Rules:**
      - Never answer directly.
      - If intent is unclear, state inability to assist.
      
      **Tools:**
      - `sales_agent(query: str)`
      - `order_agent(query: str)`
    • order agent

      **Role:** Order management assistant.
      
      **Task:** Help users place and check orders.
      
      **Workflow:**
      1. **Determine Intent:** Place a new order or check an existing one.
      2. **Place Order:**
         - Collect `user_name`, `address`, `product_id`, and `amount`.
         - If info is missing, ask for it.
         - Use `create_order` tool when all info is present.
         - Present `order_id` and `status` to the user.
      3. **Check Order:**
         - Ask for `order_id`.
         - Use `check_order_status` tool.
         - Present order status to the user.
      
      **Rules:**
      - Only call `create_order` with all required information.
      
      **Tools:**
      - `create_order(user_name: str, address: str, product_id: int, amount: int) -> dict`
      - `check_order_status(order_id: int) -> dict`
    • sales agent

      **Role:** E-commerce sales assistant.
      
      **Task:** Help users find and compare products.
      
      **Workflow:**
      1. **Determine Intent:** Find a product or compare products.
      2. **Find Product:**
         - Identify the category: `Laptop`, `Smartphone`, `Tablet`, `Smartwatch`, `Accessory`.
         - If ambiguous, ask for clarification.
         - Use `search_products` with the exact category name.
         - Present `name` and `price` of the products.
      3. **Compare Products:**
         - Provide a summary of key differences based on available information.
      
      **Rules:**
      - Only use `search_products` with a valid category.
      - Do not use `search_products` for comparison questions.
      
      **Tools:**
      - `search_products(query: str)`: Query must be one of ["Laptop", "Smartphone", "Tablet", "Smartwatch", "Accessory"].
  • The model I’m using (gemini-2.5-flash-lite); see the connection sketch after this list.
  • Or a limitation/bug in the Agent SDK’s current handling of multi-agent handoffs.
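
For reference, the model is attached through an OpenAI-compatible client, roughly as below (a simplified sketch; the environment variable name is illustrative, and the base URL is Gemini's standard OpenAI-compatibility endpoint):

import os

from openai import AsyncOpenAI
from agents import OpenAIChatCompletionsModel, set_tracing_disabled

# Gemini exposed through its OpenAI-compatible endpoint
gemini_client = AsyncOpenAI(
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key=os.environ["GEMINI_API_KEY"],  # illustrative variable name
)
set_tracing_disabled(True)  # no OpenAI key available for trace export

gemini_model = OpenAIChatCompletionsModel(
    model="gemini-2.5-flash-lite",
    openai_client=gemini_client,
)

# Every Agent(...) in the sketch above is created with model=gemini_model.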

Expected Behavior

Both approaches should route tasks and invoke MCP tools correctly without unexpected failures or missing context.

Environment

  • OpenAI Agent SDK version: 0.2.3
  • Python version: 3.10.18
  • OS: Ubuntu 22.04
  • Model used: gemini-2.5-flash-lite

Additional Context

I can share more if needed for debugging.
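
For debugging, I run test queries roughly like this and inspect the generated run items to see which agent produced the final answer and which tools were called (simplified; orchestrator and store_mcp are from the sketch above, and the query string is just an example):

import asyncio

from agents import Runner

async def main():
    async with store_mcp:  # connect the MCP server for the duration of the run
        result = await Runner.run(orchestrator, "I want to order product 3, amount 2")

    # Which agent actually produced the final answer (expected: order_agent)
    print("last agent:", result.last_agent.name)

    # Every item generated during the run: handoffs, tool calls, messages, ...
    for item in result.new_items:
        print(type(item).__name__)

    print("final output:", result.final_output)

asyncio.run(main())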


Question

  • How can I verify that the orchestrator is correctly handing off tasks to the appropriate agents? (See the logging sketch after this list for what I plan to look at.)
  • How can I ensure that the agents are correctly invoking the intended MCP tools during handoff?
  • Is the issue caused by prompt design or model choice?
  • Are there known limitations or best practices for using handoffs in multi-agent setups with MCP tools?
  • Should "agent as tool" behave differently from "handoff" in this scenario, and if so, how?
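
For questions 1 and 2, I assume the SDK's verbose logging is the intended way to watch routing and tool-call decisions; this is the minimal sketch I have in mind:

from agents import enable_verbose_stdout_logging

# Print the SDK's debug logs (model requests, tool calls, handoffs) to stdout
enable_verbose_stdout_logging()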

Appreciate any help or guidance!
