Description
Is your feature request related to a problem? Please describe.
I need to benchmark OpenAI's Responses API, which is integrated into LlamaStack agentic workflows with tool calling (RAG, MCP, function calling, etc.). The full list of available tools can be found here.
GuideLLM does not currently support OpenAI's Responses API, so we cannot run performance tests for agentic systems built on it. These systems are becoming increasingly important as organizations move beyond simple chat to tool-enabled AI agents.
The Responses API has different request/response structures and multi-step execution patterns that aren't captured by existing completion endpoints.
Describe the solution you'd like
Send Responses API requests with proper format:
The tools array supports different tool types, each with its own structure. For example:
file_search (RAG):
`vector_store_ids` lists the unique identifiers of document collections in LlamaStack's vector database. You get an ID by calling the `/v1/vector_stores` endpoint, which creates a new vector store and returns its ID (for example, `vs_xyz789`). After uploading documents to this vector store, you reference it by this ID in `file_search` tool requests. LlamaStack manages the vector store internally, so it knows where to find the embedded documents.
curl -X POST http://llamastack:8321/v1/openai/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "vllm-inference/llama-32-3b-instruct",
"input": "When did the Bering Land Bridge become a national preserve?",
"tools": [{
"type": "file_search",
"vector_store_ids": ["vs_xyz789"]
}],
"stream": false
}'
MCP (Model Context Protocol):
`server_url` is the network address where your MCP server is running. You get it by deploying an MCP server (as a standalone process, a container, or a Kubernetes pod/service) that exposes an HTTP endpoint for MCP protocol communication.
curl -X POST http://llamastack:8321/v1/openai/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "vllm-inference/llama-32-3b-instruct",
"input": "Search for parks in RI using state code RI",
"tools": [{
"type": "mcp",
"server_url": "http://nps-mcp-server:3005/sse",
"server_label": "National Parks"
}],
"stream": false
}'
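The two curl requests above differ only in the `tools` entry, so a benchmark client could assemble them from small helpers. The following is a minimal sketch, not GuideLLM code: the function names `file_search_tool`, `mcp_tool`, and `build_responses_payload` are hypothetical, and only the JSON field names are taken from the requests shown above.

```python
import json
from typing import Any, Optional


def file_search_tool(vector_store_ids: list[str]) -> dict[str, Any]:
    """Tool entry for RAG over one or more vector stores (hypothetical helper)."""
    return {"type": "file_search", "vector_store_ids": list(vector_store_ids)}


def mcp_tool(server_url: str, server_label: str) -> dict[str, Any]:
    """Tool entry pointing at a running MCP server (hypothetical helper)."""
    return {"type": "mcp", "server_url": server_url, "server_label": server_label}


def build_responses_payload(
    model: str,
    prompt: str,
    tools: Optional[list[dict[str, Any]]] = None,
    stream: bool = False,
) -> dict[str, Any]:
    """Assemble the request body; `tools` is omitted when no tools are given."""
    payload: dict[str, Any] = {"model": model, "input": prompt, "stream": stream}
    if tools:
        payload["tools"] = list(tools)
    return payload


# Same body as the MCP curl example above.
payload = build_responses_payload(
    "vllm-inference/llama-32-3b-instruct",
    "Search for parks in RI using state code RI",
    tools=[mcp_tool("http://nps-mcp-server:3005/sse", "National Parks")],
)
print(json.dumps(payload, indent=2))
```

Keeping the tool entries as plain dictionaries means a new tool type only needs one more helper, without touching the payload builder.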
Parse Responses API responses with multi-item outputs.
Here's the full response we received when testing with the MCP tool:
{
"created_at": 1762867562,
"error": null,
"id": "resp_e848b7b9-615c-4cb3-8286-4d00d099de92",
"model": "vllm-inference/llama-32-3b-instruct",
"object": "response",
"output": [
{
"id": "mcp_list_75ac2d61-39a3-4b6d-bed1-6c7a7ae8e140",
"type": "mcp_list_tools",
"server_label": "National Parks",
"tools": [
{
"input_schema": {
"properties": {
"state_code": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null},
"park_code": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null},
"query": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null},
"limit": {"default": 10, "type": "integer"}
},
"type": "object"
},
"name": "search_parks",
"description": "Search for national parks by state, park code, or query string.\n\nArgs:\n state_code: Two-letter state code (e.g., 'CA', 'NY')\n park_code: Four-letter park code (e.g., 'yell', 'acad')\n query: Search query for park names or descriptions\n limit: Maximum number of results to return (default: 10)\n\nReturns:\n JSON string with park information including name, description, website, and location"
},
{
"input_schema": {
"properties": {
"park_code": {"type": "string"}
},
"required": ["park_code"],
"type": "object"
},
"name": "get_park_alerts",
"description": "Get current alerts for a specific national park.\n\nArgs:\n park_code: Four-letter park code (e.g., 'yell', 'acad', 'grca')\n\nReturns:\n JSON string with current alerts for the park"
},
{
"input_schema": {
"properties": {
"park_code": {"type": "string"},
"limit": {"default": 10, "type": "integer"}
},
"required": ["park_code"],
"type": "object"
},
"name": "get_park_campgrounds",
"description": "Get campground information for a specific national park.\n\nArgs:\n park_code: Four-letter park code (e.g., 'yell', 'acad', 'grca')\n limit: Maximum number of campgrounds to return (default: 10)\n\nReturns:\n JSON string with campground information including location, amenities, and fees"
},
{
"input_schema": {
"properties": {
"park_code": {"type": "string"},
"limit": {"default": 10, "type": "integer"}
},
"required": ["park_code"],
"type": "object"
},
"name": "get_park_events",
"description": "Get upcoming events for a specific national park.\n\nArgs:\n park_code: Four-letter park code (e.g., 'yell', 'acad', 'grca')\n limit: Maximum number of events to return (default: 10)\n\nReturns:\n JSON string with event information including date, time, fee, and description"
},
{
"input_schema": {
"properties": {
"park_code": {"type": "string"},
"limit": {"default": 10, "type": "integer"}
},
"required": ["park_code"],
"type": "object"
},
"name": "get_visitor_centers",
"description": "Get visitor center information for a specific national park.\n\nArgs:\n park_code: Four-letter park code (e.g., 'yell', 'acad', 'grca')\n limit: Maximum number of visitor centers to return (default: 10)\n\nReturns:\n JSON string with visitor center information including location, contact, and operating hours"
}
]
},
{
"id": "fc_14297065-c937-4dca-9131-06155f18fdb6",
"type": "mcp_call",
"arguments": "{\"state_code\": \"RI\"}",
"name": "search_parks",
"server_label": "National Parks",
"error": null,
"output": "{\n \"total\": \"4\",\n \"parks\": [\n {\n \"name\": \"Blackstone River Valley National Historical Park\",\n \"code\": \"blrv\",\n \"description\": \"The Blackstone River powered America's entry into the Age of Industry...\",\n \"website\": \"https://www.nps.gov/blrv/index.htm\",\n \"states\": \"RI,MA\",\n \"designation\": \"National Historical Park\",\n \"latitude\": \"41.8775792792\",\n \"longitude\": \"-71.382433945\"\n },\n {\n \"name\": \"Roger Williams National Memorial\",\n \"code\": \"rowi\",\n \"description\": \"...Banished by the English and saved by the First Peoples, Roger Williams founded Providence here in 1636.\",\n \"website\": \"https://www.nps.gov/rowi/index.htm\",\n \"states\": \"RI\",\n \"designation\": \"National Memorial\",\n \"latitude\": \"41.8298955\",\n \"longitude\": \"-71.41056665\"\n },\n {\n \"name\": \"Touro Synagogue National Historic Site\",\n \"code\": \"tosy\",\n \"description\": \"Touro Synagogue, a building of exquisite beauty and design...\",\n \"website\": \"https://www.nps.gov/tosy/index.htm\",\n \"states\": \"RI\",\n \"designation\": \"National Historic Site\",\n \"latitude\": \"41.4893\",\n \"longitude\": \"-71.3121\"\n },\n {\n \"name\": \"Washington-Rochambeau Revolutionary Route National Historic Trail\",\n \"code\": \"waro\",\n \"description\": \"...\",\n \"website\": \"https://www.nps.gov/waro/index.htm\",\n \"states\": \"MA,RI,CT,NY,NJ,PA,DE,MD,VA,DC\",\n \"designation\": \"National Historic Trail\",\n \"latitude\": \"40.0958204557\",\n \"longitude\": \"-74.8563515109\"\n }\n ]\n}"
},
{
"content": [
{
"text": "There are four national parks in Rhode Island: Blackstone River Valley National Historical Park, Roger Williams National Memorial, Touro Synagogue National Historic Site, and Washington-Rochambeau Revolutionary Route National Historic Trail. You can find more information about each park, including their descriptions, websites, and locations, on the National Park Service website.",
"type": "output_text",
"annotations": []
}
],
"role": "assistant",
"type": "message",
"id": "msg_714a2b1e-c5ec-48cd-bfe2-6afb7522f626",
"status": "completed"
}
],
"parallel_tool_calls": false,
"previous_response_id": null,
"status": "completed",
"temperature": null,
"text": {"format": {"type": "text"}},
"top_p": null,
"tools": [{"type": "mcp", "server_label": "National Parks", "allowed_tools": null}],
"truncation": null,
"usage": {
"input_tokens": 3033,
"output_tokens": 88,
"total_tokens": 3121,
"input_tokens_details": null,
"output_tokens_details": null
}
}
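To illustrate what parsing a multi-item output involves, here is a rough sketch that walks the `output` array, collects the item types in order, and pulls the final assistant text and token usage. The function name `summarize_response` is hypothetical, and `sample` is a heavily trimmed copy of the response above (the tool list and the text are shortened for brevity).

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Collect item types, assistant text, and usage from a Responses API body."""
    items = response.get("output") or []
    text_parts: list[str] = []
    for item in items:
        if item.get("type") != "message":
            continue  # skip mcp_list_tools, mcp_call, and other non-message items
        for part in item.get("content") or []:
            if part.get("type") == "output_text":
                text_parts.append(part.get("text", ""))
    usage = response.get("usage") or {}
    return {
        "item_types": [item.get("type") for item in items],
        "text": "".join(text_parts),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "status": response.get("status"),
    }


# Trimmed version of the MCP response shown above.
sample = {
    "status": "completed",
    "output": [
        {"type": "mcp_list_tools", "tools": [{"name": "search_parks"}]},
        {"type": "mcp_call", "name": "search_parks", "error": None},
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "There are four national parks in Rhode Island"}
            ],
        },
    ],
    "usage": {"input_tokens": 3033, "output_tokens": 88, "total_tokens": 3121},
}

summary = summarize_response(sample)
print(summary["item_types"])  # ['mcp_list_tools', 'mcp_call', 'message']
print(summary["input_tokens"], summary["output_tokens"])  # 3033 88
```

Note that the 3033 input tokens include the tool definitions returned by `mcp_list_tools`, which is one reason token accounting for tool-enabled requests differs from plain chat requests.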
Track Responses-specific metrics:
- Output items count (number of steps in response)
- Tool calls executed (count of `mcp_call`/`file_search_call` items)
- Tools discovered (number of tools from `mcp_list_tools`)
- Response completion status (completed/failed/incomplete rate)
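The metrics listed above could be derived from each response body roughly as follows. This is only a sketch under stated assumptions: `ResponseMetrics`, `measure`, and `status_rates` are hypothetical names, and the set of tool-call item types is limited to the two mentioned in this issue.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any

# Item types counted as executed tool calls (assumption: only the two named above).
TOOL_CALL_TYPES = {"mcp_call", "file_search_call"}


@dataclass
class ResponseMetrics:
    output_items: int
    tool_calls: int
    tools_discovered: int
    status: str


def measure(response: dict[str, Any]) -> ResponseMetrics:
    """Compute per-response metrics from the `output` array and `status` field."""
    items = response.get("output") or []
    tool_calls = sum(1 for item in items if item.get("type") in TOOL_CALL_TYPES)
    tools_discovered = sum(
        len(item.get("tools") or [])
        for item in items
        if item.get("type") == "mcp_list_tools"
    )
    return ResponseMetrics(
        output_items=len(items),
        tool_calls=tool_calls,
        tools_discovered=tools_discovered,
        status=response.get("status") or "unknown",
    )


def status_rates(metrics: list[ResponseMetrics]) -> dict[str, float]:
    """Fraction of responses per completion status across a benchmark run."""
    if not metrics:
        return {}
    counts = Counter(m.status for m in metrics)
    return {status: count / len(metrics) for status, count in counts.items()}


# Illustrative data only: three completed responses and one failed response.
ok = {
    "status": "completed",
    "output": [
        {"type": "mcp_list_tools", "tools": [{"name": "a"}, {"name": "b"}]},
        {"type": "mcp_call"},
        {"type": "message"},
    ],
}
failed = {"status": "failed", "output": []}
run = [measure(r) for r in (ok, ok, ok, failed)]
print(run[0])
print(status_rates(run))  # {'completed': 0.75, 'failed': 0.25}
```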
Provide a CLI similar to the existing one:
# Basic Responses API (no tools)
guidellm benchmark \
--target http://llamastack:8321/v1/openai \
--model vllm-inference/llama-32-3b-instruct \
--request-type responses \
--data questions.jsonl \
--rate-type concurrent \
--rate 128
# With MCP tools
guidellm benchmark \
--request-type responses \
--mcp-server-url http://nps-mcp-server:3005/sse \
--mcp-server-label "National Parks" \
...
# With RAG/file_search
guidellm benchmark \
--request-type responses \
--vector-store-ids vs_abc123,vs_def456 \
...
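As a sketch of how the proposed flags might map onto the `tools` array, the following uses the standard library's `argparse`. It is a standalone illustration and does not reflect how GuideLLM's actual CLI is implemented; the flag names are the ones proposed above, and `build_parser` and `tools_from_args` are hypothetical.

```python
import argparse
from typing import Any


def split_ids(value: str) -> list[str]:
    """Parse a comma-separated list such as 'vs_abc123,vs_def456'."""
    return [part.strip() for part in value.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Only the flags proposed in this issue; everything else is omitted.
    parser = argparse.ArgumentParser(prog="responses-sketch")
    parser.add_argument("--request-type", default="responses")
    parser.add_argument("--mcp-server-url")
    parser.add_argument("--mcp-server-label")
    parser.add_argument("--vector-store-ids", type=split_ids, default=[])
    return parser


def tools_from_args(args: argparse.Namespace) -> list[dict[str, Any]]:
    """Translate parsed flags into Responses API tool entries."""
    tools: list[dict[str, Any]] = []
    if args.mcp_server_url:
        tool: dict[str, Any] = {"type": "mcp", "server_url": args.mcp_server_url}
        if args.mcp_server_label:
            tool["server_label"] = args.mcp_server_label
        tools.append(tool)
    if args.vector_store_ids:
        tools.append({"type": "file_search", "vector_store_ids": args.vector_store_ids})
    return tools


args = build_parser().parse_args(["--vector-store-ids", "vs_abc123,vs_def456"])
print(tools_from_args(args))
# [{'type': 'file_search', 'vector_store_ids': ['vs_abc123', 'vs_def456']}]
```

With neither flag set, the tools list is empty, which corresponds to the basic Responses API case without tools.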
Describe alternatives you've considered
Manually send curl requests and time them - Simple for basic testing but doesn't scale. Can't easily test concurrent requests, generate load profiles, or collect detailed metrics. Not suitable for production performance validation.