Issue Description:
Problem: APO's TraceToMessages adapter crashes when processing traces from multi-turn agent rollouts that make multiple sequential LLM calls.
Use Case:
I'm training an agent that uses multiple tool calls across several turns to answer a single query:
@agl.rollout
def trainable_agent(task: Dict[str, Any], prompt_template) -> float:
    messages = [{"role": "system", "content": prompt}, {"role": "user", "content": query}]
    all_tools_called = []
    # Multi-turn loop
    for turn in range(max_turns):
        response = client.chat.completions.create(model=model, messages=messages, tools=tools)
        if not response.choices[0].message.tool_calls:
            break
        # Append assistant + tool messages to conversation
        messages.append(...)  # assistant with tool_calls
        messages.append(...)  # tool results
        all_tools_called.append(tool_name)
    # Calculate reward based on all tools used
    reward = calculate_reward(all_tools_called, ground_truth_tools)
    return reward
Error:
KeyError: 'role'
  File "agentlightning/adapter/messages.py", line 109, in _extract_messages
    role = msg["role"]
Root Cause:
AgentOps captures multiple spans in a single rollout, one per LLM call.
Each subsequent span includes the full conversation history from the previous turns.
TraceToMessages expects a single prompt→completion pair per rollout.
The parser fails on the conversation-history structure in the later spans.
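To make the span shape described above concrete, here is a minimal sketch that simulates it without AgentOps or Agent Lightning. The `Span` dataclass and `record_turns` helper are hypothetical stand-ins invented for illustration; the real span attributes almost certainly look different.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

Message = Dict[str, Any]


@dataclass
class Span:
    """Hypothetical stand-in for one traced LLM call (not the AgentOps schema)."""
    prompt: List[Message]   # messages sent to the model for this call
    completion: Message     # assistant message returned by this call


def record_turns(
    initial: List[Message],
    turns: List[Tuple[Message, Optional[Message]]],
) -> List[Span]:
    """Simulate a multi-turn rollout: one span per LLM call, history growing each turn."""
    history = list(initial)
    spans: List[Span] = []
    for assistant_msg, tool_msg in turns:
        # Each call sees the whole conversation so far, so later spans repeat earlier turns.
        spans.append(Span(prompt=list(history), completion=assistant_msg))
        history.append(assistant_msg)
        if tool_msg is not None:
            history.append(tool_msg)
    return spans


initial = [
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "What is the weather in Paris?"},
]
turns = [
    ({"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]},
     {"role": "tool", "tool_call_id": "call_1", "content": "18C, cloudy"}),
    ({"role": "assistant", "content": "It is 18C and cloudy in Paris."}, None),
]
spans = record_turns(initial, turns)
print([len(s.prompt) for s in spans])
```

The second span's prompt contains the system and user messages plus the first assistant and tool messages, which is the cumulative history a single prompt→completion parser would not expect.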
Current Workaround:
I created a wrapper that catches the error and returns dummy data, but this means APO never sees the actual multi-turn conversation when generating gradients.
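The workaround looks roughly like the sketch below. It is written against a generic callable rather than the real adapter class; `with_fallback`, `adapt`, and `fallback` are hypothetical names, and the actual TraceToMessages interface may differ.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)

Message = Dict[str, Any]


def with_fallback(
    adapt: Callable[[Any], List[Message]],
    fallback: List[Message],
) -> Callable[[Any], List[Message]]:
    """Wrap an adapter-like callable so a KeyError yields placeholder messages."""

    def wrapped(trace: Any) -> List[Message]:
        try:
            return adapt(trace)
        except KeyError as exc:
            # The real conversation is lost here; only the placeholder reaches the caller.
            logger.warning("adapter failed (missing key %s); returning fallback", exc)
            return list(fallback)

    return wrapped
```

This keeps training from crashing, but every rollout that hits the fallback contributes nothing useful to prompt optimization.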
Expected Behavior:
APO should support multi-turn rollouts where multiple LLM calls are traced within a single rollout, allowing gradient generation based on the complete multi-turn interaction.
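One possible shape for this, sketched under assumptions: if every span carries the cumulative history, the last span plus its completion already describes the whole rollout, and the earlier spans only need to be checked for consistency. The span layout here (a dict with `"prompt"` and `"completion"` keys) and the `merge_spans` name are hypothetical and not part of Agent Lightning.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def merge_spans(spans: List[Dict[str, Any]]) -> List[Message]:
    """Collapse cumulative per-call spans into a single conversation.

    Each span is assumed to be {"prompt": [messages...], "completion": message}.
    """
    if not spans:
        return []
    for earlier, later in zip(spans, spans[1:]):
        # The next call's prompt should start with the previous prompt plus its completion.
        expected = earlier["prompt"] + [earlier["completion"]]
        if later["prompt"][: len(expected)] != expected:
            raise ValueError("spans do not form one cumulative conversation")
    last = spans[-1]
    return last["prompt"] + [last["completion"]]
```

The strict equality check is a simplification: real traces may serialize the assistant message differently in the prompt than in the completion, in which case a looser comparison (for example on role and content only) would be needed.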