[Feature] Multi Turn Trajectory Aware Agents #9243

@BowerJames

What feature would you like to see?

# Feature

I think DSPy is incredible. The biggest issue I face when using it is that many of the agents I create need to be multi-stage, need to be aware of past turn trajectories, and need to make optimal use of providers' cached-token pricing.

It is possible that this request is not in line with the vision of DSPy, but in case it is, I think it would really bring DSPy to the next level if one could easily create a module that follows the traditional agentic loop that most models are trained to follow.

Motivation and Example

I will explain the motivation by imagining that one wanted to create a coding agent using DSPy. Let's say all the relevant tooling is set up already. In this scenario I want to make a coding planner agent that is provided with the AGENTS.md file as a system input and is then sent the task, current plan and user message as the turn messages. It will then output an updated plan and a response to the user.

Importantly, the user may want to provide feedback so that the agent can update the plan without having to reread all the files it read on the first turn, whilst preserving the LLM token cache for cheaper inference.

I am envisioning something like the following:

import dspy

TOOLS: list[dspy.Tool] = ... # This would be a list of tools that the agent can use.

class PlannerSignature(dspy.Signature):
    """
    Create and/or update a plan for a given task.
    """
    # System message fields
    # This introduces the concept of a system input.
    # These are inputs that are passed in to the module but are only injected into the system message, as tools are.
    # This means they would not be expected to be included in the history.
    # This would be very useful for token management in generalised agents.
    # An AGENTS.md file is, I think, a prime example of this.
    agents_md: str = dspy.InputField(description="The markdown description of how to interact with the codebase.", location="system") 
    tools: list[dspy.Tool] = dspy.InputField(location="system")

    # User message Input fields
    task: str = dspy.InputField(description="The task to create a plan for")
    current_plan: str | None = dspy.InputField(description="The current plan for the task", default=None)
    user_message: str = dspy.InputField(description="A message from the user. This could be a request for a change to the plan, or a request for more information.")
    
    # Output fields
    tool_calls: list[dspy.ToolCall] = dspy.OutputField()
    response: str = dspy.OutputField(description="The response to the user's message.")
    updated_plan: str = dspy.OutputField(description="The updated plan for the task")

    # History fields
    # The history would be a list of objects with the following fields:
    # - task
    # - current_plan
    # - user_message
    # - tool_calls
    # - response
    # - updated_plan
    history: dspy.History = dspy.InputField()

planner_agent = dspy.MultiTurnAgent(PlannerSignature, tools=TOOLS)

### TURN 1 ###
task = "Update the README.md to include an overview of module.py"
current_plan = None
agents_md = ... # This would be the contents of the AGENTS.md file.
user_message = "Can you create the plan for this task?"
history = []

response = planner_agent(task=task, current_plan=current_plan, user_message=user_message, agents_md=agents_md, history=history)
history.append(
    {
        "task": task,
        "current_plan": current_plan,
        "user_message": user_message,
        "tool_calls": response.tool_calls,
        "response": response.response,
        "updated_plan": response.updated_plan,
    }
)

# response.tool_calls -> [
#   dspy.ToolCall(tool_name="read_file", args={"file_path": "README.md"}, result="... contents of README.md ..."),
#   dspy.ToolCall(tool_name="read_file", args={"file_path": "module.py"}, result="... contents of module.py ..."),
#]
# response.response -> "Sure, I have created a plan for the task."
# response.updated_plan -> "... initial plan ..."

### TURN 2 ###

current_plan = response.updated_plan
user_message = "Actually can you add details on the signature found in signature.py"

response = planner_agent(task=task, current_plan=current_plan, user_message=user_message, agents_md=agents_md, history=history)
history.append(
    {
        "task": task,
        "current_plan": current_plan,
        "user_message": user_message,
        "tool_calls": response.tool_calls,
        "response": response.response,
        "updated_plan": response.updated_plan,
    }
)

# response.tool_calls -> [
#   dspy.ToolCall(tool_name="read_file", args={"file_path": "signature.py"}, result="... contents of signature.py ..."),
#]
# response.response -> "Sure, I have updated the plan to include details on the signature found in signature.py"
# response.updated_plan -> "... updated plan ..."

### Native Tool Calling ###

{
    "tools": TOOLS,
    "messages": [
        {
            "role": "system",
            "content": (
                "... custom dspy instructions here ...\n\n"
                "### The markdown description of how to interact with the codebase. ###\n"
                "... agents_md ..."
            )
        },
        {
            "role": "user",
            "content": (
                "### task ###"
                "\nUpdate the README.md to include an overview of module.py"
                "\n\n### current_plan ###"
                "\nnone"
                "\n\n### user_message ###"
                "\nCan you create the plan for this task?"
            )
        },
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {
                        "name": "read_file",
                        "arguments": "{\"file_path\": \"README.md\"}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "tool_call_id": "call_1",
            "content": "... contents of README.md ..."
        },
        {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_2",
                    "type": "function",
                    "function": {
                        "name": "read_file",
                        "arguments": "{\"file_path\": \"module.py\"}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "tool_call_id": "call_2",
            "content": "... contents of module.py ..."
        },
        {
            "role": "assistant",
            "content": (
                "### updated_plan ###"
                "\n... initial plan ..."
                "\n\n### response ###"
                "\nSure, I have created a plan for the task."
            )
        },
        {
            "role": "user",
            "content": (
                "### task ###"
                "\nUpdate the README.md to include an overview of module.py"
                "\n\n### current_plan ###"
                "\n... initial plan ..."
                "\n\n### user_message ###"
                "\nActually can you add details on the signature found in signature.py"
            )
        },
    ]
}

### No Native Tool Calling ###

{
    "messages": [
        {
            "role": "system",
            "content": (
                "... custom dspy instructions here with tool info ...\n\n"
                "### The markdown description of how to interact with the codebase. ###\n"
                "... agents_md ..."
            )
        },
        {
            "role": "user",
            "content": (
                "### task ###"
                "\nUpdate the README.md to include an overview of module.py"
                "\n\n### current_plan ###"
                "\nnone"
                "\n\n### user_message ###"
                "\nCan you create the plan for this task?"
            )
        },
        {
            "role": "assistant",
            "content": (
                "### next_tool_name ###"
                "\nread_file"
                "\n\n### next_tool_args ###"
                "\n{ \"file_path\": \"README.md\" }"
            ),
        },
        {
            "role": "user",
            "content": (
                "### tool_result ###"
                "\n... contents of README.md ..."
            )
        },
        {
            "role": "assistant",
            "content": (
                "### next_tool_name ###"
                "\nread_file"
                "\n\n### next_tool_args ###"
                "\n{ \"file_path\": \"module.py\" }"
            ),
        },
        {
            "role": "user",
            "content": (
                "### tool_result ###"
                "\n... contents of module.py ..."
            )
        },
        {
            "role": "assistant",
            "content": (
                "### updated_plan ###"
                "\n... initial plan ..."
                "\n\n### response ###"
                "\nSure, I have created a plan for the task."
            )
        },
        {
            "role": "user",
            "content": (
                "### task ###"
                "\nUpdate the README.md to include an overview of module.py"
                "\n\n### current_plan ###"
                "\n... initial plan ..."
                "\n\n### user_message ###"
                "\nActually can you add details on the signature found in signature.py"
            )
        },
    ]
}

There are two core differences in this:

  1. The concept of a system input: an input field that is placed only once, in the system message, since it is likely to remain constant for the length of the session and you don't want to keep repeating its tokens for no reason.
  2. The history is well integrated into the agent module, so that multi-turn interactions can be had with the agent, the LLM remembers its past actions, and the adapter can format the history in a way that is sensible to the model.
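To make these two points concrete, here is a minimal sketch in plain Python (no DSPy calls) of how an adapter might assemble the message list. Everything in it is hypothetical: `build_messages`, `format_fields`, and the `INPUT_FIELDS` / `OUTPUT_FIELDS` constants are illustrative names, not existing DSPy APIs. For brevity it labels system inputs by field name rather than by description and leaves tool calls out.

```python
from typing import Any

Message = dict[str, str]

# Field names taken from the PlannerSignature sketch in this issue.
INPUT_FIELDS = ("task", "current_plan", "user_message")
OUTPUT_FIELDS = ("updated_plan", "response")


def format_fields(values: dict[str, Any], names: tuple[str, ...]) -> str:
    """Render the named fields with '### name ###' markers."""
    return "\n\n".join(f"### {name} ###\n{values[name]}" for name in names)


def build_messages(
    instructions: str,
    system_inputs: dict[str, Any],
    history: list[dict[str, Any]],
    turn_inputs: dict[str, Any],
) -> list[Message]:
    """Hypothetical adapter step: system inputs once, then one pair per past turn."""
    # System inputs are rendered exactly once, inside the system message.
    system_content = instructions
    if system_inputs:
        system_content += "\n\n" + format_fields(system_inputs, tuple(system_inputs))
    messages: list[Message] = [{"role": "system", "content": system_content}]

    # Each past turn becomes one user message and one assistant message.
    for past in history:
        messages.append({"role": "user", "content": format_fields(past, INPUT_FIELDS)})
        messages.append({"role": "assistant", "content": format_fields(past, OUTPUT_FIELDS)})

    # The current turn is always the final user message.
    messages.append({"role": "user", "content": format_fields(turn_inputs, INPUT_FIELDS)})
    return messages
```

With one past turn in `history`, this yields four messages (system, user, assistant, user), and the `agents_md` text appears only in the first.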

I appreciate that DSPy is designed to work both with inference providers that support native tool calling and with ones that don't. With this in mind, I have added an example of how I see the inputs being formatted for the LLM in the OpenAI API format, both with and without native tool calling.
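As a rough illustration of the two formats, the sketch below renders one executed tool call either as a native assistant/tool message pair or as the text-marker fallback. `render_tool_call` is a hypothetical helper, not a DSPy function. The native branch follows the OpenAI chat completions shape, where the call sits under a `function` key and `arguments` is a JSON string. The `next_tool_name`, `next_tool_args` and `tool_result` markers are the ones proposed in this issue.

```python
import json
from typing import Any


def render_tool_call(
    call_id: str,
    name: str,
    args: dict[str, Any],
    result: str,
    native: bool,
) -> list[dict[str, Any]]:
    """Hypothetical helper: turn one executed tool call into two chat messages."""
    if native:
        # Native tool calling: assistant message with tool_calls, then a tool message.
        return [
            {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": call_id,
                        "type": "function",
                        "function": {"name": name, "arguments": json.dumps(args)},
                    }
                ],
            },
            {"role": "tool", "tool_call_id": call_id, "content": result},
        ]
    # Fallback: the call is plain assistant text and the result comes back as a user message.
    call_text = f"### next_tool_name ###\n{name}\n\n### next_tool_args ###\n{json.dumps(args)}"
    return [
        {"role": "assistant", "content": call_text},
        {"role": "user", "content": f"### tool_result ###\n{result}"},
    ]
```

Keeping both branches behind one function means the rest of the loop, including the history, does not need to know which mode the provider supports.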

Importantly, this can work directly with a history module to enable multi-turn DSPy agent loops. I appreciate there may still be an open question about the transition from the assistant's message being an emitted tool call to it being the DSPy signature response when native tool calling is not available. I welcome any thoughts on the best way to do this whilst maintaining optimal use of token caching.
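One way to reason about the caching concern: provider prompt caches generally match on a shared prompt prefix, so a formatting scheme is cache-friendly when each turn's message list only appends to the previous one. The sketch below is a small check for that property, assuming messages are plain dicts. Both function names are hypothetical and it calls no provider API.

```python
from typing import Any

Message = dict[str, Any]


def shared_prefix_length(previous: list[Message], current: list[Message]) -> int:
    """Count how many leading messages are identical in both lists."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


def is_append_only(previous: list[Message], current: list[Message]) -> bool:
    """True if `current` keeps every message of `previous` unchanged, in order."""
    return (
        len(current) >= len(previous)
        and shared_prefix_length(previous, current) == len(previous)
    )
```

For example, putting `current_plan` into the system message would change the very first message on every turn, so `shared_prefix_length` would return 0 and nothing after it could be reused. Keeping it in the per-turn user message, as proposed above, preserves the prefix.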

There is also an open question about whether there is a natural way to incorporate intermediate reasoning output, as the current dspy.ReAct agent does, that works both with and without native tool calling.

Note

I welcome your feedback. I have ticked "No, I just want to request it." as I am not sure I have a deep enough understanding of DSPy to do this appropriately (although I have tried to do my research to see if there is a nice way to do this with what is currently there), but I am willing to give this a go if it is desirable.

Also, if I have missed anything and the requirements I outline are currently possible in DSPy, I apologise and would appreciate someone outlining how.

Would you like to contribute?

  • Yes, I'd like to help implement this.
  • No, I just want to request it.

Additional Context

No response

Metadata

    Labels

    enhancement: New feature or request
