diff --git a/README.md b/README.md index 9a36b0401..c87212420 100644 --- a/README.md +++ b/README.md @@ -23,6 +23,7 @@ Unlike many agent frameworks that only track the chat history with LLMs in text, ## 🆕 News +- 📅2025-01-16: TaskWeaver has been enhanced with an experimental role called [Recepta](https://microsoft.github.io/TaskWeaver/blog/reasoning) for its reasoning power.🧠 - 📅2024-12-23: TaskWeaver has been integrated with the [AgentOps](https://microsoft.github.io/TaskWeaver/docs/observability) for better observability and monitoring.🔍 - 📅2024-09-13: We introduce the shared memory to store information that is shared between the roles in TaskWeaver. Please check the [memory](https://microsoft.github.io/TaskWeaver/docs/memory) for more details.🧠 - 📅2024-09-13: We have enhanced the experience feature by allowing static and dynamic experience selection. Please check the [experience](https://microsoft.github.io/TaskWeaver/blog/experience) for more details.📚 @@ -31,7 +32,7 @@ Unlike many agent frameworks that only track the chat history with LLMs in text, - 📅2024-03-28: TaskWeaver now offers all-in-one Docker image, providing a convenient one-stop experience for users. Please check the [docker](https://microsoft.github.io/TaskWeaver/docs/usage/docker) for more details.🐳 - 📅2024-03-27: TaskWeaver now switches to `container` mode by default for code execution. Please check the [code execution](https://microsoft.github.io/TaskWeaver/docs/code_execution) for more details.🐳 - 📅2024-03-07: TaskWeaver now supports configuration of different LLMs for various components, such as the Planner and CodeInterpreter. 
Please check the [multi-llm](https://microsoft.github.io/TaskWeaver/docs/llms/multi-llm) for more details.🔗 -- 📅2024-03-04: TaskWeaver now supports a [container](https://microsoft.github.io/TaskWeaver/docs/code_execution) mode, which provides a more secure environment for code execution.🐳 + @@ -54,7 +55,7 @@ Unlike many agent frameworks that only track the chat history with LLMs in text, - [x] **Code verification** - TaskWeaver is designed to verify the generated code before execution. It can detect potential issues in the generated code and provide suggestions to fix them. - [x] **Easy to use** - TaskWeaver is easy to use with sample plugins, examples and tutorials to help you get started. TaskWeaver offers an open-box experience, allowing users to run it immediately after installation. - [x] **Easy to debug** - TaskWeaver is easy to debug with detailed and transparent logs to help you understand the entire process, including LLM prompts, the code generation, and execution process. -- [x] **Security consideration** - TaskWeaver supports a basic session management to keep different users' data separate. The code execution is separated into different processes to avoid mutal interference. +- [x] **Security consideration** - TaskWeaver supports a basic session management to keep different users' data separate. The code execution is separated into different processes to avoid mutual interference. - [x] **Easy extension** - TaskWeaver is easy to extend to accomplish more complex tasks with multiple agents as the plugins. 
## ✨ Quick Start diff --git a/project/examples/planner_examples/example-planner-2.yaml b/project/examples/planner_examples/example-planner-2.yaml index 5bd4f1733..ba548245d 100644 --- a/project/examples/planner_examples/example-planner-2.yaml +++ b/project/examples/planner_examples/example-planner-2.yaml @@ -11,7 +11,7 @@ rounds: send_from: Planner send_to: User attachment_list: - - type: reasoning + - type: plan_reasoning content: |- The user greets the Planner - type: init_plan @@ -21,4 +21,6 @@ rounds: content: |- 1. Respond to the user's greeting - type: current_plan_step - content: 1. Respond to the user's greeting \ No newline at end of file + content: 1. Respond to the user's greeting + - type: stop + content: Completed \ No newline at end of file diff --git a/project/examples/planner_examples/example-planner-echo.yaml b/project/examples/planner_examples/example-planner-echo.yaml index 06ffd29b8..f94715e74 100644 --- a/project/examples/planner_examples/example-planner-echo.yaml +++ b/project/examples/planner_examples/example-planner-echo.yaml @@ -11,7 +11,7 @@ rounds: send_from: Planner send_to: Echo attachment_list: - - type: reasoning + - type: plan_reasoning content: |- The user wants to echo the input 'Hello World' - type: init_plan @@ -22,6 +22,8 @@ rounds: 1. Ask Echo to echo the user's input, 'Hello World' - type: current_plan_step content: 1. Ask Echo to echo the user's input, 'Hello World' + - type: stop + content: InProcess - message: Hello World send_from: Echo send_to: Planner @@ -30,7 +32,7 @@ rounds: send_from: Planner send_to: User attachment_list: - - type: reasoning + - type: plan_reasoning content: |- The user query is successfully answered - type: init_plan @@ -41,4 +43,5 @@ rounds: 1. Ask Echo to echo the user's input, 'Hello World' - type: current_plan_step content: 1. 
Ask Echo to echo the user's input, 'Hello World' - + - type: stop + content: Completed diff --git a/project/examples/planner_examples/example-planner-recepta.yaml b/project/examples/planner_examples/example-planner-recepta.yaml new file mode 100644 index 000000000..d32574006 --- /dev/null +++ b/project/examples/planner_examples/example-planner-recepta.yaml @@ -0,0 +1,137 @@ +enabled: True +rounds: + - user_query: echo 'Generate Employee Report' + state: created + post_list: + - message: |- + please follow the instructions below to complete the task: + + # Task Instructions + ## Step 1: Extract Employee Data + Extract the following information from the employee summary: + - `Employee ID` + - `Department` + - `Project Code` + - `Role` + - `Manager` + - `Location` + + ### Employee Summary + On April 10, 2025, a review was conducted for an employee with ID `E12345`. The employee is part of the `Development` department and is currently assigned to project code `"X987"`. The role held is `"Software Engineer"` under the supervision of manager `"J. Doe."` The primary work location is `"Remote."` + + ## Step 2: Validate the Role + - Confirm if the role is recognized among the following: Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester. + - Check if the role is one of the recognized roles. + + ## Step 3: Construct SQL Query + Prepare and print the SQL query using the extracted details: + ```sql + SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + FROM EmployeeRecords + WHERE EmployeeID = '{EmployeeID}' + AND Department = '{Department}' + AND ProjectCode = '{ProjectCode}' + AND Role = '{Role}' + AND Manager = '{Manager}' + AND Location = '{Location}' + ORDER BY EmployeeID; + ``` + send_from: User + send_to: Planner + attachment_list: + - message: |- + Thought: I can read and comprehend the employee summary to extract the required information. 
+ Result: the `Employee ID` is `E12345`, the `Department` is `Development`, the `Project Code` is `X987`, the `Role` is `Software Engineer`, the `Manager` is `J. Doe`, and the `Location` is `Remote`. + send_from: Planner + send_to: Recepta + attachment_list: + - type: plan_reasoning + content: |- + The first step is to extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location`. + - type: init_plan + content: |- + 1. extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location` + 2. validate the role + 3. construct and print the SQL query using the extracted details + 4. check the execution result and report the result to the user + - type: plan + content: |- + 1. extract information from the employee summary + 2. validate the role + 3. construct SQL query using the extracted details + 4. report the result to the user + - type: current_plan_step + content: 1. extract information from the employee summary + - type: stop + content: InProcess + - message: Got it! Please go ahead. + send_from: Recepta + send_to: Planner + attachment_list: + - message: |- + Thought: I can verify if the role is one of the recognized roles with my pattern matching capability. + Thought: The recognized roles are `Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester`. + Result: The role is `Software Engineer`, which is recognized. + send_from: Planner + send_to: Recepta + attachment_list: + - type: plan_reasoning + content: |- + I have extracted the information from the employee summary. The next step is to verify if the role is one of the recognized roles. + - type: init_plan + content: |- + 1. extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location` + 2. validate the role + 3. 
construct and print the SQL query using the extracted details + 4. check the execution result and report the result to the user + - type: plan + content: |- + 1. extract information from the employee summary + 2. validate the role + 3. construct SQL query using the extracted details + 4. report the result to the user + - type: current_plan_step + content: 2. validate the role + - type: stop + content: InProcess + - message: Got it! Please go ahead. + send_from: Recepta + send_to: Planner + attachment_list: + - message: |- + The role is `Software Engineer`, which is recognized. + + Here is the SQL query using the extracted details: + ```sql + SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + FROM EmployeeRecords + WHERE EmployeeID = 'E12345' + AND Department = 'Development' + AND ProjectCode = 'X987' + AND Role = 'Software Engineer' + AND Manager = 'J. Doe' + AND Location = 'Remote' + ORDER BY EmployeeID; + ``` + send_from: Planner + send_to: User + attachment_list: + - type: plan_reasoning + content: |- + I can prepare and print the SQL query using the extracted details to the User. + - type: init_plan + content: |- + 1. extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location` + 2. validate the role + 3. construct and print the SQL query using the extracted details + 4. check the execution result and report the result to the user + - type: plan + content: |- + 1. extract information from the employee summary + 2. validate the role + 3. construct SQL query using the extracted details + 4. report the result to the user + - type: current_plan_step + content: 3. 
construct and print the SQL query using the extracted details and report the result to the user + - type: stop + content: Completed \ No newline at end of file diff --git a/project/examples/planner_examples/example-planner.yaml b/project/examples/planner_examples/example-planner.yaml index 68f2cc9d6..1cd2fde0d 100644 --- a/project/examples/planner_examples/example-planner.yaml +++ b/project/examples/planner_examples/example-planner.yaml @@ -11,7 +11,7 @@ rounds: send_from: Planner send_to: CodeInterpreter attachment_list: - - type: reasoning + - type: plan_reasoning content: |- The user wants to count the rows of the data file /home/data.csv. The first step is to load the data file and count the rows of the loaded data. - type: init_plan @@ -25,6 +25,8 @@ rounds: 2. Check the execution result and report the result to the user if it is correct - type: current_plan_step content: 1. Instruct CodeInterpreter to load the data file and count the rows of the loaded data + - type: stop + content: InProcess - message: Load the data file /home/data.csv successfully and there are 100 rows in the data file send_from: CodeInterpreter send_to: Planner @@ -33,7 +35,7 @@ rounds: send_from: Planner send_to: User attachment_list: - - type: reasoning + - type: plan_reasoning content: |- The data file /home/data.csv is loaded and there are 100 rows in the data file The execution result is correct @@ -49,3 +51,5 @@ rounds: 2. Check the execution result and report the result to the user if it is correct - type: current_plan_step content: 2. 
report the result to the user + - type: stop + content: Completed diff --git a/taskweaver/ces/runtime/executor.py b/taskweaver/ces/runtime/executor.py index e55847275..6359c3ba6 100644 --- a/taskweaver/ces/runtime/executor.py +++ b/taskweaver/ces/runtime/executor.py @@ -173,8 +173,9 @@ def load_lib(self, local_ns: Dict[str, Any]): # customize pandas display options pd.set_option("display.html.table_schema", False) pd.set_option("display.notebook_repr_html", False) - pd.set_option("display.max_rows", 4) + pd.set_option("display.max_rows", 5) pd.set_option("display.expand_frame_repr", False) + pd.set_option("display.max_colwidth", None) local_ns["pd"] = pd except ImportError: self.log( diff --git a/taskweaver/ext_role/recepta/__init__.py b/taskweaver/ext_role/recepta/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/taskweaver/ext_role/recepta/recepta.py b/taskweaver/ext_role/recepta/recepta.py new file mode 100644 index 000000000..1166db66e --- /dev/null +++ b/taskweaver/ext_role/recepta/recepta.py @@ -0,0 +1,44 @@ +from injector import inject + +from taskweaver.logging import TelemetryLogger +from taskweaver.memory import Memory, Post +from taskweaver.module.event_emitter import SessionEventEmitter +from taskweaver.module.tracing import Tracing +from taskweaver.role import Role +from taskweaver.role.role import RoleConfig, RoleEntry + + +class ReceptaConfig(RoleConfig): + def _configure(self): + self.decorator = self._get_str("decorator", "") + + +class Recepta(Role): + @inject + def __init__( + self, + config: ReceptaConfig, + logger: TelemetryLogger, + tracing: Tracing, + event_emitter: SessionEventEmitter, + role_entry: RoleEntry, + ): + super().__init__(config, logger, tracing, event_emitter, role_entry) + + def reply(self, memory: Memory, **kwargs: ...) 
-> Post: + rounds = memory.get_role_rounds( + role=self.alias, + include_failure_rounds=False, + ) + + # obtain the query from the last round + last_post = rounds[-1].post_list[-1] + + post_proxy = self.event_emitter.create_post_proxy(self.alias) + + post_proxy.update_send_to(last_post.send_from) + post_proxy.update_message( + "Got it. Now, please go ahead!", + ) + + return post_proxy.end() diff --git a/taskweaver/ext_role/recepta/recepta.role.yaml b/taskweaver/ext_role/recepta/recepta.role.yaml new file mode 100644 index 000000000..6eff24213 --- /dev/null +++ b/taskweaver/ext_role/recepta/recepta.role.yaml @@ -0,0 +1,7 @@ +alias: Recepta +module: taskweaver.ext_role.recepta.recepta.Recepta +intro : |- + - Recepta is responsible for helping the Planner to record the reasoning process. + - When the Planner is reasoning and does not involve any Workers, it must send a message to Recepta to record the reasoning process. + - When Planner needs to reason in the middle of the task and is not ready to talk to the User, it can send a message to `Recepta` to record the reasoning process and result. The message should be formatted as "Thought 1: reasoning...\nThought 2: reasoning...\n...Result: result...". + - The reasoning result should be insights or conclusions derived for the task step, NOT a plan or a set of instructions for further action. 
diff --git a/taskweaver/memory/attachment.py b/taskweaver/memory/attachment.py index f95566bff..afc1346d4 100644 --- a/taskweaver/memory/attachment.py +++ b/taskweaver/memory/attachment.py @@ -12,7 +12,8 @@ class AttachmentType(Enum): init_plan = "init_plan" plan = "plan" current_plan_step = "current_plan_step" - reasoning = "reasoning" + plan_reasoning = "plan_reasoning" + stop = "stop" # CodeInterpreter - generate code thought = "thought" diff --git a/taskweaver/planner/planner_prompt.yaml b/taskweaver/planner/planner_prompt.yaml index b4703a74c..1b87b82c0 100644 --- a/taskweaver/planner/planner_prompt.yaml +++ b/taskweaver/planner/planner_prompt.yaml @@ -1,4 +1,4 @@ -version: 0.4 +version: 0.5 instruction_template: |- You are the Planner who can coordinate Workers to finish the user task. @@ -20,16 +20,28 @@ instruction_template: |- {worker_intro} ## Planner Character - - Planner's main job is to make planning and to instruct Workers to resolve the request from the User. - - Planner should independently handle basic tasks such as information extraction from text files, using its reasoning and comprehension skills before considering the involvement of Workers. - - Planner can assign different subtasks to different Workers, and each subtask should be assigned to only one Worker. - - Planner must reject the User's request if it contains potential security risks or illegal activities. + - Planner's main job is to make planning and collaborate with Workers to resolve the request from the User. + - Planner has the following cognitive skills: + + Reasoning: Analyzes user requests, worker responses, and environmental context to solve problems. + + Reading and Comprehension: Understands and interprets unstructured or structured information accurately. + + Pattern Recognition/Matching: Identifies and utilizes patterns in information. + + Comparison: Evaluates and contrasts information to draw conclusions. 
+ + Adaptability: Adjusts plans and strategies accordingly based on new information or observations. + + Communication: Effectively conveys and receives information. + - Planner should use its skills before considering the involvement of Workers for direct engagement and immediate results. + - Planner can assign subtasks to Workers when the task requires specific skills beyond the Planner's capabilities, and each subtask should be assigned to only one Worker. - Planner should ask the User to provide additional information critical for problem solving, but only after trying the best. - - Planner can talk to the User and Workers by specifying the `send_to` field in the response, but MUST NOT talk to the Planner itself. - Planner should refine the plan according to its observations from the replies of the Workers or the new requests of User. - - Planner needs to inform Workers on the User's request, the current step, and necessary information to complete the task. - Planner must thoroughly review Worker's response and provide feedback to the Worker if the response is incorrect or incomplete. - Planner can ignore the permission or file access issues since Workers are powerful and can handle them. + - Planner must reject the User's request if it contains potential security risks or illegal activities. + + ## Planner's reasoning process + - Planner has two reasoning modes: reasoning before making the plans and reasoning when focusing on the current task step. + - Planner should reason before making the plans, explaining why the Planner makes the plan in this way. + - When Planner is focused on the current task step, Planner has two options: + 1. Planner sends a message to a Worker to execute the task step. + 2. Planner uses its own skills to complete the task step, which is recommended when the task step is simple. ## Planner's planning process You need to make a step-by-step plan to complete the User's task. 
The planning process includes 2 phases: `init_plan` and `plan`. @@ -51,6 +63,18 @@ instruction_template: |- - Planner should not merge steps with interactive dependency. - The final plan must not contain dependency annotations. + ## Planner's communication process + - Planner should communicate with the User and Workers by specifying the `send_to` field in the response. + - Planner should not talk to itself. + - Planner needs to inform Workers on the User's request, the current step, and necessary information to complete the task. + - Planner should provide the reason before talking to the User in the response: + + Completed: The task is completed successfully. + + Clarification: The User's request is unclear or ambiguous and requires clarification. + + AdditionalInformation: The User's request is incomplete or missing critical information and requires additional information. + + SecurityRisks: The User's request contains potential security risks or illegal activities and requires rejection. + + TaskFailure: The task fails after a few attempts and requires the User's confirmation to proceed. + + ### Examples of planning process [Example 1] User: count rows for ./data.csv @@ -101,6 +125,7 @@ instruction_template: |- ## Planner's useful tips - When the request involves loading a file or pulling a table from db, Planner should always set the first subtask to reading the content to understand the structure or schema of the data. - When the request involves text analysis, Planner should always set the first subtask to read and print the text content to understand its content structure. + - When the request involves reading instructions for task execution, Planner should always update the plan to the steps and sub-steps in the instructions and then follow the updated plan to execute necessary actions. 
## Planner's response format - Planner must strictly format the response into the following JSON object: @@ -119,7 +144,7 @@ response_json_schema: |- "response": { "type": "object", "properties": { - "reasoning": { + "plan_reasoning": { "type": "string", "description": "The reasoning of the Planner's decision. It should include the analysis of the User's request, the Workers' responses, and the current environment context." }, @@ -135,6 +160,11 @@ response_json_schema: |- "type": "string", "description": "The current step Planner is executing." }, + "stop": { + "type": "string", + "description": "The stop reason when the Planner needs to talk to the User. Set it to 'InProcess' if the Planner is not talking to the User.", + "enum": ["InProcess", "Completed", "Clarification", "AdditionalInformation", "SecurityRisks", "TaskFailure"] + }, "send_to": { "type": "string", "description": "The name of character (User or name of the Worker) that Planner wants to speak to." @@ -145,10 +175,11 @@ response_json_schema: |- } }, "required": [ - "reasoning", + "plan_reasoning", "init_plan", "plan", "current_plan_step", + "stop", "send_to", "message" ], diff --git a/website/blog/reasoning.md b/website/blog/reasoning.md new file mode 100644 index 000000000..4ceeb6b0f --- /dev/null +++ b/website/blog/reasoning.md @@ -0,0 +1,480 @@ +# What makes a good agent reasoning framework? + +An agent can listen to the user's request, understand the context, make plans, take actions, observe the results, and respond to the user. Its behavior is driven by the reasoning process, which is the core of the agent's intelligence. + + + +There are many techniques developed to build an agent reasoning framework. For example, Chain-of-Thought (CoT) is one of the most popular reasoning approaches that uses a chain of thought to generate next-step actions. Another technique is ReAct, where the agent reflects on its action results to adjust its future actions. 
In TaskWeaver, one of the key reasoning techniques is task decomposition and tracking, where the agent breaks down a task into subtasks and keeps track of the progress of each subtask. The following snippet shows an example of task decomposition and tracking in TaskWeaver: + +```commandline +├─► [init_plan] +│ 1. Request historical stock price data for MSFT from the user +│ 2. Request the forecasting model or method to be used for prediction +│ 3. Perform the prediction using the provided data and model +│ 4. Report the prediction results to the user +├─► [plan] +│ 1. Request historical stock price data for MSFT from the user +│ 2. Request the forecasting model or method to be used for prediction +│ 3. Perform the prediction using the provided data and model +│ 4. Report the prediction results to the user +├─► [current_plan_step] 1. Request historical stock price data for MSFT from the user +``` + +The agent lists the steps of the plan and the current step it is working on. While the agent is working on the task, it keeps track of the current step. This is useful especially when the task is complex and requires multiple steps to complete. The agent can refer to the current step to know what to do next. Without task decomposition and tracking, the agent may lose track of the task and fail to complete it. + +TaskWeaver also supports ReAct-like reasoning. This is done by introducing multiple roles in the problem-solving process, especially the Planner and CodeInterpreter roles. The Planner role is responsible for planning the steps of the task, while the CodeInterpreter role is responsible for interpreting the code generated by the Planner. The Planner can reflect on the results of the CodeInterpreter and adjust the plan accordingly. In one round, the Planner and Code Interpreter can work together to complete a subtask with multiple interactions. + +However, there are still situations where the agent reasoning framework can be improved. 
If you have read the discussion above, you may have noticed that CoT is not implemented in TaskWeaver. + +**Why?** + +In this article, we are focused on the reasoning in the Planner. In fact, the Code Interpreter has CoT implemented as shown in the following snippet: + +```commandline +├─► [thought] To fetch historical stock price data for MSFT using yfinance, I will first try to import yfinance and download the data. If yfinance is not installed, I will provide instructions to install it. +├─► [reply_type] python +├─► [reply_content] +│ import yfinance as yf +│ +│ # Fetch historical stock price data for MSFT +│ ticker = 'MSFT' +│ msft_data = yf.download(ticker, start='2020-01-01', end='2025-01-01') +│ +│ msft_data +``` + +For the Planner, there are actually two places in the response schema where CoT can be implemented: before generating the `init_plan` and after the `current_plan_step`, as shown in the previous snippet. The former is to generate the initial plan, while the latter is to use Planner's reasoning capabilities to solve the current subtask. + +The former was not implemented in TaskWeaver for two reasons: (1) to simplify the schema and save the reasoning time/prompt size; (2) the existing `init_plan` and `plan` already represent a thought process, and adding CoT may not bring significant benefits. But as the LLM becomes faster and cheaper, it is possible to add CoT to the Planner to improve the planning quality. + +With reasoning for planning, the output of the Planner looks like this: + +```commandline + Human ▶ predict msft's stock price in the next 7 days + ╭───< Planner > + ├─► [plan_reasoning] The user wants to predict Microsoft's stock price for the next 7 days. This task involves time series forecasting, which requires historical stock price data and a forecasting model. However, the user hasn't provided any data or specified a model to use. I need to ask the user for more information. + ├─► [init_plan] + │ 1. 
Request historical stock price data from the user + │ 2. Ask the user to specify a forecasting model or confirm if a default model should be used + │ 3. Load the historical stock price data + │ 4. Train the forecasting model using the historical data + │ 5. Predict the stock price for the next 7 days + │ 6. Report the prediction results to the user +``` + +It looks good, but it is not the most important part of this article. The latter is more interesting and also more challenging to implement than it looks. + +Let's use an example to motivate the necessity and explain the challenges. The task is described as follows and all the data in it are fake: + +```markdown +# Task Instructions +## Step 1: Extract Employee Data +Extract the following information from the employee summary: +- `Employee ID` +- `Department` +- `Project Code` +- `Role` +- `Manager` +- `Location` + +### Employee Summary +On April 10, 2025, a review was conducted for an employee with ID `E12345`. +The employee is part of the `Development` department and is currently assigned to project code `"X987"`. +The role held is `"Software Engineer"` under the supervision of manager `"J. Doe."` +The primary work location is `"Remote."` + +## Step 2: Validate the Role +- Confirm if the role is recognized among the following: Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester. +- Check if the role is one of the recognized roles. + +## Step 3: Construct SQL Query +Prepare and print the SQL query using the extracted details: +=== +SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location +FROM EmployeeRecords +WHERE EmployeeID = '{EmployeeID}' +AND Department = '{Department}' +AND ProjectCode = '{ProjectCode}' +AND Role = '{Role}' +AND Manager = '{Manager}' +AND Location = '{Location}' +ORDER BY EmployeeID; +=== +``` + +The task has 3 steps: extract employee data, validate the role, and construct an SQL query. This would not be a hard problem for any ReAct based agents. 
Actually, for all the 3 steps, there is no need to call Code Interpreter. The Planner can reason through the task and generate the code directly. However, let's run it in TaskWeaver: + +```commandline + ╭───< Planner > + ├─► [plan_reasoning] The user has provided a task with specific instructions to be followed step by step. The task involves extracting employee data, validating the role, and constructing an SQL query. Each step is dependent on + │ the completion of the previous one, so they must be executed in sequence. + ├─► [init_plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list + │ 3. Construct the SQL query using the extracted details + │ 4. Report the constructed SQL query to the user + ├─► [plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list. + │ 3. Construct the SQL query using the extracted details. + │ 4. Report the constructed SQL query to the user. + ├─► [current_plan_step] 1. Extract the specified employee data from the summary provided. + ├─► [stop] InProcess + ├──● Please extract the following information from the employee summary: Employee ID, Department, Project Code, Role, Manager, and Location. The summary is: On April 10, 2025, a review was conducted for an employee with ID + │ `E12345`. The employee is part of the `Development` department and is currently assigned to project code `"X987"`. The role held is `"Software Engineer"` under the supervision of manager `"J. Doe."` The primary work + │ location is `"Remote."` + ├─► [shared_memory_entry] Add the plan to the shared memory + ╰──● sending message to CodeInterpreter + ╭───< CodeInterpreter > + ├─► [thought] I will extract the required information from the given employee summary using string manipulation techniques. 
+ ├─► [reply_type] python + ├─► [reply_content] + │ employee_summary = "On April 10, 2025, a review was conducted for an employee with ID `E12345`. The employee is part of the `Development` department and is currently assigned to project code \"X987\". The role held is + │ \"Software Engineer\" under the supervision of manager \"J. Doe.\" The primary work location is \"Remote.\"" + │ + │ # Extracting information + │ employee_id = employee_summary.split('ID `')[1].split('`')[0] + │ department = employee_summary.split('`')[3] + │ project_code = employee_summary.split('project code \"')[1].split('\"')[0] + │ role = employee_summary.split('role held is \"')[1].split('\"')[0] + │ manager = employee_summary.split('manager \"')[1].split('\"')[0] + │ location = employee_summary.split('location is \"')[1].split('\"')[0] + │ + │ employee_id, department, project_code, role, manager, location + ├─► [verification] NONE + ├─► [code_error] No code verification is performed. + ├─► [execution_status] SUCCESS + ├─► [execution_result] + │ The execution of the generated python code above has succeeded + │ + │ The result of above Python code after execution is: + │ ('E12345', 'Development', 'X987', 'Software Engineer', 'J. Doe.', 'Remote.') + ├─► [artifact_paths] [] + ├──● The following python code has been executed: + │ ```python + │ employee_summary = "On April 10, 2025, a review was conducted for an employee with ID `E12345`. The employee is part of the `Development` department and is currently assigned to project code \"X987\". The role held is + │ \"Software Engineer\" under the supervision of manager \"J. 
Doe.\" The primary work location is \"Remote.\"" + │ + │ # Extracting information + │ employee_id = employee_summary.split('ID `')[1].split('`')[0] + │ department = employee_summary.split('`')[3] + │ project_code = employee_summary.split('project code \"')[1].split('\"')[0] + │ role = employee_summary.split('role held is \"')[1].split('\"')[0] + │ manager = employee_summary.split('manager \"')[1].split('\"')[0] + │ location = employee_summary.split('location is \"')[1].split('\"')[0] + │ + │ employee_id, department, project_code, role, manager, location + │ ``` + │ + │ + │ The execution of the generated python code above has succeeded + │ + │ The result of above Python code after execution is: + │ ('E12345', 'Development', 'X987', 'Software Engineer', 'J. Doe.', 'Remote.') + ╰──● sending message to Planner + ╭───< Planner > + ├─► [plan_reasoning] The employee data has been successfully extracted from the summary. The next step is to validate the role against the list of recognized roles. + ├─► [init_plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list + │ 3. Construct the SQL query using the extracted details + │ 4. Report the constructed SQL query to the user + ├─► [plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list. + │ 3. Construct the SQL query using the extracted details. + │ 4. Report the constructed SQL query to the user. + ├─► [current_plan_step] 2. Validate the role against the recognized roles list. + ├─► [stop] InProcess + ├──● Please validate if the role 'Software Engineer' is recognized among the following roles: Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester. 
+ ├─► [shared_memory_entry] Add the plan to the shared memory + ╰──● sending message to CodeInterpreter + ╭───< CodeInterpreter > + ├─► [thought] I will validate if the role 'Software Engineer' is recognized among the given list of roles. + ├─► [reply_type] python + ├─► [reply_content] + │ # Step 2: Validate the Role + │ recognized_roles = ["Software Engineer", "Data Analyst", "Project Manager", "DevOps Specialist", "QA Tester"] + │ role = "Software Engineer" + │ + │ is_role_recognized = role in recognized_roles + │ is_role_recognized + ├─► [verification] NONE + ├─► [code_error] No code verification is performed. + ├─► [execution_status] SUCCESS + ├─► [execution_result] + │ The execution of the generated python code above has succeeded + │ + │ The result of above Python code after execution is: + │ True + ├─► [artifact_paths] [] + ├──● The following python code has been executed: + │ ```python + │ # Step 2: Validate the Role + │ recognized_roles = ["Software Engineer", "Data Analyst", "Project Manager", "DevOps Specialist", "QA Tester"] + │ role = "Software Engineer" + │ + │ is_role_recognized = role in recognized_roles + │ is_role_recognized + │ ``` + │ + │ + │ The execution of the generated python code above has succeeded + │ + │ The result of above Python code after execution is: + │ True + ╰──● sending message to Planner + ╭───< Planner > + ├─► [plan_reasoning] The role 'Software Engineer' is recognized among the list of roles. The next step is to construct the SQL query using the extracted details. + ├─► [init_plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list + │ 3. Construct the SQL query using the extracted details + │ 4. Report the constructed SQL query to the user + ├─► [plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list. + │ 3. Construct the SQL query using the extracted details. + │ 4. 
Report the constructed SQL query to the user. + ├─► [current_plan_step] 3. Construct the SQL query using the extracted details. + ├─► [stop] InProcess + ├──● Please construct the SQL query using the extracted details: Employee ID 'E12345', Department 'Development', Project Code 'X987', Role 'Software Engineer', Manager 'J. Doe.', and Location 'Remote'. The SQL query should be + │ in the format provided in the task instructions. + ├─► [shared_memory_entry] Add the plan to the shared memory + ╰──● sending message to CodeInterpreter + ╭───< CodeInterpreter > + ├─► [thought] I will construct the SQL query using the extracted details from the previous successful round. + ├─► [reply_type] python + ├─► [reply_content] + │ # Step 3: Construct SQL Query + │ employee_id = 'E12345' + │ department = 'Development' + │ project_code = 'X987' + │ role = 'Software Engineer' + │ manager = 'J. Doe.' + │ location = 'Remote.' + │ + │ sql_query = f""" + │ SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + │ FROM EmployeeRecords + │ WHERE EmployeeID = '{employee_id}' + │ AND Department = '{department}' + │ AND ProjectCode = '{project_code}' + │ AND Role = '{role}' + │ AND Manager = '{manager}' + │ AND Location = '{location}' + │ ORDER BY EmployeeID; + │ """ + │ + │ sql_query + ├─► [verification] NONE + ├─► [code_error] No code verification is performed. + ├─► [execution_status] SUCCESS + ├─► [execution_result] + │ The execution of the generated python code above has succeeded + │ + │ The result of above Python code after execution is: + │ + │ SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + │ FROM EmployeeRecords + │ WHERE EmployeeID = 'E12345' + │ AND Department = 'Development' + │ AND ProjectCode = 'X987' + │ AND Role = 'Software Engineer' + │ AND Manager = 'J. Doe.' + │ AND Location = 'Remote.' 
+ │ ORDER BY EmployeeID; + │ + ├─► [artifact_paths] [] + ├──● The following python code has been executed: + │ ```python + │ # Step 3: Construct SQL Query + │ employee_id = 'E12345' + │ department = 'Development' + │ project_code = 'X987' + │ role = 'Software Engineer' + │ manager = 'J. Doe.' + │ location = 'Remote.' + │ + │ sql_query = f""" + │ SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + │ FROM EmployeeRecords + │ WHERE EmployeeID = '{employee_id}' + │ AND Department = '{department}' + │ AND ProjectCode = '{project_code}' + │ AND Role = '{role}' + │ AND Manager = '{manager}' + │ AND Location = '{location}' + │ ORDER BY EmployeeID; + │ """ + │ + │ sql_query + │ ``` + │ + │ + │ The execution of the generated python code above has succeeded + │ + │ The result of above Python code after execution is: + │ + │ SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + │ FROM EmployeeRecords + │ WHERE EmployeeID = 'E12345' + │ AND Department = 'Development' + │ AND ProjectCode = 'X987' + │ AND Role = 'Software Engineer' + │ AND Manager = 'J. Doe.' + │ AND Location = 'Remote.' + │ ORDER BY EmployeeID; + │ + ╰──● sending message to Planner + ╭───< Planner > + ├─► [plan_reasoning] The SQL query has been successfully constructed using the extracted employee details. The task is now complete, and the constructed SQL query can be reported to the user. + ├─► [init_plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list + │ 3. Construct the SQL query using the extracted details + │ 4. Report the constructed SQL query to the user + ├─► [plan] + │ 1. Extract the specified employee data from the summary provided. + │ 2. Validate the role against the recognized roles list. + │ 3. Construct the SQL query using the extracted details. + │ 4. Report the constructed SQL query to the user. + ├─► [current_plan_step] 4. Report the constructed SQL query to the user. 
+ ├─► [stop] Completed + ├──● The SQL query has been successfully constructed using the extracted details: + │ + │ ``` + │ SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location + │ FROM EmployeeRecords + │ WHERE EmployeeID = 'E12345' + │ AND Department = 'Development' + │ AND ProjectCode = 'X987' + │ AND Role = 'Software Engineer' + │ AND Manager = 'J. Doe.' + │ AND Location = 'Remote.' + │ ORDER BY EmployeeID; + │ ``` + ├─► [shared_memory_entry] Add the plan to the shared memory + ╰──● sending message to User + TaskWeaver ▶ The SQL query has been successfully constructed using the extracted details: + +SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location +FROM EmployeeRecords +WHERE EmployeeID = 'E12345' +AND Department = 'Development' +AND ProjectCode = 'X987' +AND Role = 'Software Engineer' +AND Manager = 'J. Doe.' +AND Location = 'Remote.' +ORDER BY EmployeeID; +``` +The log is lengthy but easy to follow. For each step, the Planner asks the Code Interpreter to generate code, and the task eventually completes successfully. However, this is not the most efficient way to solve the task, because the Planner can leverage its reasoning capabilities to extract the information and construct the SQL query directly. + +We tried adding instructions to the Planner's prompt to emphasize its reasoning power, but the Planner still called the Code Interpreter to generate the code. For a long time, we did not understand the reason behind this behavior. After many discussions and experiments, we finally realized that it is caused by the design of the Planner's response schema: the Planner must talk to another Role in the system in every response. But if we want the Planner to use reasoning to solve the current subtask, who should the Planner talk to? Ideally, it should keep reasoning until it reaches a subtask that requires the Code Interpreter to generate code. But how can we design the schema to support this?
Because the Planner explicitly tracks its progress through `current_plan_step`, it would be confusing to let the Planner continue reasoning through future steps within a single response. + +We value progress tracking more than the efficiency of the reasoning process, because tracking is critical for complex multi-step tasks. But we also want to improve the reasoning process to make the agent more intelligent. So, our choice is to add a special Role called `Recepta` that listens to the Planner whenever the Planner wants to use reasoning to solve the current subtask. The `Recepta` always answers "Got it. Now, please go ahead!" to the Planner. The Planner can then continue reasoning until it reaches a subtask that requires the Code Interpreter to generate code. The `Recepta` is not a real Role in the system, but a design pattern to support the reasoning process in the Planner. + +Now, let's look at the output of TaskWeaver with `Recepta`: + +```commandline +╭───< Planner > +├─► [plan_reasoning] The user has provided a set of instructions to follow step by step. The task involves extracting employee data from a summary, validating the role, and constructing an SQL query. I will follow these steps sequentially. +├─► [init_plan] +│ 1. Extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location` +│ 2. Validate the role +│ 3. Construct and print the SQL query using the extracted details +│ 4. Check the execution result and report the result to the user +├─► [plan] +│ 1. Extract information from the employee summary +│ 2. Validate the role +│ 3. Construct SQL query using the extracted details +│ 4. Report the result to the user +├─► [current_plan_step] 1. Extract information from the employee summary +├─► [stop] InProcess +├──● Thought: I can read and comprehend the employee summary to extract the required information.
+│ Result: the `Employee ID` is `E12345`, the `Department` is `Development`, the `Project Code` is `X987`, the `Role` is `Software Engineer`, the `Manager` is `J. Doe`, and the `Location` is `Remote`. +├─► [shared_memory_entry] Add the plan to the shared memory +╰──● sending message to Recepta +╭───< Recepta > +├──● Got it. Now, please go ahead! +╰──● sending message to Planner +╭───< Planner > +├─► [plan_reasoning] I have extracted the information from the employee summary. The next step is to verify if the role is one of the recognized roles. +├─► [init_plan] +│ 1. Extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location` +│ 2. Validate the role +│ 3. Construct and print the SQL query using the extracted details +│ 4. Check the execution result and report the result to the user +├─► [plan] +│ 1. Extract information from the employee summary +│ 2. Validate the role +│ 3. Construct SQL query using the extracted details +│ 4. Report the result to the user +├─► [current_plan_step] 2. Validate the role +├─► [stop] InProcess +├──● Thought: I can verify if the role is one of the recognized roles with my pattern matching capability. +│ Thought: The recognized roles are `Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester`. +│ Result: The role is `Software Engineer`, which is recognized. +├─► [shared_memory_entry] Add the plan to the shared memory +╰──● sending message to Recepta +╭───< Recepta > +├──● Got it. Now, please go ahead! +╰──● sending message to Planner +╭───< Planner > +├─► [plan_reasoning] I can prepare and print the SQL query using the extracted details to the User. +├─► [init_plan] +│ 1. Extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location` +│ 2. Validate the role +│ 3. Construct and print the SQL query using the extracted details +│ 4. 
Check the execution result and report the result to the user +├─► [plan] +│ 1. Extract information from the employee summary +│ 2. Validate the role +│ 3. Construct SQL query using the extracted details +│ 4. Report the result to the user +├─► [current_plan_step] 3. Construct and print the SQL query using the extracted details and report the result to the user +├─► [stop] Completed +├──● The role is `Software Engineer`, which is recognized. +│ +│ Here is the SQL query using the extracted details: +│ ```sql +│ SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location +│ FROM EmployeeRecords +│ WHERE EmployeeID = 'E12345' +│ AND Department = 'Development' +│ AND ProjectCode = 'X987' +│ AND Role = 'Software Engineer' +│ AND Manager = 'J. Doe' +│ AND Location = 'Remote' +│ ORDER BY EmployeeID; +│ ``` +├─► [shared_memory_entry] Add the plan to the shared memory +╰──● sending message to User + TaskWeaver ▶ The role is `Software Engineer`, which is recognized. + +Here is the SQL query using the extracted details: + +SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location +FROM EmployeeRecords +WHERE EmployeeID = 'E12345' +AND Department = 'Development' +AND ProjectCode = 'X987' +AND Role = 'Software Engineer' +AND Manager = 'J. Doe' +AND Location = 'Remote' +ORDER BY EmployeeID; +``` + +The new log is shorter, and the run is more efficient because the Planner never calls the Code Interpreter. According to our observations, it is also more reliable: writing string-manipulation or regex code to extract information from text is error-prone. In the first run, for example, the extracted manager and location kept their trailing periods (`J. Doe.` and `Remote.`). + +In summary, we have experimented with the idea of adding a special Role called `Recepta` to support the reasoning process in the Planner. The `Recepta` does no work of its own; it is a design pattern that lets the Planner keep reasoning. This design pattern is simple and effective. It is also flexible and can be extended to support more complex reasoning processes in the Planner.
We believe that this design pattern can be useful for building more intelligent agents in the future. + +To enable it in TaskWeaver, the user needs to configure the roles in the configuration file with the following content: +```json +"session.roles": [ + "planner", + "code_interpreter", + "recepta" +] +``` +Here, the `recepta` role is appended to the list of roles. \ No newline at end of file
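
If the configuration is managed from a script, the same change can be sketched in Python. This is only an illustration, not part of TaskWeaver: the helper name `enable_recepta` is hypothetical, and the fallback role list used when `session.roles` is missing is an assumption. Only the `session.roles` key and the three role names come from the snippet above.

```python
import json


def enable_recepta(config: dict) -> dict:
    """Return a copy of `config` whose "session.roles" list includes "recepta".

    Hypothetical helper for illustration; it only manipulates a plain dict.
    """
    updated = dict(config)
    # Assumption: if no roles are configured yet, start from planner + code_interpreter.
    roles = list(updated.get("session.roles", ["planner", "code_interpreter"]))
    if "recepta" not in roles:
        roles.append("recepta")
    updated["session.roles"] = roles
    return updated


# Example: a config that currently lists only the two standard roles.
existing = {"session.roles": ["planner", "code_interpreter"]}
print(json.dumps(enable_recepta(existing), indent=2))
```

The helper is idempotent, so applying it to a configuration that already lists `recepta` leaves the role list unchanged.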