7 changes: 4 additions & 3 deletions README.md
@@ -23,6 +23,7 @@ Unlike many agent frameworks that only track the chat history with LLMs in text,


## 🆕 News
- 📅2025-03-13: TaskWeaver now supports vision input for the Planner role. Please check the [vision input](https://microsoft.github.io/TaskWeaver/blog/vision) for more details.👀
- 📅2025-01-16: TaskWeaver has been enhanced with an experimental role called [Recepta](https://microsoft.github.io/TaskWeaver/blog/reasoning) for its reasoning power.🧠
- 📅2024-12-23: TaskWeaver has been integrated with the [AgentOps](https://microsoft.github.io/TaskWeaver/docs/observability) for better observability and monitoring.🔍
- 📅2024-09-13: We introduce the shared memory to store information that is shared between the roles in TaskWeaver. Please check the [memory](https://microsoft.github.io/TaskWeaver/docs/memory) for more details.🧠
@@ -31,7 +32,7 @@ Unlike many agent frameworks that only track the chat history with LLMs in text,
- 📅2024-05-07: We have added two blog posts on [Evaluating a LLM agent](https://microsoft.github.io/TaskWeaver/blog/evaluation) and [Adding new roles to TaskWeaver](https://microsoft.github.io/TaskWeaver/blog/role) in the documentation.📝
- 📅2024-03-28: TaskWeaver now offers all-in-one Docker image, providing a convenient one-stop experience for users. Please check the [docker](https://microsoft.github.io/TaskWeaver/docs/usage/docker) for more details.🐳
- 📅2024-03-27: TaskWeaver now switches to `container` mode by default for code execution. Please check the [code execution](https://microsoft.github.io/TaskWeaver/docs/code_execution) for more details.🐳
- 📅2024-03-07: TaskWeaver now supports configuration of different LLMs for various components, such as the Planner and CodeInterpreter. Please check the [multi-llm](https://microsoft.github.io/TaskWeaver/docs/llms/multi-llm) for more details.🔗
<!-- - 📅2024-03-07: TaskWeaver now supports configuration of different LLMs for various components, such as the Planner and CodeInterpreter. Please check the [multi-llm](https://microsoft.github.io/TaskWeaver/docs/llms/multi-llm) for more details.🔗 -->
<!-- - 📅2024-03-04: TaskWeaver now supports a [container](https://microsoft.github.io/TaskWeaver/docs/code_execution) mode, which provides a more secure environment for code execution.🐳 -->
<!-- - 📅2024-02-28: TaskWeaver now offers a [CLI-only](https://microsoft.github.io/TaskWeaver/docs/advanced/cli_only) mode, enabling users to interact seamlessly with the Command Line Interface (CLI) using natural language.📟 -->
<!-- - 📅2024-02-01: TaskWeaver now has a plugin [document_retriever](https://github.com/microsoft/TaskWeaver/blob/main/project/plugins/README.md#document_retriever) for RAG based on a knowledge base.📚 -->
@@ -43,7 +44,8 @@ Unlike many agent frameworks that only track the chat history with LLMs in text,
<!-- - 📅2023-12-21: TaskWeaver now supports a number of LLMs, such as LiteLLM, Ollama, Gemini, and QWen🎈.) -->
<!-- - 📅2023-12-21: TaskWeaver Website is now [available]&#40;https://microsoft.github.io/TaskWeaver/&#41; with more documentations.) -->
<!-- - 📅2023-12-12: A simple UI demo is available in playground/UI folder, try it [here](https://microsoft.github.io/TaskWeaver/docs/usage/webui)! -->
<!-- - 📅2023-11-30: TaskWeaver is released on GitHub🎈. -->
- ......
- 📅2023-11-30: TaskWeaver is released on GitHub🎈.


## 💥 Highlights
@@ -68,7 +70,6 @@ We are looking forward to your contributions to make TaskWeaver better.
- [ ] Support for prompt template management
- [ ] Better plugin experiences, such as displaying updates or stopping in the middle of running the plugin and user confirmation before running the plugin
- [ ] Async interaction with LLMs
- [ ] Support for vision input for Roles such as the Planner and CodeInterpreter
- [ ] Support for remote code execution


2 changes: 1 addition & 1 deletion taskweaver/chat/console/chat.py
@@ -498,7 +498,7 @@ def _reset_session(self, first_session: bool = False):
self.session.stop()
self.session = self.app.get_session()

self._system_message("--- new session starts ---")
self._system_message("--- new session started ---")
self._assistant_message(
"I am TaskWeaver, an AI assistant. To get started, could you please enter your request?",
)
@@ -251,7 +251,7 @@ def compose_conversation(
# for code correction
user_message += self.user_message_head_template.format(
FEEDBACK=format_code_feedback(post),
MESSAGE=f"{post.get_attachment(AttachmentType.revise_message)[0]}",
MESSAGE=f"{post.get_attachment(AttachmentType.revise_message)[0].content}",
)

assistant_message = self.post_translator.post_to_raw_text(
@@ -60,9 +60,12 @@ def reply(
prompt_log_path=prompt_log_path,
)

code = post_proxy.post.get_attachment(type=AttachmentType.reply_content)[0]
code = post_proxy.post.get_attachment(type=AttachmentType.reply_content)[0].content
if len(code) == 0:
post_proxy.update_message(post_proxy.post.get_attachment(type=AttachmentType.thought)[0], is_end=True)
post_proxy.update_message(
post_proxy.post.get_attachment(type=AttachmentType.thought)[0].content,
is_end=True,
)
return post_proxy.end()

code_to_exec = "! " + code
@@ -78,7 +78,7 @@ def reply(
return post_proxy.end()

functions = json.loads(
post_proxy.post.get_attachment(type=AttachmentType.function)[0],
post_proxy.post.get_attachment(type=AttachmentType.function)[0].content,
)
if len(functions) > 0:
code: List[str] = []
Empty file.
119 changes: 119 additions & 0 deletions taskweaver/ext_role/image_reader/image_reader.py
@@ -0,0 +1,119 @@
import base64
import json
import logging
import os.path
from mimetypes import guess_type

from injector import inject

from taskweaver.llm import LLMApi, format_chat_message
from taskweaver.logging import TelemetryLogger
from taskweaver.memory import Memory, Post
from taskweaver.memory.attachment import AttachmentType
from taskweaver.module.event_emitter import SessionEventEmitter
from taskweaver.module.tracing import Tracing
from taskweaver.role import Role
from taskweaver.role.role import RoleConfig, RoleEntry
from taskweaver.session import SessionMetadata

# Module-level logger. The function below referenced `logger` without defining it,
# which would raise NameError on the error paths; a standard-library logger is assumed here.
logger = logging.getLogger(__name__)


# Function to encode a local image into data URL
def local_image_to_data_url(image_path):
    # Guess the MIME type of the image based on the file extension
    mime_type, _ = guess_type(image_path)
    if mime_type is None:
        mime_type = "application/octet-stream"  # Default MIME type if none is found

    try:
        # Read and encode the image file
        with open(image_path, "rb") as image_file:
            base64_encoded_data = base64.b64encode(image_file.read()).decode("utf-8")
    except FileNotFoundError:
        logger.error(f"Error: The file {image_path} does not exist.")
        return None
    except IOError:
        logger.error(f"Error: The file {image_path} could not be read.")
        return None
    # Construct the data URL
    return f"data:{mime_type};base64,{base64_encoded_data}"


class ImageReaderConfig(RoleConfig):
    def _configure(self):
        pass


class ImageReader(Role):
    @inject
    def __init__(
        self,
        config: ImageReaderConfig,
        logger: TelemetryLogger,
        tracing: Tracing,
        event_emitter: SessionEventEmitter,
        role_entry: RoleEntry,
        llm_api: LLMApi,
        session_metadata: SessionMetadata,
    ):
        super().__init__(config, logger, tracing, event_emitter, role_entry)

        self.llm_api = llm_api
        self.session_metadata = session_metadata

    def reply(self, memory: Memory, **kwargs: ...) -> Post:
        rounds = memory.get_role_rounds(
            role=self.alias,
            include_failure_rounds=False,
        )

        # obtain the query from the last round
        last_post = rounds[-1].post_list[-1]

        post_proxy = self.event_emitter.create_post_proxy(self.alias)

        post_proxy.update_send_to(last_post.send_from)

        input_message = last_post.message
        prompt = (
            f"Input message: {input_message}.\n"
            "\n"
            "Your response should be a JSON object with the key 'image_url' and the value as the image path. "
            "For example, {'image_url': 'c:/images/image.jpg'} or {'image_url': 'http://example.com/image.jpg'}. "
            "Do not add any additional information in the response or wrap the JSON with ```json and ```."
        )

        response = self.llm_api.chat_completion(
            messages=[
                format_chat_message(
                    role="system",
                    message="Your task is to read the image path from the message.",
                ),
                format_chat_message(
                    role="user",
                    message=prompt,
                ),
            ],
        )

        image_url = json.loads(response["content"])["image_url"]
        if image_url.startswith("http"):
            image_content = image_url
            attachment_message = f"Image from {image_url}."
        else:
            if os.path.isabs(image_url):
                image_content = local_image_to_data_url(image_url)
            else:
                image_content = local_image_to_data_url(os.path.join(self.session_metadata.execution_cwd, image_url))
            attachment_message = f"Image from {image_url} encoded as a Base64 data URL."

        post_proxy.update_attachment(
            message=attachment_message,
            type=AttachmentType.image_url,
            extra={"image_url": image_content},
            is_end=True,
        )

        post_proxy.update_message(
            "I have read the image path from the message. The image is attached below.",
        )

        return post_proxy.end()
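For reviewers unfamiliar with data URLs, the encoding step in `local_image_to_data_url` can be tried in isolation. The sketch below is a standalone illustration, not part of this PR: `to_data_url` is a hypothetical helper that takes the bytes directly instead of reading a file, so it runs without any image on disk.

```python
import base64
from mimetypes import guess_type


def to_data_url(path: str, raw: bytes) -> str:
    """Build a data URL from raw bytes; `path` is used only to guess the MIME type."""
    mime_type, _ = guess_type(path)
    if mime_type is None:
        # Same fallback as the PR code when the extension is unknown
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(raw).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# The first four bytes of a PNG file, used here as a tiny stand-in payload
url = to_data_url("chart.png", b"\x89PNG")
print(url)  # data:image/png;base64,iVBORw==
```

The resulting string is what ends up in the attachment's `extra["image_url"]` for local files, while remote `http` URLs are passed through unchanged.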
5 changes: 5 additions & 0 deletions taskweaver/ext_role/image_reader/image_reader.role.yaml
@@ -0,0 +1,5 @@
alias: ImageReader
module: taskweaver.ext_role.image_reader.image_reader.ImageReader
intro : |-
  - ImageReader is responsible for helping the Planner to read images.
  - The input message must contain the image path, either local or remote.
40 changes: 35 additions & 5 deletions taskweaver/llm/util.py
@@ -1,7 +1,9 @@
from typing import Any, Dict, List, Literal, Optional, TypedDict, Union

ChatMessageRoleType = Literal["system", "user", "assistant", "function"]
ChatMessageType = Dict[Literal["role", "name", "content"], str]
ChatContentType = Dict[Literal["type", "text", "image_url"], str | Dict[Literal["url"], str]]
ChatMessageType = Dict[Literal["role", "name", "content"], str | List[ChatContentType]]

PromptTypeSimple = List[ChatMessageType]


@@ -21,15 +23,43 @@ class PromptTypeWithTools(TypedDict):
tools: Optional[List[PromptToolType]]


def format_chat_message_content(
    content_type: Literal["text", "image_url"],
    content_value: str,
) -> ChatContentType:
    if content_type == "image_url":
        return {
            "type": content_type,
            content_type: {
                "url": content_value,
            },
        }
    else:
        return {
            "type": content_type,
            content_type: content_value,
        }


def format_chat_message(
role: ChatMessageRoleType,
message: str,
image_urls: Optional[List[str]] = None,
name: Optional[str] = None,
) -> ChatMessageType:
msg: ChatMessageType = {
"role": role,
"content": message,
}
if not image_urls:
msg: ChatMessageType = {
"role": role,
"content": message,
}
else:
msg: ChatMessageType = {
"role": role,
"content": [
format_chat_message_content("text", message),
]
+ [format_chat_message_content("image_url", image) for image in image_urls],
}
if name is not None:
msg["name"] = name
return msg
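To make the effect of the new `image_urls` parameter concrete, here is a minimal standalone sketch of the two message shapes. `build_message` is a hypothetical stand-in that mirrors the logic in this hunk without importing TaskWeaver; it omits the optional `name` field for brevity.

```python
from typing import Any, Dict, List, Optional


def build_message(role: str, text: str, image_urls: Optional[List[str]] = None) -> Dict[str, Any]:
    """Return a plain-text message, or a multi-part message when images are given."""
    if not image_urls:
        # No images (None or empty list): content stays a plain string, as before the change
        return {"role": role, "content": text}
    # With images: content becomes a list holding one text part plus one part per image
    parts: List[Dict[str, Any]] = [{"type": "text", "text": text}]
    parts += [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    return {"role": role, "content": parts}


plain = build_message("user", "What is in this picture?")
with_image = build_message(
    "user",
    "What is in this picture?",
    image_urls=["http://example.com/image.jpg"],
)
```

Because an empty list is falsy, callers that pass `image_urls=[]` (as the Planner does for posts without image attachments) still get the plain-string form.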
3 changes: 3 additions & 0 deletions taskweaver/memory/attachment.py
@@ -47,6 +47,9 @@ class AttachmentType(Enum):
# shared memory entry
shared_memory_entry = "shared_memory_entry"

# vision input
image_url = "image_url"


@dataclass
class Attachment:
4 changes: 2 additions & 2 deletions taskweaver/memory/post.py
@@ -87,9 +87,9 @@ def add_attachment(self, attachment: Attachment) -> None:
"""Add an attachment to the post."""
self.attachment_list.append(attachment)

def get_attachment(self, type: AttachmentType) -> List[Any]:
def get_attachment(self, type: AttachmentType) -> List[Attachment]:
"""Get all the attachments of the given type."""
return [attachment.content for attachment in self.attachment_list if attachment.type == type]
return [attachment for attachment in self.attachment_list if attachment.type == type]

def del_attachment(self, type_list: List[AttachmentType]) -> None:
"""Delete all the attachments of the given type."""
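The change to `get_attachment` is what forces the `[0].content` updates elsewhere in this PR: callers now receive `Attachment` objects and must read `.content` (or `.extra`) themselves. The sketch below illustrates this with simplified stand-in classes; the real `Attachment` and `Post` in TaskWeaver carry more fields, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional


class AttachmentType(Enum):
    thought = "thought"
    image_url = "image_url"


@dataclass
class Attachment:
    # Simplified stand-in: only the fields needed for this illustration
    type: AttachmentType
    content: str
    extra: Optional[Dict[str, Any]] = None


@dataclass
class Post:
    attachment_list: List[Attachment] = field(default_factory=list)

    def get_attachment(self, type: AttachmentType) -> List[Attachment]:
        # New behaviour: return the Attachment objects, not just their content
        return [a for a in self.attachment_list if a.type == type]


post = Post(
    attachment_list=[
        Attachment(AttachmentType.thought, "look at the chart"),
        Attachment(
            AttachmentType.image_url,
            "Image from chart.png encoded as a Base64 data URL.",
            extra={"image_url": "data:image/png;base64,iVBORw=="},
        ),
    ],
)

# Text attachments: callers now append `.content`
thought_text = post.get_attachment(AttachmentType.thought)[0].content
# Image attachments: the URL lives in `extra`, which the old return value did not expose
image_urls = [a.extra["image_url"] for a in post.get_attachment(AttachmentType.image_url)]
```

Returning the full object is what lets the Planner reach `attachment.extra["image_url"]`; the previous content-only list had no way to carry it.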
45 changes: 22 additions & 23 deletions taskweaver/planner/planner.py
@@ -133,6 +133,7 @@ def compose_conversation_for_prompt(
for post in chat_round.post_list:
if post.send_from == self.alias:
if post.send_to == "User" or post.send_to in self.recipient_alias_set:
# planner responses
planner_message = self.planner_post_translator.post_to_raw_text(
post=post,
)
@@ -144,47 +145,45 @@ def compose_conversation_for_prompt(
)
elif post.send_to == self.alias:
# self correction for planner response, e.g., format error/field check error
# append the invalid response to chat history
conversation.append(
format_chat_message(
role="assistant",
message=post.get_attachment(
type=AttachmentType.invalid_response,
)[0],
)[0].content,
),
)

# append the invalid response to chat history
# append the self correction instruction message to chat history
conversation.append(
format_chat_message(
role="user",
message=self.format_message(
role="User",
message=post.get_attachment(type=AttachmentType.revise_message)[0],
message=post.get_attachment(type=AttachmentType.revise_message)[0].content,
),
),
)
# append the self correction instruction message to chat history

else:
if conv_init_message is not None:
message = self.format_message(
role=post.send_from,
message=conv_init_message + "\n" + post.message,
)
conversation.append(
format_chat_message(role="user", message=message),
)
conv_init_message = None
else:
conversation.append(
format_chat_message(
role="user",
message=self.format_message(
role=post.send_from,
message=post.message,
),
# messages from user or workers
conversation.append(
format_chat_message(
role="user",
message=self.format_message(
role=post.send_from,
message=post.message
if conv_init_message is None
else conv_init_message + "\n" + post.message,
),
)
image_urls=[
attachment.extra["image_url"]
for attachment in post.get_attachment(type=AttachmentType.image_url)
],
),
)

conv_init_message = None

return conversation

11 changes: 11 additions & 0 deletions website/blog/authors.yml
@@ -0,0 +1,11 @@
liqli:
name: Liqun Li
url: https://liqul.github.io
title: Principal Researcher
image_url: https://liqul.github.io/assets/logo_small_bw.png

xu:
name: Xu Zhang
url: https://scholar.google.com/citations?user=bqXdMMMAAAAJ&hl=zh-CN
title: Senior Researcher
image_url: https://scholar.googleusercontent.com/citations?view_op=view_photo&user=bqXdMMMAAAAJ&citpid=3
6 changes: 5 additions & 1 deletion website/blog/evaluation.md
@@ -1,4 +1,8 @@
# How to evaluate a LLM agent?
---
title: How to evaluate a LLM agent?
authors: [liqli, xu]
date: 2024-05-07
---

## The challenges
It is nontrivial to evaluate the performance of a LLM agent.
6 changes: 5 additions & 1 deletion website/blog/experience.md
@@ -1,4 +1,8 @@
# Experience selection
---
title: Experience Selection in TaskWeaver
authors: liqli
date: 2024-09-14
---

We have introduced the motivation of the `experience` module in [Experience](/docs/customization/experience)
and how to create a handcrafted experience in [Handcrafted Experience](/docs/customization/experience/handcrafted_experience).