Commit 2ac690e

committed
feat: update the chat agent and task manager to manage messages by thread ID; improve logging and sensitive-content checks
1 parent 54bb8e1 commit 2ac690e

File tree

8 files changed: +132 additions, −86 deletions

docs/advanced/misc.md

Lines changed: 9 additions & 3 deletions
@@ -2,15 +2,21 @@
 
 ## Content Safety
 
-The system ships with a built-in content-moderation mechanism to keep served content compliant. Keyword filtering and LLM-based review are currently configured. Administrators can configure it and choose a safety model on the `Settings` > `Basic Settings` page.
+The system ships with a built-in content-moderation mechanism (disabled by default) to keep served content compliant. Keyword filtering and LLM-based review are currently configured. Administrators can configure it and choose a safety model on the `Settings` > `Basic Settings` page.
 
-The sensitive-keyword list lives in the `src/config/static/bad_keywords.txt` file, one keyword per line. Changes take effect immediately, no service restart required.
+The detection flow: once user input is received, it is checked for compliance; while the response streams, real-time checks run (keywords only); after streaming ends, the full content is checked.
+
+**Note**: although LLM-based checking greatly mitigates prompt-injection problems, it also adds latency to user interaction, so weigh whether to enable it.
+
+For keyword checking, the sensitive-keyword list lives in the `src/config/static/bad_keywords.txt` file, one keyword per line. Changes take effect immediately, no service restart required.
+
+For LLM checking, the prompt can be found in `src/plugins/guard.py`:
+
+<<< @/../src/plugins/guard.py#guard_prompt
 
 ## Web Search
 
 The system ships with Tavily-based web search. Once configured, the model automatically calls the tool behind `enable_web_search` when needed to ground answers in real-time web results.
 
-### Configuration Steps
 
 1. Go to the [Tavily website](https://app.tavily.com/), register, and create an API Key in the console.
 2. In `.env` at the project root (or the corresponding environment-variable section of `docker-compose.yml`), add:

docs/changelog/roadmap.md

Lines changed: 6 additions & 5 deletions
@@ -9,8 +9,9 @@
 - [x] Some doc-format files are not supported properly
 - [x] The frontend imposes no restriction when an unsupported file type appears
 - [ ] When an error occurs during message generation, the frontend shows nothing
-- [ ] Another agent's conversation history cannot be displayed
-- [ ] The call-statistics results are wrong (the token-counting method may also be off)
+- [x] Another agent's conversation history cannot be displayed
+- [ ] The call-statistics results appear to be wrong (the token-counting method may also be off)
+- [ ] [Important] The context messages passed to the agent are wrong; a new conv-building stage algorithm is needed
 
 
 ---
@@ -27,13 +28,13 @@
 📝 **Base**
 
 - [ ] Improve management of the global config, the config files, and sub-configs
-- [ ] Add a tasker module to manage all background tasks, surfaced in a sidebar in the UI.
+- [x] Add a tasker module to manage all background tasks, surfaced in a sidebar in the UI.
 - [ ] Add a files module to manage file upload, download, etc.
 
 ## Possible Future Support
 
 The features below **may** land in later versions; undecided for now
 
-- [ ] Add test scripts covering the most common features
-- [ ] Improve the retrieval display of document info (results page, detail page)
+- [x] Add test scripts covering the most common features (API covered)
+- [x] Improve the retrieval display of document info (results page, detail page)
 - [ ] Integrate LangFuse (on hold); add user-log and user-feedback modules viewable in AgentView

docs/intro/quick-start.md

Lines changed: 1 addition & 2 deletions
@@ -20,8 +20,7 @@ cd Yuxi-Know
 
 ::: warning Version notes
 - `0.2.2`: current stable version (recommended)
-- `stable`: old stable branch (incompatible with the current version)
-- `main`: latest development version (may be unstable)
+- `main`: latest development version (unstable; new features may introduce new bugs)
 :::
 
 #### 2. Configure environment variables

server/routers/chat_router.py

Lines changed: 53 additions & 66 deletions
@@ -142,7 +142,7 @@ def make_chunk(content=None, **kwargs):
 
 async def save_messages_from_langgraph_state(
     agent_instance,
-    conversation,
+    thread_id,
     conv_mgr,
     config_dict,
 ):
@@ -162,11 +162,11 @@ async def save_messages_from_langgraph_state(
     logger.debug(f"Retrieved {len(messages)} messages from LangGraph state")
 
     # Count the already-saved messages to avoid saving duplicates
-    existing_messages = conv_mgr.get_messages(conversation.id)
+    existing_messages = conv_mgr.get_messages_by_thread_id(thread_id)
     existing_count = len(existing_messages)
 
     # Only save the newly added messages
-    new_messages = messages[existing_count:]
+    new_messages = messages
 
     for msg in new_messages:
         msg_dict = msg.model_dump() if hasattr(msg, "model_dump") else {}
@@ -190,8 +190,8 @@ async def save_messages_from_langgraph_state(
             msg_dict["response_metadata"]["model_name"] = model_name[: len(model_name) // repeat_count]
 
         # Save the AI message
-        ai_msg = conv_mgr.add_message(
-            conversation_id=conversation.id,
+        ai_msg = conv_mgr.add_message_by_thread_id(
+            thread_id=thread_id,
             role="assistant",
             content=content,
             message_type="text",
@@ -236,6 +236,10 @@ async def save_messages_from_langgraph_state(
             else:
                 logger.warning(f"Tool call {tool_call_id} not found for update")
 
+        else:
+            logger.warning(f"Unknown message type: {msg_type}, skipping")
+            continue
+
         logger.debug(f"Processed message type={msg_type}")
 
     logger.info(f"Saved {len(new_messages)} new messages from LangGraph state")
@@ -249,7 +253,7 @@ async def stream_messages():
         yield make_chunk(status="init", meta=meta, msg=HumanMessage(content=query).model_dump())
 
         # Input guard
-        if conf.enable_content_guard and content_guard.check(query):
+        if conf.enable_content_guard and await content_guard.check(query):
            yield make_chunk(status="error", message="输入内容包含敏感词", meta=meta)
            return
 
@@ -265,91 +269,74 @@ async def stream_messages():
        # Build the runtime config; generate a thread_id if none was provided
        user_id = str(current_user.id)
        thread_id = config.get("thread_id")
-
        input_context = {"user_id": user_id, "thread_id": thread_id}
 
+        if not thread_id:
+            thread_id = str(uuid.uuid4())
+            logger.warning(f"No thread_id provided, generated new thread_id: {thread_id}")
+
        # Initialize conversation manager
        conv_manager = ConversationManager(db)
 
-        # Get or create conversation
-        conversation = None
-        if thread_id:
-            conversation = conv_manager.get_conversation_by_thread_id(thread_id)
-            if not conversation:
-                try:
-                    # Auto-create conversation for existing thread
-                    conversation = conv_manager.create_conversation(
-                        user_id=user_id,
-                        agent_id=agent_id,
-                        title=(query[:50] + "..." if len(query) > 50 else query) if query else "新的对话",
-                        thread_id=thread_id,
-                    )
-                    logger.info(f"Auto-created conversation for thread_id {thread_id}")
-                except Exception as e:
-                    logger.error(f"Failed to auto-create conversation: {e}")
-                    conversation = None
-
        # Save user message
-        if conversation:
-            try:
-                conv_manager.add_message(
-                    conversation_id=conversation.id,
-                    role="user",
-                    content=query,
-                    message_type="text",
-                    extra_metadata={"raw_message": HumanMessage(content=query).model_dump()},
-                )
-            except Exception as e:
-                logger.error(f"Error saving user message: {e}")
+        try:
+            conv_manager.add_message_by_thread_id(
+                thread_id=thread_id,
+                role="user",
+                content=query,
+                message_type="text",
+                extra_metadata={"raw_message": HumanMessage(content=query).model_dump()},
+            )
+        except Exception as e:
+            logger.error(f"Error saving user message: {e}")
 
        try:
-            # Stream messages (only for display, don't save yet)
+            full_ai_content = ""
            async for msg, metadata in agent.stream_messages(messages, input_context=input_context):
                if isinstance(msg, AIMessageChunk):
-                    # Content guard
-                    if conf.enable_content_guard and content_guard.check(msg.content):
+                    full_ai_content += msg.content
+                    if conf.enable_content_guard and await content_guard.check_with_keywords(full_ai_content[-20:]):
                        logger.warning("Sensitive content detected in stream")
                        yield make_chunk(message="检测到敏感内容,已中断输出", status="error")
                        return
 
                    yield make_chunk(content=msg.content, msg=msg.model_dump(), metadata=metadata, status="loading")
 
-                elif isinstance(msg, ToolMessage):
-                    yield make_chunk(msg=msg.model_dump(), metadata=metadata, status="loading")
                else:
                    yield make_chunk(msg=msg.model_dump(), metadata=metadata, status="loading")
 
+            if conf.enable_content_guard and await content_guard.check(full_ai_content):
+                logger.warning("Sensitive content detected in final message")
+                yield make_chunk(message="检测到敏感内容,已中断输出", status="error")
+                return
+
            yield make_chunk(status="finished", meta=meta)
 
            # After streaming finishes, save all messages from the LangGraph state
-            if conversation:
-                langgraph_config = {"configurable": {"thread_id": thread_id, "user_id": user_id}}
-                await save_messages_from_langgraph_state(
-                    agent_instance=agent,
-                    conversation=conversation,
-                    conv_mgr=conv_manager,
-                    config_dict=langgraph_config,
-                )
+            langgraph_config = {"configurable": input_context}
+            await save_messages_from_langgraph_state(
+                agent_instance=agent,
+                thread_id=thread_id,
+                conv_mgr=conv_manager,
+                config_dict=langgraph_config,
+            )
 
        except (asyncio.CancelledError, ConnectionError) as e:
            # The client closed the connection; try to save whatever was generated
-            logger.info(f"Client disconnected for thread {thread_id}: {e}")
-            try:
-                if conversation:
-                    langgraph_config = {"configurable": {"thread_id": thread_id, "user_id": user_id}}
-                    await save_messages_from_langgraph_state(
-                        agent_instance=agent,
-                        conversation=conversation,
-                        conv_mgr=conv_manager,
-                        config_dict=langgraph_config,
-                    )
-            except Exception as save_error:
-                logger.error(f"Error saving partial messages after disconnect: {save_error}")
+            logger.warning(f"Client disconnected for thread {thread_id}: {e}")
+            langgraph_config = {"configurable": input_context}
+            await save_messages_from_langgraph_state(
+                agent_instance=agent,
+                thread_id=thread_id,
+                conv_mgr=conv_manager,
+                config_dict=langgraph_config,
+            )
 
            # Notify the frontend of the interruption (may never arrive, kept for consistency)
-            try:
-                yield make_chunk(status="interrupted", message="对话已中断", meta=meta)
-            except Exception:
-                pass
-            return
+            yield make_chunk(status="interrupted", message="对话已中断", meta=meta)
 
        except Exception as e:
            logger.error(f"Error streaming messages: {e}, {traceback.format_exc()}")
            yield make_chunk(message=f"Error streaming messages: {e}", status="error")
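The streaming guard above checks only `full_ai_content[-20:]` on each chunk instead of rescanning the whole transcript. A minimal sketch of why a tail window is enough to catch keywords that straddle chunk boundaries (assuming every keyword is shorter than the window; the function name is illustrative, not from the codebase):

```python
def tail_contains_keyword(accumulated: str, keywords: list[str], window: int = 20) -> bool:
    """Check only the tail of the accumulated stream (mirrors full_ai_content[-20:]).

    A keyword split across two chunks is reassembled in `accumulated`, and as
    long as every keyword is shorter than `window`, the tail still covers it.
    """
    tail = accumulated[-window:].lower()
    return any(kw in tail for kw in keywords)

# Simulate a stream where the keyword "forbidden" is split across chunks.
keywords = ["forbidden"]
accumulated = ""
hits = []
for chunk in ["this is fine ", "forb", "idden content"]:
    accumulated += chunk
    hits.append(tail_contains_keyword(accumulated, keywords))
# hits → [False, False, True]: the match fires once both halves have arrived.
```

This keeps the per-chunk cost constant regardless of how long the response grows, at the price of missing any keyword longer than the window.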

server/services/tasker.py

Lines changed: 7 additions & 7 deletions
@@ -105,7 +105,7 @@ async def start(self) -> None:
             worker = asyncio.create_task(self._worker_loop(), name="tasker-worker")
             self._workers.append(worker)
             self._started = True
-            logger.info("Tasker started with %s workers", self.worker_count)
+            logger.info("Tasker started with {} workers", self.worker_count)
 
     async def shutdown(self) -> None:
         async with self._lock:
@@ -133,7 +133,7 @@ async def enqueue(
         self._tasks[task_id] = task
         await self._persist_state()
         await self._queue.put((task_id, coroutine))
-        logger.info("Enqueued task %s (%s)", task_id, name)
+        logger.info("Enqueued task {} ({})", task_id, name)
         return task
 
     async def list_tasks(self, status: Optional[str] = None) -> List[Dict[str, Any]]:
@@ -159,7 +159,7 @@ async def cancel_task(self, task_id: str) -> bool:
         task.cancel_requested = True
         task.updated_at = _utc_timestamp()
         await self._persist_state()
-        logger.info("Cancellation requested for task %s", task_id)
+        logger.info("Cancellation requested for task {}", task_id)
         return True
 
     async def _worker_loop(self) -> None:
@@ -191,7 +191,7 @@ async def _worker_loop(self) -> None:
             except asyncio.CancelledError:
                 await self._mark_cancelled(task_id, "任务被取消")
             except Exception as exc:  # noqa: BLE001
-                logger.error("Task %s failed: %s", task_id, exc, exc_info=True)
+                logger.exception("Task {} failed: {}", task_id, exc)
                 await self._update_task(
                     task_id,
                     status="failed",
@@ -205,7 +205,7 @@ async def _worker_loop(self) -> None:
             except asyncio.CancelledError:
                 break
             except Exception as exc:  # noqa: BLE001
-                logger.error("Tasker worker error: %s", exc, exc_info=True)
+                logger.exception("Tasker worker error: {}", exc)
 
     async def _get_task_instance(self, task_id: str) -> Optional[Task]:
         async with self._lock:
@@ -277,9 +277,9 @@ async def _load_state(self) -> None:
                     task.message = "服务重启时任务未继续执行"
                     task.updated_at = _utc_timestamp()
                     self._tasks[task.id] = task
-            logger.info("Loaded %s task records from storage", len(tasks))
+            logger.info("Loaded {} task records from storage", len(tasks))
         except Exception as exc:  # noqa: BLE001
-            logger.error("Failed to load task state: %s", exc, exc_info=True)
+            logger.exception("Failed to load task state: {}", exc)
 
     async def _persist_state(self) -> None:
         tasks = [task.to_dict() for task in self._tasks.values()]
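The placeholder change from `%s` to `{}` above only makes sense if the project's `logger` is loguru (or another brace-style logger): stdlib `logging` interpolates printf-style `%s` arguments lazily and would emit `{}` braces literally. Likewise, `logger.error(..., exc_info=True)` and loguru's `logger.exception(...)` both attach the traceback. A quick illustration of the two formatting styles (plain string formatting, no logger involved):

```python
# "%s" is printf-style interpolation (what stdlib logging applies to its args);
# "{}" is str.format-style (what loguru applies). The rendered messages match,
# but passing "{}" templates to a %-style logger leaves the braces in the output.
percent_style = "Enqueued task %s (%s)" % ("a1b2", "index-docs")
brace_style = "Enqueued task {} ({})".format("a1b2", "index-docs")
assert percent_style == brace_style == "Enqueued task a1b2 (index-docs)"
```

So this hunk is consistent only as part of a wholesale switch to a brace-style logger; mixing the two styles in one codebase produces half-formatted log lines.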

src/agents/common/mcp.py

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ async def get_mcp_client(
         logger.info(f"Initialized MCP client with servers: {list(configs.keys())}")
         return client
     except Exception as e:
-        logger.error("Failed to initialize MCP client: %s", e)
+        logger.error("Failed to initialize MCP client: {}", e)
         return None

src/plugins/guard.py

Lines changed: 20 additions & 2 deletions
@@ -4,6 +4,7 @@
 from src.models import select_model
 from src.utils import logger
 
+# region guard_prompt
 PROMPT_TEMPLATE = """
 # 指令
 你是一个内容合规性检测助手。请根据提供的规则集,判断以下内容是否符合合规性要求。
@@ -26,6 +27,7 @@
 输入内容:{content}
 输出内容:"""
+# endregion guard_prompt
 
 
 def load_keywords(file_path: str) -> list[str]:
@@ -52,22 +54,38 @@ def __init__(self, keywords_file: str = "src/config/static/bad_keywords.txt"):
         else:
             self.llm_model = None
 
-    def check(self, text: str) -> bool:
+    async def check(self, text: str) -> bool:
         """
         Checks if the text contains any sensitive keywords.
         Returns True if sensitive content is found, False otherwise.
         True: non-compliant
         False: compliant
         """
+        if keywords_result := await self.check_with_keywords(text):
+            return keywords_result
+
+        if self.llm_model:
+            return await self.check_with_llm(text)
+
+        return False
+
+    async def check_with_keywords(self, text: str) -> bool:
+        """
+        Checks if the text contains any sensitive keywords from the predefined list.
+        Returns True if sensitive content is found, False otherwise.
+        True: non-compliant
+        False: compliant
+        """
         if not text:
             return False
         text_lower = text.lower()
         for keyword in self.keywords:
             if keyword in text_lower:
+                logger.debug(f"Keyword match found: {keyword}")
                 return True
         return False
 
-    def check_with_llm(self, text: str) -> bool:
+    async def check_with_llm(self, text: str) -> bool:
         """
         Checks if the text contains any sensitive keywords using an LLM.
         Returns True if sensitive content is found, False otherwise.
