
Commit 5711244

feat(demohouse/teacher): structure
1 parent 0137315 commit 5711244

68 files changed: +235 additions, -125 deletions

demohouse/teacher/backend/code/prompts.py

Lines changed: 0 additions & 73 deletions
This file was deleted.

demohouse/teacher_avatar/0.png

-1.34 MB
Binary file not shown.

demohouse/teacher_avatar/README.md

Lines changed: 91 additions & 45 deletions
@@ -9,12 +9,12 @@
 [Video link](https://lf3-static.bytednsdoc.com/obj/eden-cn/lm_sth/ljhwZthlaukjlkulzlp/ark/assistant/videos/20250311-163309.mp4)
 ### Try It Directly
 
-![](0.png)
+![](assets/0.png)
 
 ### Process Architecture
 
 
-![](1.png)
+![](assets/1.png)
 
 
 **Scenario-based enablement for minimal development**
@@ -44,38 +44,87 @@
 # Technical Implementation
 
 
-The teacher avatar application consists of two parts: an Android client and a Web frontend
+The teacher avatar application consists of three parts: an Android client, a Web frontend, and a Python backend
 
 - The Android client mainly provides the Web page container, question segmentation, speech synthesis, and speech recognition capabilities
 
 - The Web frontend mainly provides the UI pages for question recognition and the question-analysis pages that interact with the LLM
+
+- The Python backend mainly provides agent APIs with problem-solving, grading, and chat capabilities
 
 
 <br>
 
-This project open-sources the Web frontend code of the teacher avatar application. The Web frontend is built on the React stack and handles LLM conversations (text + images), streaming LLM output, user voice input, and other modules. Developers can refer to this project's LLM API invocation and session management logic and easily port it to other frontend projects to improve development efficiency.
+This project open-sources the Web frontend code and the Python backend code of the teacher avatar application. The Web frontend is built on the React stack and handles LLM conversations (text + images), streaming LLM output, user voice input, and other modules. Developers can refer to this project's LLM API invocation and session management logic and easily port it to other frontend projects to improve development efficiency. The Python backend uses the arkitect SDK to orchestrate the Doubao-1.5-vision-pro-32k and DeepSeek R1 models; developers can refer to this implementation to orchestrate their own models.
 <br>
 
 Note: because running this project depends on some internal implementations, this open-source project cannot currently be compiled and run as a whole.
 
 ### Core Modules
 
 ```Shell
-├── src
-│   ├── api
-│   │   ├── bridge.ts # Native API bridge layer
-│   │   └── llm.ts # LLM conversation implementation
-│   ├── pages
-│   │   └── entry # webview main entry
-│   │       ├── components
-│   │       ├── context # global state management
-│   │       ├── index.css
-│   │       ├── index.tsx
-│   │       ├── routes # component-level page routes
-│   │       │   ├── confirm # confirmation page
-│   │       │   ├── recognition # recognition-in-progress page
-│   │       │   └── recognition-result # question analysis page
-│   │       └── utils.ts
+├── backend # backend code
+│   ├── code
+│   │   ├── main.py # agent orchestration
+│   │   └── prompts.py # prompts
+│   ├── poetry.lock
+│   ├── pyproject.toml
+│   └── run.sh # startup script
+└── frontend # frontend code
+    ├── src
+    │   ├── agent
+    │   │   └── index.ts
+    │   ├── api
+    │   │   ├── bridge.ts # Native API bridge layer
+    │   │   └── llm.ts # LLM conversation implementation
+    │   ├── app.ts
+    │   ├── pages
+    │   │   └── entry # webview main entry
+    │   │       ├── components
+    │   │       ├── context # global state management
+    │   │       ├── index.css
+    │   │       ├── index.tsx
+    │   │       ├── routes # component-level page routes
+    │   │       └── utils.ts
+```
+
+## Backend Model Orchestration
+main.py orchestrates the Doubao-1.5-vision-pro-32k and DeepSeek R1 models: the vision model recognizes the question content, and the DeepSeek model performs the logical reasoning and generates the answer.
+```python
+doubao_vlm = BaseChatLanguageModel(
+    endpoint_id=DOUBAO_VLM_ENDPOINT,
+    messages=[
+        ArkMessage(
+            role="system",
+            content=vlm_prompt,
+        )
+    ]
+    + request.messages,
+    parameters=parameters,
+)
+vlm_usage_chunk = None
+vlm_content = ""
+async for chunk in doubao_vlm.astream():
+    if chunk.usage:
+        vlm_usage_chunk = chunk
+    if len(chunk.choices) > 0 and chunk.choices[0].delta.content:
+        yield chunk
+        vlm_content += chunk.choices[0].delta.content
+deepseek = BaseChatLanguageModel(
+    endpoint_id=DEEPSEEK_R1_ENDPOINT,
+    messages=[
+        ArkMessage(
+            role="user",
+            content=r1_prompt + vlm_content,
+        ),
+    ],
+    parameters=parameters,
+)
+async for chunk in deepseek.astream():
+    if chunk.usage and vlm_usage_chunk:
+        chunk.bot_usage = BotUsage(model_usage=[vlm_usage_chunk.usage, chunk.usage])
+        chunk.usage = merge_usage(chunk.usage, vlm_usage_chunk.usage)
+    yield chunk
 ```
 
 ## Conversation Implementation
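The orchestration code above calls a `merge_usage` helper that this hunk does not show. As a rough illustration only — the real arkitect `CompletionUsage` type is not reproduced here, and the field names below are assumptions modeled on OpenAI-style usage objects — merging the token accounting of the two stages could look like:

```python
from dataclasses import dataclass


@dataclass
class CompletionUsage:
    # Assumed OpenAI-style token accounting fields (hypothetical stand-in
    # for the arkitect type of the same name).
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def merge_usage(usage1: CompletionUsage, usage2: CompletionUsage) -> CompletionUsage:
    # Sum each counter so the final chunk reports the combined cost of
    # the VLM stage and the DeepSeek stage.
    return CompletionUsage(
        prompt_tokens=usage1.prompt_tokens + usage2.prompt_tokens,
        completion_tokens=usage1.completion_tokens + usage2.completion_tokens,
        total_tokens=usage1.total_tokens + usage2.total_tokens,
    )
```

This matches how the final DeepSeek chunk is patched in the snippet: `bot_usage` keeps the per-model breakdown, while `usage` carries the merged totals.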
@@ -282,30 +331,27 @@ export default definePage({
 ## Directory Structure
 ```Bash
 .
-├── applet.config.ts
-├── package.json # project dependency management
-├── pnpm-lock.yaml
-├── postcss.config.cjs
-├── src
-│   ├── api
-│   │   ├── bridge.ts # Native API bridge layer
-│   │   └── llm.ts # LLM conversation implementation
-│   ├── app.ts
-│   ├── components
-│   ├── images
-│   ├── pages
-│   │   └── entry # webview main entry
-│   │       ├── components
-│   │       ├── context # global state management
-│   │       ├── index.css
-│   │       ├── index.tsx
-│   │       ├── routes # component-level page routes
-│   │       │   ├── confirm # confirmation page
-│   │       │   ├── recognition # recognition-in-progress page
-│   │       │   └── recognition-result # question analysis page
-│   │       └── utils.ts
-│   └── types
-│       └── index.ts
-├── tailwind.config.js # tailwind config
-└── tsconfig.json
+├── backend # backend code
+│   ├── code
+│   │   ├── main.py # agent orchestration
+│   │   └── prompts.py # prompts
+│   ├── poetry.lock
+│   ├── pyproject.toml
+│   └── run.sh # startup script
+└── frontend # frontend code
+    ├── src
+    │   ├── agent
+    │   │   └── index.ts
+    │   ├── api
+    │   │   ├── bridge.ts # Native API bridge layer
+    │   │   └── llm.ts # LLM conversation implementation
+    │   ├── app.ts
+    │   ├── pages
+    │   │   └── entry # webview main entry
+    │   │       ├── components
+    │   │       ├── context # global state management
+    │   │       ├── index.css
+    │   │       ├── index.tsx
+    │   │       ├── routes # component-level page routes
+    │   │       └── utils.ts
 ```

demohouse/teacher/backend/code/main.py renamed to demohouse/teacher_avatar/backend/code/main.py

Lines changed: 35 additions & 5 deletions
@@ -37,10 +37,13 @@
 from arkitect.telemetry.trace import task
 from prompts import (
     VLM_PROMPT_SOLVE,
+    VLM_PROMPT_EXTRACT_QUESTION_SOLVE,
     DEEPSEEK_R1_PROMPT_SOLVE,
     VLM_PROMPT_CORRECT,
+    VLM_PROMPT_EXTRACT_QUESTION_CORRECT,
     DEEPSEEK_R1_PROMPT_CORRECT,
     DEEPSEEK_R1_PROMPT_CHAT,
+    VLM_INTENTION,
 )
 
 logger = logging.getLogger(__name__)
@@ -70,6 +73,23 @@ def merge_usage(usage1: CompletionUsage, usage2: CompletionUsage) -> CompletionUsage:
     return usage
 
 
+async def intention(request: ArkChatRequest, parameters: ArkChatParameters) -> bool:
+    doubao_vlm = BaseChatLanguageModel(
+        endpoint_id=DOUBAO_VLM_ENDPOINT,
+        messages=[
+            ArkMessage(
+                role="system",
+                content=VLM_INTENTION,
+            )
+        ]
+        + request.messages,
+        parameters=parameters,
+    )
+    resp = await doubao_vlm.arun()
+    # if the input question does not contain an image, DeepSeek R1 can be used to solve the problem
+    return resp.choices[0].message.content == "否"
+
+
 @task()
 async def default_model_calling(
     request: ArkChatRequest,
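The new `intention` gate asks the VLM a yes/no question and treats the answer "否" ("no", i.e. no image in the input) as permission to route through DeepSeek R1. The prompt-selection logic this commit builds on top of that flag can be summarized as a pure function; the sketch below mirrors the diff's branching with plain Python, using placeholder prompt strings (the real values live in prompts.py, and `select_prompts` is a name introduced here for illustration, not a function in the source):

```python
from typing import Optional, Tuple

# Placeholder prompt constants; the real prompt texts are defined in prompts.py.
VLM_PROMPT_SOLVE = "vlm-solve"
VLM_PROMPT_EXTRACT_QUESTION_SOLVE = "vlm-extract-solve"
VLM_PROMPT_CORRECT = "vlm-correct"
VLM_PROMPT_EXTRACT_QUESTION_CORRECT = "vlm-extract-correct"
DEEPSEEK_R1_PROMPT_SOLVE = "r1-solve"
DEEPSEEK_R1_PROMPT_CORRECT = "r1-correct"


def select_prompts(mode: Optional[str], use_ds: bool) -> Tuple[str, str]:
    """Return (vlm_prompt, r1_prompt); an empty r1_prompt means DeepSeek is skipped."""
    if mode == "correct" and use_ds:
        return VLM_PROMPT_EXTRACT_QUESTION_CORRECT, DEEPSEEK_R1_PROMPT_CORRECT
    if mode == "solve" and use_ds:
        return VLM_PROMPT_EXTRACT_QUESTION_SOLVE, DEEPSEEK_R1_PROMPT_SOLVE
    if mode == "correct":
        return VLM_PROMPT_CORRECT, ""
    # Default: solve with the VLM alone.
    return VLM_PROMPT_SOLVE, ""
```

When `use_ds` is true the VLM only extracts the question and DeepSeek does the reasoning; otherwise the VLM prompt carries the full solve/correct instructions.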
@@ -90,11 +110,21 @@ async def default_model_calling(
         )
         async for chunk in deepseek.astream():
             yield chunk
-        return
+
+    r1_prompt, vlm_prompt = "", ""
+    use_ds = await intention(request, parameters)
+    if request.metadata and request.metadata.get("mode") == "correct" and use_ds:
+        vlm_prompt, r1_prompt = VLM_PROMPT_EXTRACT_QUESTION_CORRECT, DEEPSEEK_R1_PROMPT_CORRECT
+    elif request.metadata and request.metadata.get("mode") == "solve" and use_ds:
+        vlm_prompt, r1_prompt = VLM_PROMPT_EXTRACT_QUESTION_SOLVE, DEEPSEEK_R1_PROMPT_SOLVE
     elif request.metadata and request.metadata.get("mode") == "correct":
-        vlm_prompt, r1_prompt = VLM_PROMPT_CORRECT, DEEPSEEK_R1_PROMPT_CORRECT
+        vlm_prompt = VLM_PROMPT_CORRECT
     else:
-        vlm_prompt, r1_prompt = VLM_PROMPT_SOLVE, DEEPSEEK_R1_PROMPT_SOLVE
+        vlm_prompt = VLM_PROMPT_SOLVE
+
+    # for math problems set temperature to zero https://api-docs.deepseek.com/quick_start/parameter_settings
+    parameters.temperature = 0
+
     doubao_vlm = BaseChatLanguageModel(
         endpoint_id=DOUBAO_VLM_ENDPOINT,
         messages=[
@@ -114,8 +144,8 @@ async def default_model_calling(
         if len(chunk.choices) > 0 and chunk.choices[0].delta.content:
             yield chunk
             vlm_content += chunk.choices[0].delta.content
-    # for math problems set temperature to zero https://api-docs.deepseek.com/quick_start/parameter_settings
-    parameters.temperature = 0
+    if not use_ds:
+        return
     deepseek = BaseChatLanguageModel(
         endpoint_id=DEEPSEEK_R1_ENDPOINT,
         messages=[
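The pattern running through these hunks — stream the first model's chunks to the caller while accumulating its text, then feed that text to a second model and stream its chunks too — can be sketched with plain async generators. This is a toy illustration of the cascade, not the arkitect API; `fake_stream` and `cascade` are names invented here:

```python
import asyncio
from typing import AsyncIterator


async def fake_stream(text: str) -> AsyncIterator[str]:
    # Stand-in for a model's streaming response: yields one token at a time.
    for token in text.split():
        await asyncio.sleep(0)  # yield control, as a network read would
        yield token + " "


async def cascade(question: str) -> AsyncIterator[str]:
    # Stage 1: the "VLM" recognizes the question; its chunks are both
    # forwarded to the caller and accumulated for stage 2.
    recognized = ""
    async for chunk in fake_stream("recognized: " + question):
        recognized += chunk
        yield chunk
    # Stage 2: the "reasoning model" consumes stage 1's full output.
    async for chunk in fake_stream("answer for " + recognized.strip()):
        yield chunk


async def main() -> str:
    out = ""
    async for chunk in cascade("2 + 2"):
        out += chunk
    return out


print(asyncio.run(main()))
```

The `if not use_ds: return` added in the last hunk corresponds to stopping after stage 1 when the intention check decides the reasoning model is not needed.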
