---
title: "WrenAI Local LLM Usage Guide"
date: 2025-03-16T17:28:10+08:00
draft: false
tags:
  - wrenAI
  - llm
---

> Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, and BI.

WrenAI is an open-source Text-to-SQL tool: you import your database schema, then ask questions in natural language and it generates the corresponding SQL.

![](wren_workflow.png)

For security reasons, we deploy it with a locally hosted LLM.
### Deploying Ollama

Installation reference: https://hub.docker.com/r/ollama/ollama

```
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update

sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

Pull the corresponding models:

```
docker exec -it ollama ollama pull nomic-embed-text:latest
docker exec -it ollama ollama pull phi4:14b
```

After deployment, open port 11434 in your security group to allow access.
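
Before wiring WrenAI to Ollama, it can help to confirm that port 11434 is actually reachable from the machine running WrenAI. A minimal sketch using only the Python standard library (the demo below binds a throwaway local listener so it is self-contained; in practice substitute your Ollama host and port 11434):

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demo against a locally bound listener; in practice call e.g.
# port_open("<deployment-machine-IP>", 11434) from the WrenAI host.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
_, demo_port = srv.getsockname()
print(port_open("127.0.0.1", demo_port))  # True
srv.close()
```

If this returns False for your Ollama host, recheck the security group rule before debugging WrenAI itself.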

### Deploying WrenAI

Official documentation: https://docs.getwren.ai/oss/installation/custom_llm

Create a local configuration directory:

```
mkdir -p ~/.wrenai
```

In this directory, create a .env file with the following content:

```
COMPOSE_PROJECT_NAME=wrenai
PLATFORM=linux/amd64

PROJECT_DIR=/root/.wrenai

# service port
WREN_ENGINE_PORT=8080
WREN_ENGINE_SQL_PORT=7432
WREN_AI_SERVICE_PORT=5555
WREN_UI_PORT=3000
IBIS_SERVER_PORT=8000
WREN_UI_ENDPOINT=http://wren-ui:${WREN_UI_PORT}

LLM_PROVIDER=litellm_llm
# custom LLM model
GENERATION_MODEL=phi4:14b
LLM_OLLAMA_URL=http://<deployment-machine-IP>:11434
EMBEDDER_OLLAMA_URL=http://<deployment-machine-IP>:11434

OPENAI_API_KEY=sk-*****

EMBEDDER_PROVIDER=litellm_embedder
# embedding model
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_MODEL_DIMENSION=768

# ai service settings
QDRANT_HOST=qdrant
SHOULD_FORCE_DEPLOY=1

# vendor keys
LLM_OPENAI_API_KEY=
EMBEDDER_OPENAI_API_KEY=
LLM_AZURE_OPENAI_API_KEY=
EMBEDDER_AZURE_OPENAI_API_KEY=
QDRANT_API_KEY=

# version
# CHANGE THIS TO THE LATEST VERSION
WREN_PRODUCT_VERSION=0.15.3
WREN_ENGINE_VERSION=0.13.1
WREN_AI_SERVICE_VERSION=0.15.9
IBIS_SERVER_VERSION=0.13.1
WREN_UI_VERSION=0.20.1
WREN_BOOTSTRAP_VERSION=0.1.5

# user id (uuid v4)
USER_UUID=

# for other services
POSTHOG_API_KEY=phc_nhF32aj4xHXOZb0oqr2cn4Oy9uiWzz6CCP4KZmRq9aE
POSTHOG_HOST=https://app.posthog.com
TELEMETRY_ENABLED=true
# this is for telemetry to know the model; the ai-service might be able to provide an endpoint to get this information
#GENERATION_MODEL=gpt-4o-mini
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=

# the port exposed to the host
# OPTIONAL: change the port if you have a conflict
HOST_PORT=3000
AI_SERVICE_FORWARD_PORT=5555

# Wren UI
EXPERIMENTAL_ENGINE_RUST_VERSION=false
```
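
Note that .env files have no `//` comment syntax: everything after `=` is taken as part of the value, so comments belong on their own line starting with `#`. As an illustration only (this is not WrenAI's actual loader), a minimal parser shows how these KEY=VALUE pairs are read:

```python
def parse_env(text: str) -> dict:
    """Minimal .env reader: KEY=VALUE lines; blanks and '#' comments skipped."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """\
# embedding model
EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_MODEL_DIMENSION=768
GENERATION_MODEL=phi4:14b
"""
cfg = parse_env(sample)
print(cfg["GENERATION_MODEL"])                 # phi4:14b
print(int(cfg["EMBEDDING_MODEL_DIMENSION"]))   # 768
```

Keep EMBEDDING_MODEL_DIMENSION (768 here) consistent with the `embedding_model_dim` value in config.yaml below; nomic-embed-text produces 768-dimensional vectors.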

In the same directory, create a config.yaml file with the following content:

```
# you should rename this file to config.yaml and put it in ~/.wrenai
# please pay attention to the comments starting with # and adjust the config accordingly

type: llm
provider: litellm_llm
timeout: 600
models:
- api_base: http://<deployment-machine-IP>:11434/v1  # change this to your ollama host; api_base should be <ollama_url>/v1
  model: openai/phi4:14b  # openai/<ollama_model_name>
  kwargs:
    n: 1
    temperature: 0

---
type: embedder
provider: litellm_embedder
models:
- model: openai/nomic-embed-text  # put your ollama embedder model name here
  api_base: http://<deployment-machine-IP>:11434/v1  # change this to your ollama host; api_base should be <ollama_url>/v1
  timeout: 120  # increase this timeout when running in CPU mode

---
type: engine
provider: wren_ui
endpoint: http://wren-ui:3000

---
type: document_store
provider: qdrant
location: http://qdrant:6333
embedding_model_dim: 768  # put your embedding model dimension here
timeout: 120
recreate_index: false

---
# the format of llm and embedder should be <provider>.<model_name> such as litellm_llm.gpt-4o-2024-08-06
# the pipes may not be the latest version; please refer to the latest version: https://raw.githubusercontent.com/canner/WrenAI/<WRENAI_VERSION_NUMBER>/docker/config.example.yaml
type: pipeline
pipes:
- name: db_schema_indexing
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: historical_question_indexing
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: table_description_indexing
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: db_schema_retrieval
  llm: litellm_llm.openai/phi4:14b
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: historical_question_retrieval
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: sql_generation
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: sql_correction
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: followup_sql_generation
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: sql_summary
  llm: litellm_llm.openai/phi4:14b
- name: sql_answer
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: sql_breakdown
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: sql_expansion
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: sql_explanation
  llm: litellm_llm.openai/phi4:14b
- name: sql_regeneration
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: semantics_description
  llm: litellm_llm.openai/phi4:14b
- name: relationship_recommendation
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: question_recommendation
  llm: litellm_llm.openai/phi4:14b
- name: question_recommendation_db_schema_retrieval
  llm: litellm_llm.openai/phi4:14b
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: question_recommendation_sql_generation
  llm: litellm_llm.openai/phi4:14b
  engine: wren_ui
- name: chart_generation
  llm: litellm_llm.openai/phi4:14b
- name: chart_adjustment
  llm: litellm_llm.openai/phi4:14b
- name: intent_classification
  llm: litellm_llm.openai/phi4:14b
  embedder: litellm_embedder.openai/nomic-embed-text
  document_store: qdrant
- name: data_assistance
  llm: litellm_llm.openai/phi4:14b
- name: sql_pairs_indexing
  document_store: qdrant
  embedder: litellm_embedder.openai/nomic-embed-text
- name: sql_pairs_deletion
  document_store: qdrant
  embedder: litellm_embedder.openai/nomic-embed-text
- name: sql_pairs_retrieval
  document_store: qdrant
  embedder: litellm_embedder.openai/nomic-embed-text
  llm: litellm_llm.openai/phi4:14b
- name: preprocess_sql_data
  llm: litellm_llm.openai/phi4:14b
- name: sql_executor
  engine: wren_ui
- name: sql_question_generation
  llm: litellm_llm.openai/phi4:14b
- name: sql_generation_reasoning
  llm: litellm_llm.openai/phi4:14b

---
settings:
  column_indexing_batch_size: 50
  table_retrieval_size: 10
  table_column_retrieval_size: 100
  allow_using_db_schemas_without_pruning: false
  query_cache_maxsize: 1000
  query_cache_ttl: 3600
  langfuse_host: https://cloud.langfuse.com
  langfuse_enable: true
  logging_level: DEBUG
  development: true
```
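
The `<provider>.<model_name>` format noted in the comments is easy to get wrong when swapping in a different Ollama model. A small hypothetical checker (the function and provider set are assumptions for illustration, not part of WrenAI) for the references used above:

```python
# Providers declared in the llm/embedder sections of config.yaml above.
KNOWN_PROVIDERS = {"litellm_llm", "litellm_embedder"}

def valid_ref(ref: str) -> bool:
    """Check a pipeline reference of the form <provider>.<model_name>."""
    provider, sep, model = ref.partition(".")
    return bool(sep) and provider in KNOWN_PROVIDERS and bool(model)

print(valid_ref("litellm_llm.openai/phi4:14b"))               # True
print(valid_ref("litellm_embedder.openai/nomic-embed-text"))  # True
print(valid_ref("openai/phi4:14b"))                           # False: provider prefix missing
```

If you swap phi4:14b for another Ollama model, update every `litellm_llm.openai/...` occurrence in the pipes list, not just the `models:` section.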

Download the launcher and run the installer:
https://docs.getwren.ai/oss/installation#using-wren-ai-launcher

```
curl -L https://github.com/Canner/WrenAI/releases/latest/download/wren-launcher-linux.tar.gz | tar -xz && ./wren-launcher-linux
```

Choose the Custom mode and confirm; the deployment completes.

![](deploy_wrenai.png)

Remember to open port 3000 in your firewall.

Once deployment finishes, access the WrenAI service in a browser at http://<deployment-machine-IP>:3000.