---
slug: echokit-30-days-day-13-local-llm
title: "Day 13 — Running an LLM Locally for EchoKit | The First 30 Days with EchoKit"
tags: [echokit30days]
---

Over the last few days, we explored several cloud-based LLM providers — OpenAI, OpenRouter, and Grok. Each offers unique advantages, but today we’re doing something completely different: we’re running the open-source **Qwen3-4B** model *locally* and using it as EchoKit’s LLM provider.

There’s no shortage of great open-source LLMs — Llama, Mistral, DeepSeek, Qwen, and many others — and you can pick whichever model best matches your use case.

Likewise, you can run a local model in several different ways. For today’s walkthrough, though, we’ll focus on a clean, lightweight, and portable setup:

**Qwen3-4B (GGUF) running inside a WASM LLM server powered by WasmEdge.**

This setup exposes an OpenAI-compatible API, which makes integrating it with EchoKit simple and seamless.
## Run the Qwen3-4B Model Locally

### Step 1 — Install WasmEdge

WasmEdge is a lightweight, secure WebAssembly runtime capable of running LLM workloads through the LlamaEdge extension.

Install it:

```bash
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s
```

Verify the installation:

```bash
wasmedge --version
```

You should see a version number printed.
### Step 2 — Download Qwen3-4B in GGUF Format

We’ll use a quantized version of Qwen3-4B, which keeps memory usage manageable while delivering strong performance.

```bash
curl -Lo Qwen3-4B-Q5_K_M.gguf https://huggingface.co/second-state/Qwen3-4B-GGUF/resolve/main/Qwen3-4B-Q5_K_M.gguf
```
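Downloads of this size sometimes fail partway, or `curl` can save an HTML error page under the model's filename. One cheap sanity check (a sketch, not part of the official instructions): every GGUF file begins with the 4-byte ASCII magic `GGUF`, so you can verify the header before spending time on a failed server start.

```shell
# Check whether a file starts with the GGUF magic bytes ("GGUF").
check_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

if check_gguf Qwen3-4B-Q5_K_M.gguf; then
  echo "looks like a valid GGUF file"
else
  echo "not a GGUF file; try re-downloading"
fi
```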

### Step 3 — Download the LlamaEdge API Server (WASM)

This small `.wasm` application loads GGUF models and exposes an **OpenAI-compatible chat API**, which EchoKit can connect to directly.

```bash
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm
```
### Step 4 — Start the Local LLM Server

Now let’s launch the Qwen3-4B model locally and expose the `/v1/chat/completions` endpoint:

```bash
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Qwen3-4B-Q5_K_M.gguf \
  llama-api-server.wasm \
  --model-name Qwen3-4B \
  --prompt-template qwen3-no-think \
  --ctx-size 4096
```

If everything starts up correctly, the server will be available at:

```
http://localhost:8080
```
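Before pointing EchoKit at it, you can smoke-test the endpoint with a plain `curl` request. Because the API is OpenAI-compatible, the body follows the standard chat-completions schema (the prompt text here is just an example):

```shell
# Send one chat request to the local server started in Step 4.
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "Qwen3-4B",
        "messages": [
          {"role": "user", "content": "Reply with one short sentence."}
        ]
      }'
```

A JSON response containing a `choices` array means the model loaded and the endpoint is ready for EchoKit.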

## Connect EchoKit to Your Local LLM

Open your EchoKit server’s `config.toml` and update the LLM settings:

```toml
[llm]
llm_chat_url = "http://localhost:8080/v1/chat/completions"
api_key = "N/A"
model = "Qwen3-4B"
history = 5
```
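The `history = 5` setting bounds how many prior conversation turns are replayed to the model on each request, which keeps the payload inside the 4096-token context window. As a rough illustration of that windowing idea (a hypothetical sketch, not EchoKit's actual code), trimming a turn-per-line transcript to the last N turns looks like:

```shell
# Keep only the newest HISTORY turns of a transcript, dropping the oldest,
# the way a bounded history window forgets early conversation turns.
HISTORY=5
printf 'turn %s\n' 0 1 2 3 4 5 6 7 > /tmp/transcript.txt
tail -n "$HISTORY" /tmp/transcript.txt
```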

Save the file and restart your EchoKit server.

Next, pair your EchoKit device and connect it to your updated server.

Now try speaking to your device:

> “EchoKit, what do you think about running local models?”

Watch your terminal — you should see EchoKit sending requests to your local endpoint.

Your EchoKit is now fully powered by a local Qwen3-4B model.

Today we reached a major milestone:

**EchoKit can now run entirely on your machine, with no external LLM provider required.**

---

This tutorial is only one small piece of what EchoKit can do.

If you want to build your own voice AI device, try different LLMs, or run fully local models like Qwen — EchoKit gives you everything you need in one open-source kit.

Want to explore more or share what you’ve built?

* Join the **[EchoKit Discord](https://discord.gg/Fwe3zsT5g3)**
* Show us your custom models, latency tests, and experiments — the community is growing fast.

Ready to get your own EchoKit?

* **EchoKit Box →** [https://echokit.dev/echokit_box.html](https://echokit.dev/echokit_box.html)
* **EchoKit DIY Kit →** [https://echokit.dev/echokit_diy.html](https://echokit.dev/echokit_diy.html)

**Start building your own voice AI agent today.**
