Local LLMs running inside the browser were nearly impossible just a year ago. But thanks to new technologies like WebLLM and WebGPU, you can now load a full language model into memory, run it on your device, and have a real-time conversation, all without a server.
In this guide, we'll build a local chatbot that runs entirely in the browser. No backend. No API keys. By the end, you should have a good understanding of [WebLLM](https://webllm.mlc.ai/) and [WebGPU](https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API), and will have built an app that looks and functions like this:
Here's what WebGPU does for us:
- **Performance**: Runs faster than JavaScript or even WebAssembly for these workloads
- **GPU-first**: Designed from the ground up for compute, not just rendering
- **Accessibility**: Available across different browsers, though support varies by platform. As of 2025:
  - **Chrome/Edge**: Fully supported on Windows, Mac, and ChromeOS since version 113. On Linux, it requires enabling the `chrome://flags/#enable-unsafe-webgpu` flag
  - **Firefox**: Available in Nightly builds by default, with stable release tentatively planned for Firefox 141
  - **Safari**: Available in Safari Technology Preview, with support in iOS 18 and visionOS 2 betas via Feature Flags
  - **Android**: Chrome 121+ supports WebGPU on Android
For production applications, you should include proper WebGPU feature detection and provide fallbacks for unsupported browsers.
Together, WebLLM and WebGPU allow us to do something powerful: load a quantized language model directly in the browser and have real-time chat without any backend server.
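To make that concrete, here is a minimal sketch of the WebLLM flow, using the library's `CreateMLCEngine` entry point and its OpenAI-style chat API. The model ID shown is one example from WebLLM's prebuilt list; swap in whichever model you load in the tutorial. The `askOnce` helper is a name introduced here for illustration.

```javascript
// Sketch: create an engine in the browser (downloads model weights on first run).
// import { CreateMLCEngine } from "@mlc-ai/web-llm";
// const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
//   initProgressCallback: (report) => console.log(report.text),
// });

// Send one user message and return the assistant's reply text.
// Works with any engine exposing WebLLM's OpenAI-compatible chat API.
async function askOnce(engine, userText) {
  const response = await engine.chat.completions.create({
    messages: [{ role: "user", content: userText }],
  });
  return response.choices[0].message.content;
}
```

Because the engine mirrors the OpenAI chat-completions shape, code written against a hosted API often ports over with little more than a change of client.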
In the HTML file, we've created a chat interface with controls for model selection.
### Model selection
Notice that in the `div` with class `controls`, we have a `select` element for model selection and a `button` for loading the model. Here are the specifications for each model:
This check is crucial because WebGPU availability varies significantly across browsers and platforms. The code will gracefully fail if WebGPU isn't available, allowing you to show appropriate fallback content to users.
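A minimal sketch of such a check, using the standard `navigator.gpu` API (the `hasWebGPU` name and the default-parameter shape are choices made here for illustration, so the function can also be exercised outside a browser):

```javascript
// Feature-detect WebGPU before trying to load a model.
// `gpu` defaults to the browser's `navigator.gpu`.
async function hasWebGPU(gpu = globalThis.navigator?.gpu) {
  if (!gpu) return false; // API not exposed at all
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter exists.
    const adapter = await gpu.requestAdapter();
    return adapter !== null;
  } catch {
    return false;
  }
}

// Usage sketch: gate the "Load model" flow on the result.
// if (!(await hasWebGPU())) showFallbackMessage();
```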