feat(gpt-bot): Replace with SmolLM2 family and fix model switching
- Replace Qwen/Llama models with SmolLM2 family (135M, 360M, 1.7B)
- Fix model switching bug by calling dispose() before loading new model
- Add RAM detection (navigator.deviceMemory) to auto-select best model
- Dynamically populate model selector dropdown
- Update architecture specs and documentation for SmolLM2
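The RAM-detection bullet above can be sketched as a small helper; a minimal sketch, assuming Chrome's `navigator.deviceMemory` API (approximate RAM in GiB, `undefined` in non-Chromium browsers). The `pickModelSize` name and the threshold values are illustrative, not necessarily the exact ones in this commit:

```javascript
// Map approximate device RAM (GiB) to a SmolLM2 variant.
// navigator.deviceMemory is Chromium-only, so an undefined reading
// defaults to 0 and selects the smallest model.
function pickModelSize(deviceMemoryGiB = 0) {
  if (deviceMemoryGiB >= 8) return "1.7B";
  if (deviceMemoryGiB >= 4) return "360M";
  return "135M";
}

// In the browser:
//   const size = pickModelSize(navigator.deviceMemory);
```

Defaulting to the smallest model keeps the demo usable on devices that either have little RAM or simply don't expose the API.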
**Approach:** Decoder-only transformer, optimized for browser/edge deployment
**How it works:**
- Transformer architecture (self-attention)
- Pre-trained on vast text corpora
- Fine-tuned for instruction following
- Few-shot learning capabilities
+ Auto-selects model size based on your device RAM
+ Available in 135M, 360M, and 1.7B parameter variants
+ Pre-trained on FineWeb-Edu, code, and synthetic data
+ Instruction-tuned for helpful conversations
+ Runs entirely in your browser via WebGPU or WASM
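The WebGPU/WASM bullet above amounts to a try-then-fall-back loader; a hedged sketch, assuming Transformers.js v3's `pipeline(task, model, { device })` option. The injected `pipelineFn` parameter is a hypothetical test seam standing in for the library's `pipeline()` function:

```javascript
// Try the WebGPU backend first; fall back to WASM if WebGPU is
// unavailable or fails to initialize.
async function createGenerator(modelId, pipelineFn) {
  try {
    return await pipelineFn("text-generation", modelId, { device: "webgpu" });
  } catch {
    // No WebGPU support (or init failed): use the portable WASM backend.
    return await pipelineFn("text-generation", modelId, { device: "wasm" });
  }
}
```

In the app, `pipelineFn` would be the `pipeline` export from `@huggingface/transformers`; injecting it lets the fallback logic run without downloading a model.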
**Famous for:**
- ChatGPT, GPT-4, Claude, Gemini
- Unprecedented fluency and coherence
- Reasoning and problem-solving
- Multi-turn conversations
+ Best-in-class quality for browser-sized models
+ Native Transformers.js support (ONNX bundled)
+ Open-source (Apache 2.0)
**Capabilities:**
- Context understanding
- Knowledge integration
- Creative generation
- Task completion
+ Multi-turn conversation with context preservation
+ General knowledge and reasoning
+ Code generation and explanation
+ Helpful, instruction-following responses
## Features
@@ -237,7 +237,7 @@ An interactive journey through 60 years of conversational AI development, from E
- Initial model loading time in browser
**What Makes This Demo Special:**
- This demo runs **real neural models** (BlenderBot 90M, Qwen2.5 0.5B) directly in your browser using Transformers.js. You'll experience genuine neural behavior - not simulations - allowing you to see the clear evolution from rule-based to learned conversation.
+ This demo runs **real neural models** (BlenderBot 90M, SmolLM2 135M-1.7B) directly in your browser using Transformers.js. The SmolLM2 model auto-selects based on your device RAM. You'll experience genuine neural behavior - not simulations - allowing you to see the clear evolution from rule-based to learned conversation.
## Historical Timeline
@@ -263,7 +263,7 @@ This demo runs **real neural models** (BlenderBot 90M, Qwen2.5 0.5B) directly in
- **PARRY**: State machine with emotional variables
- **A.L.I.C.E.**: ~40,000 AIML patterns
- **BlenderBot Small**: 90 million parameters (real neural model)
- **Qwen2.5 0.5B**: 500 million parameters (real neural model)
@@ -340,11 +340,13 @@ AIML-inspired pattern matching with improved context handling over ELIZA.
- Loads and runs entirely in your browser (may take 30-60 seconds initially)
- Fallback to DialoGPT-small if BlenderBot fails to load

- ### GPT / Qwen2.5
- Uses Transformers.js with Qwen2.5-0.5B-Instruct (500M parameters) for actual neural text generation in the browser. Falls back to Llama-3.2-1B-Instruct if Qwen fails to load. Demonstrates modern transformer capabilities:
+ ### SmolLM2
+ Uses Transformers.js with HuggingFace's SmolLM2 family (135M, 360M, 1.7B parameters) for actual neural text generation in the browser. Key features:
+ **Auto-selects model** based on device RAM (navigator.deviceMemory API)
+ **Proper model switching** with dispose() to prevent memory leaks
- Instruction-tuned for natural conversations
- Conversation history preserved across turns
- Runs entirely in-browser via WASM or WebGPU
+ Runs entirely in-browser via WebGPU (preferred) or WASM fallback
- May take 30-60 seconds to download on first load
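The dispose-before-load fix described above can be sketched as a tiny loader that keeps at most one pipeline alive; a minimal sketch, assuming a Transformers.js-style pipeline object exposing `dispose()`. The `makeModelLoader` name and the injected `createPipeline` factory are hypothetical, added so the switching logic is self-contained:

```javascript
// Keep exactly one model pipeline alive: dispose the previous one
// before loading the next, so switching models doesn't leak
// GPU/WASM memory.
function makeModelLoader(createPipeline) {
  let current = null;
  return async function load(modelId) {
    if (current) {
      await current.dispose(); // free the old model's buffers first
      current = null;
    }
    current = await createPipeline(modelId);
    return current;
  };
}
```

In the app, `createPipeline` would wrap Transformers.js's `pipeline("text-generation", modelId, ...)`; injecting it keeps the switching logic independent of model downloads.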