Commit 15380b4 (parent 4afdb4d)

docs: Update session notes with Qwen3 integration details

1 file changed: notes/2026-01-03-neural-model-demo-improvements.md (+50, -0)

@@ -83,3 +83,53 @@ The demo should now:
- Consider updating to `@huggingface/transformers` v3+ for better SmolLM support
- First-time model loading can take 30-60 seconds depending on connection
- Models are cached in browser after first load

---

## Session Update: Qwen3-1.7B Integration (Evening)

**Commit**: 4afdb4d (pushed to main)

### Changes Made

#### 1. gpt-bot.js (Completely Rewritten)

- Replaced LaMini-T5 with **Qwen3-1.7B** (onnx-community/Qwen3-1.7B-ONNX)
- Uses `@huggingface/transformers@3.5.1` for latest ONNX support
- WebGPU acceleration with q4f16 quantization
- Fallback to Qwen3-0.6B if 1.7B unavailable
- Chat messages format with thinking budget control (`enable_thinking: false`); see the sketch below

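The loading path might look roughly like this. It is a minimal sketch based on the bullets above: the 1.7B model ID and the q4f16/WebGPU options come from the notes, while the 0.6B repo ID, the function names, and how `enable_thinking` is threaded through are assumptions rather than the actual `gpt-bot.js` code.

```js
import { pipeline } from '@huggingface/transformers';

// Load Qwen3-1.7B on WebGPU with q4f16 weights; fall back to the smaller
// 0.6B variant if the 1.7B model cannot be loaded.
async function loadQwen3() {
  const options = { device: 'webgpu', dtype: 'q4f16' };
  try {
    return await pipeline('text-generation', 'onnx-community/Qwen3-1.7B-ONNX', options);
  } catch (err) {
    console.warn('Qwen3-1.7B unavailable, falling back to Qwen3-0.6B:', err);
    // Assumed repo ID for the fallback model.
    return await pipeline('text-generation', 'onnx-community/Qwen3-0.6B-ONNX', options);
  }
}

// Generate a reply from chat-style messages. Disabling Qwen3's thinking mode
// (`enable_thinking: false`) happens via the chat template in the real code;
// its exact wiring is not shown here.
async function reply(generator, userText) {
  const messages = [{ role: 'user', content: userText }];
  const output = await generator(messages, { max_new_tokens: 256, do_sample: false });
  // With chat input, generated_text is the conversation including the new
  // assistant turn; return the last message's content.
  return output[0].generated_text.at(-1).content;
}
```
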
#### 2. index.html

- Updated timeline label: "2025 Qwen3"
- Updated info cards with Qwen3 details and benchmarks
- Updated chat placeholder and loading text
- Updated architecture diagram (decoder-only with GQA, RoPE, etc.)
- Added model specs table with benchmark scores

#### 3. timeline-app.js

- Line 107: placeholder text "Talk to Qwen3..."
- Line 754: botLabels `gpt: 'Qwen3 (2025)'`
- Line 853: displayArchitecture `'Qwen3 (2025)': 'Input → Decoder-Only Transformer → Response'` (see the snippet below)

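For context, the touched entries might sit in maps like the following. Only the quoted fragments come from the notes; the surrounding object structure and the placeholder wiring are illustrative assumptions, not the actual timeline-app.js code.

```js
// Assumed shape of the lookup maps in timeline-app.js; only the Qwen3 entries
// are taken from the notes.
const botLabels = {
  gpt: 'Qwen3 (2025)',                                            // line 754 change
};

const displayArchitecture = {
  'Qwen3 (2025)': 'Input → Decoder-Only Transformer → Response',  // line 853 change
};

// Line 107: new chat placeholder text (element ID assumed for illustration).
document.querySelector('#chat-input')
  ?.setAttribute('placeholder', 'Talk to Qwen3...');
```
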
### Model Selection Rationale (Data-Driven)

| Model | MMLU | HumanEval | GSM8K |
|-------|------|-----------|-------|
| Qwen3-1.7B (primary) | 71.2 | 65.8% | 82.3 |
| Qwen3-0.6B (fallback) | 59.4 | 42.1% | N/A |
| Qwen2.5-3B (comparison) | 68.1 | N/A | N/A |

Qwen3-1.7B outperforms the larger Qwen2.5-3B on MMLU despite being smaller.

### Tests Verified

- `npm run test:chatbot` - All tests passing:
  - ELIZA: 12/12
  - PARRY: All passing
  - ALICE: 41,380 patterns loaded, conversations working

### Previous Session Work (Earlier Today)

- Implemented lazy loading for neural models (see the sketch below)
- Fixed version mismatch between @huggingface/transformers and @xenova/transformers
- Removed 300ms artificial delay on responses

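A minimal sketch of the lazy-loading idea, reusing the hypothetical `loadQwen3()` and `reply()` helpers sketched earlier; the actual implementation in the demo may differ.

```js
// Create the pipeline only on the first chat message instead of at page load,
// and reuse the same (possibly still in-flight) load for every later message.
let generatorPromise = null;

function getGenerator() {
  generatorPromise ??= loadQwen3();
  return generatorPromise;
}

async function onUserMessage(text) {
  const generator = await getGenerator();
  return reply(generator, text);
}
```
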
### Status: COMPLETE
