feat: Add SmolLM2 browser-based LLM inference via WebAssembly#2
Merged
Adding CLAUDE.md with task information for AI processing. This file will be removed when the task is complete. Issue: #1
This PR implements a proof of concept for running the SmolLM2-135M language model directly in the browser using WebAssembly, with no server-side processing.

Key features:
- Rust WASM library using Candle ML framework for model inference
- Web Worker for background processing to keep UI responsive
- React chat UI using @chatscope/chat-ui-kit-react
- Local Rust development server with CORS support
- GitHub Pages deployment workflow
- Streaming token generation for real-time responses

Architecture:
- wasm/: Rust WASM bindings for SmolLM2 inference
- web/: React frontend with TypeScript
- server/: Local development server with Axum

Fixes #1

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
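The Web Worker and streaming-token features listed above imply some message protocol between the worker and the chat UI. The PR page does not show that protocol, so the following is only a minimal sketch under assumed names: `WorkerResponse`, `ChatState`, and `applyWorkerMessage` are hypothetical and not taken from the PR's source. It shows one way the UI could fold streamed tokens into its state.

```typescript
// Hypothetical message shapes a worker might post back to the UI thread.
// These names are illustrative assumptions, not the PR's actual protocol.
type WorkerResponse =
  | { type: "ready" }
  | { type: "token"; text: string }
  | { type: "done" }
  | { type: "error"; message: string };

interface ChatState {
  status: "loading" | "idle" | "generating" | "failed";
  reply: string;
  error?: string;
}

const initialState: ChatState = { status: "loading", reply: "" };

// Pure reducer: folds a single worker message into the UI state.
function applyWorkerMessage(state: ChatState, msg: WorkerResponse): ChatState {
  switch (msg.type) {
    case "ready":
      return { ...state, status: "idle" };
    case "token":
      // Append each streamed token so the reply grows as tokens arrive.
      return { ...state, status: "generating", reply: state.reply + msg.text };
    case "done":
      return { ...state, status: "idle" };
    case "error":
      return { ...state, status: "failed", error: msg.message };
  }
}

// Usage: simulate the messages a worker might post while streaming one reply.
const messages: WorkerResponse[] = [
  { type: "ready" },
  { type: "token", text: "Hello" },
  { type: "token", text: ", world" },
  { type: "done" },
];
const finalState = messages.reduce(applyWorkerMessage, initialState);
```

In a real page, a worker's `onmessage` handler would call the reducer with `event.data`. Keeping the reducer pure makes the streaming logic testable without a browser.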
…pace

This fixes the wasm-pack build error in which Cargo treated the package as a member of the parent workspace. An empty [workspace] table explicitly declares this package as its own standalone workspace.
The getrandom 0.3.x crate requires explicit configuration for the wasm32-unknown-unknown target. This adds:
- .cargo/config.toml with rustflags to enable wasm_js backend
- Updated Cargo.toml comments explaining the configuration
Different dependencies require getrandom 0.2.x and 0.3.x, so both versions end up in the build. Added an explicit dependency on getrandom 0.3 with the wasm_js feature to ensure proper WASM compilation.
- Add Cache field to SmolLM2Model struct
- Initialize cache during model loading
- Pass cache to forward() method as required
- Fix dim() error handling with explicit map_err
These features are required for the wasm-bindgen output to be valid. Bulk memory operations are used by the generated WASM code.
wasm-pack doesn't always respect .cargo/config.toml settings. Setting the RUSTFLAGS environment variable directly in the workflow ensures the bulk-memory feature is enabled during compilation.
Rust 1.87 / LLVM 20 generates bulk memory operations by default for wasm32-unknown-unknown targets. This causes wasm-opt to fail validation with the error:

"Bulk memory operations require bulk memory [--enable-bulk-memory]"

Add wasm-pack profile configuration to pass the --enable-bulk-memory and --enable-mutable-globals flags to wasm-opt during the optimization step.
The Candle ML framework uses SIMD operations for optimized tensor computations. Add the --enable-simd flag to wasm-opt and the +simd128 target feature to the Rust compiler flags to properly support these operations in WebAssembly.
Rust 1.87+ / LLVM 20 generates modern WebAssembly features by default:
- nontrapping-float-to-int: For i32.trunc_sat_* saturating conversions
- sign-ext: For sign extension operations
- reference-types: For reference type operations

Add these flags alongside the existing bulk-memory, mutable-globals, and simd flags to pass wasm-opt validation successfully.
TypeScript's strict mode requires an intermediate cast to unknown when converting between types that don't overlap. The WASM module's generated types don't exactly match our SmolLM2Wasm interface.
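A minimal sketch of the double cast described above. Only the interface name `SmolLM2Wasm` comes from the commit message. Its `generate` member, the `GeneratedBinding` class, and the `isSmolLM2Wasm` guard are assumptions made for illustration, not the PR's actual code.

```typescript
// The name SmolLM2Wasm appears in the commit message; its members here are assumed.
interface SmolLM2Wasm {
  generate(prompt: string): string;
}

// Hypothetical stand-in for a generated binding whose declared type lacks `generate`.
class GeneratedBinding {
  free(): void {}
}

// Optional runtime guard, so the cast is not purely a promise to the compiler.
function isSmolLM2Wasm(value: unknown): value is SmolLM2Wasm {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { generate?: unknown }).generate === "function"
  );
}

// At runtime the object does have `generate`, but its static type hides it.
const binding: GeneratedBinding = Object.assign(new GeneratedBinding(), {
  generate: (prompt: string) => prompt.toUpperCase(),
});

// A direct `binding as SmolLM2Wasm` is rejected (TS2352) because the two
// types share no members, so the cast goes through `unknown` first.
const model = binding as unknown as SmolLM2Wasm;

const output = model.generate("hi");
const guarded = isSmolLM2Wasm(binding) ? binding.generate("hi") : null;
```

The cast silences the compiler without checking anything, so a guard like the one sketched here can catch a mismatch between the generated module and the interface at load time rather than at the first call.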
This reverts commit 4f47f7c.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost estimation:
The working session has now ended. Feel free to review and add any feedback on the solution draft.
Summary
This PR implements a proof of concept for running the SmolLM2 language model entirely in the browser, using WebAssembly for ML inference with no server-side processing.
Key Features
Technical Implementation
wasm/ - Rust library compiled to WebAssembly
web/ - React/TypeScript frontend
server/ - Local Axum dev server for testing
Files Changed
wasm/
web/
server/
Test Plan
Manual Testing
./scripts/dev.sh to start local server
Closes #1
🤖 Generated with Claude Code