Releases: mlc-ai/web-llm

v0.2.81

18 Feb 21:43
31e8491


  • Engine and API overhaul: ChatModule refactored into Engine/MLCEngine, with consolidated constructor/reload behavior, multi-model loading, a better worker lifecycle, and concurrency handling
  • OpenAI API: mirrors the chat/completions API, with stateful options, function calling, and embeddings support
  • Conversation templates: unified conversation template schema with support for custom templates
  • Expanded prebuilt model support: added more models (Llama 2/3/3.1/3.2, Mistral variants, Gemma 2, Qwen2/2.5/3, and the Phi family, including vision models)
  • Runtime and caching: WebGPU performance/reliability improvements (more GPU-side kernels, better OOM/deviceLost handling), wasm/prebuilt versioning updates, and IndexedDB caching support
  • XGrammar integration: JSON-schema/grammar-constrained generation and XGrammar structural tags
  • TVM-FFI integration: refactored for compatibility with more recent TVM commits and the TVM FFI
  • Examples: ServiceWorkerEngine and updated Chrome extension demos, new RAG/doc-chat examples, and tool calling via structural tags
  • CI: GitHub Actions for linting and pre-commit hooks
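As a rough sketch of the OpenAI-compatible surface described above, the request below shows the chat/completions message shape plus a JSON-schema `response_format` for grammar-constrained generation; the model ID and schema are illustrative assumptions, not taken from these notes.

```typescript
// Illustrative OpenAI-style chat request for the new engine API.
// The schema and model ID below are assumptions for the sketch.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const messages: ChatMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is WebLLM?" },
];

const request = {
  messages,
  // Constrain output to JSON matching a schema (XGrammar-backed).
  response_format: {
    type: "json_object",
    schema: JSON.stringify({
      type: "object",
      properties: { answer: { type: "string" } },
      required: ["answer"],
    }),
  },
};

// In a WebGPU-capable browser this would then be passed to the engine,
// along the lines of:
//   const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");
//   const reply = await engine.chat.completions.create(request);
```

Because the request mirrors the OpenAI schema, existing OpenAI client code can generally be pointed at the engine with minimal changes.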

Full Changelog: v0.2.0...v0.2.81

v0.2.0

26 May 12:52
8912a4d


  • Major TypeScript overhaul
  • Published as an npm package
  • Web Worker support