
v3.0.0-beta.15

Pre-release
@github-actions github-actions released this 04 Apr 20:52
6267778

3.0.0-beta.15 (2024-04-04)

Features

  • automatically adapt to current free VRAM state (#182) (35e6f50)
  • inspect gguf command (#182) (35e6f50)
  • inspect measure command (#182) (35e6f50)
  • readGgufFileInfo function (#182) (35e6f50)
  • GGUF file metadata info on LlamaModel (#182) (35e6f50)
  • JinjaTemplateChatWrapper (#182) (35e6f50)
  • use the tokenizer.chat_template header from the gguf file when available: use it to find a better specialized chat wrapper, or fall back to JinjaTemplateChatWrapper with it (#182) (35e6f50)
  • simplify generation CLI commands: chat, complete, infill (#182) (35e6f50)
  • Windows on Arm prebuilt binary (#181) (f3b7f81)
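To illustrate the kind of data the new readGgufFileInfo function and inspect gguf command surface, here is a minimal, self-contained sketch of parsing a GGUF file header (magic, version, tensor count, metadata key/value count). This is an illustration of the GGUF format only, not node-llama-cpp's actual implementation; the parseGgufHeader function and the synthetic header buffer are made up for this example.

```typescript
// Minimal sketch: parse the fixed-size GGUF header.
// Layout: magic "GGUF" (4 bytes) | version (u32 LE) |
//         tensor_count (u64 LE) | metadata_kv_count (u64 LE)
interface GgufHeaderInfo {
    magic: string;
    version: number;
    tensorCount: bigint;
    metadataKvCount: bigint;
}

function parseGgufHeader(buf: Buffer): GgufHeaderInfo {
    const magic = buf.toString("ascii", 0, 4);
    if (magic !== "GGUF")
        throw new Error("Not a GGUF file");

    const version = buf.readUInt32LE(4);             // format version, e.g. 3
    const tensorCount = buf.readBigUInt64LE(8);      // number of tensors in the file
    const metadataKvCount = buf.readBigUInt64LE(16); // number of metadata key/value pairs

    return {magic, version, tensorCount, metadataKvCount};
}

// Build a tiny synthetic header to demonstrate parsing:
const header = Buffer.alloc(24);
header.write("GGUF", 0, "ascii");
header.writeUInt32LE(3, 4);
header.writeBigUInt64LE(291n, 8);
header.writeBigUInt64LE(24n, 16);

const info = parseGgufHeader(header);
console.log(info.magic, info.version); // GGUF 3
```

The metadata key/value pairs that follow this header are where entries such as tokenizer.chat_template live, which is what the release uses to pick a specialized chat wrapper.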

Shipped with llama.cpp release b2608

To use the latest llama.cpp release available, run npx --no node-llama-cpp download --release latest.