v3.0.0-beta.15
Pre-release
3.0.0-beta.15 (2024-04-04)
Bug Fixes
- create a context with no parameters (#188) (6267778) (see the sketch after this list)
- improve chat wrappers tokenization (#182) (35e6f50)
- use the new `llama.cpp` CUDA flag (#182) (35e6f50)
- adapt to breaking `llama.cpp` changes (#183) (6b012a6)
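A minimal sketch of the parameterless context creation, assuming the `getLlama`/`loadModel` flow of the 3.0 beta API; the model path is a placeholder:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // placeholder path - point this at a real GGUF file
    modelPath: path.join(__dirname, "models", "model.gguf")
});

// as of this release, createContext() can be called with no parameters;
// sensible defaults are picked automatically
const context = await model.createContext();
```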
Features
- automatically adapt to current free VRAM state (#182) (35e6f50)
- `inspect gguf` command (#182) (35e6f50) (CLI example after this list)
- `inspect measure` command (#182) (35e6f50)
- `readGgufFileInfo` function (#182) (35e6f50) (example after this list)
- GGUF file metadata info on `LlamaModel` (#182) (35e6f50)
- `JinjaTemplateChatWrapper` (#182) (35e6f50) (example after this list)
- use the `tokenizer.chat_template` header from the `gguf` file when available, to find a better specialized chat wrapper, or use `JinjaTemplateChatWrapper` with it as a fallback (#182) (35e6f50)
- simplify generation CLI commands: `chat`, `complete`, `infill` (#182) (35e6f50) (example after this list)
- Windows on Arm prebuilt binary (#181) (f3b7f81)
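A possible invocation of the two new `inspect` subcommands; the file path is a placeholder, and the exact flags should be confirmed with `inspect --help`:

```bash
# print the metadata of a local GGUF file without loading the model
npx --no node-llama-cpp inspect gguf ./models/model.gguf

# estimate the memory required to load the model on this machine
npx --no node-llama-cpp inspect measure ./models/model.gguf
```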
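A sketch of reading GGUF metadata with the new `readGgufFileInfo` function; the `metadata` property name is an assumption based on the feature description:

```typescript
import {readGgufFileInfo} from "node-llama-cpp";

// parse the GGUF header of a local file (placeholder path);
// only the file header is read, so the model is not loaded into memory
const ggufInfo = await readGgufFileInfo("./models/model.gguf");

// `metadata` is assumed to hold the parsed GGUF key-value metadata
console.log(ggufInfo.metadata);
```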
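A sketch of wiring `JinjaTemplateChatWrapper` into a chat session, assuming a constructor that takes a Jinja `template` option and the beta's `contextSequence`-based `LlamaChatSession` API; the template string is a simplified placeholder:

```typescript
import {getLlama, LlamaChatSession, JinjaTemplateChatWrapper} from "node-llama-cpp";

// a chat wrapper driven by a Jinja template, such as one taken from
// the `tokenizer.chat_template` header of a gguf file
const chatWrapper = new JinjaTemplateChatWrapper({
    template: "{% for message in messages %}{{ message.content }}{% endfor %}"
});

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "./models/model.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence(),
    chatWrapper
});

console.log(await session.prompt("Hi there"));
```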
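The simplified generation commands can be run directly through `npx`; the `--model` flag below is an assumption, and each command lists its actual flags via `--help`:

```bash
# interactive chat session with a local model
npx --no node-llama-cpp chat --model ./models/model.gguf

# plain text completion
npx --no node-llama-cpp complete --model ./models/model.gguf

# fill-in-middle (infill) completion
npx --no node-llama-cpp infill --model ./models/model.gguf
```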
Shipped with `llama.cpp` release `b2608`

To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)