Releases: withcatai/node-llama-cpp
v2.8.11
v3.0.0-beta.22
3.0.0-beta.22 (2024-05-19)
Bug Fixes
Shipped with llama.cpp release b2929
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.21
3.0.0-beta.21 (2024-05-19)
Bug Fixes
Shipped with llama.cpp release b2929
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.20
3.0.0-beta.20 (2024-05-19)
Bug Fixes
Features
- `init` command to scaffold a new project from a template (with `node-typescript` and `electron-typescript-react` templates) (#217) (d6a0f43)
- debug mode (#217) (d6a0f43)
- load LoRA adapters (#217) (d6a0f43)
- improve Electron support (#217) (d6a0f43)
Shipped with llama.cpp release b2928
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.19
3.0.0-beta.19 (2024-05-12)
Bug Fixes
Features
Shipped with llama.cpp release b2861
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.18
3.0.0-beta.18 (2024-05-09)
Bug Fixes
- more efficient max context size finding algorithm (#214) (453c162)
- make embedding-only models work correctly (#214) (453c162)
- perform context shift on the correct token index on generation (#214) (453c162)
- make context loading work for all models on Electron (#214) (453c162)
Features
- split gguf files support (#214) (453c162)
- `pull` command (#214) (453c162)
- `stopOnAbortSignal` and `customStopTriggers` on `LlamaChat` and `LlamaChatSession` (#214) (453c162)
- `checkTensors` parameter on `loadModel` (#214) (453c162) (see the sketch below)
- improve Electron support (#214) (453c162)
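A minimal sketch of how the new options from this release might be used together, assuming the v3 beta API with `getLlama` and `LlamaChatSession`; the option names come from the notes above, but their exact shapes (boolean `checkTensors`, `customStopTriggers` as an array of strings, `stopOnAbortSignal` as a prompt option) are assumptions:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();

// `checkTensors` is named in this release; assumed to be a boolean that
// validates tensor data while the model is being loaded
const model = await llama.loadModel({
    modelPath: "path/to/model.gguf",
    checkTensors: true
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const abortController = new AbortController();

// `customStopTriggers` and `stopOnAbortSignal` are named in this release;
// passing them as prompt options is an assumption
const answer = await session.prompt("Tell me a short story", {
    signal: abortController.signal,
    stopOnAbortSignal: true, // assumed: return the partial response instead of throwing on abort
    customStopTriggers: ["\n\n## "] // assumed: extra strings that stop generation
});
console.log(answer);
```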
Shipped with llama.cpp release b2834
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v2.8.10
v3.0.0-beta.17
3.0.0-beta.17 (2024-04-24)
Bug Fixes
- `FunctionaryChatWrapper` bugs (#205) (ef501f9)
- function calling syntax bugs (#205) (ef501f9)
- show GPU layers in the `Model` line in CLI commands (#205) (ef501f9)
- refactor: rename `LlamaChatWrapper` to `Llama2ChatWrapper`
Features
- Llama 3 support (#205) (ef501f9)
- `--gpu` flag in generation CLI commands (#205) (ef501f9)
- `specialTokens` parameter on `model.detokenize` (#205) (ef501f9) (see the sketch below)
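A small sketch of the new `specialTokens` parameter on `model.detokenize`; the parameter name comes from the notes above, while its position and boolean type are assumptions:

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});

const tokens = model.tokenize("Hello world");

// previous behavior: special tokens are not rendered as text
const plainText = model.detokenize(tokens);

// new in this release: render special tokens too
// (passing the flag as a boolean second argument is an assumption)
const textWithSpecialTokens = model.detokenize(tokens, true);

console.log({plainText, textWithSpecialTokens});
```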
Shipped with llama.cpp release b2717
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.16
3.0.0-beta.16 (2024-04-13)
Bug Fixes
Features
- `inspect gpu` command: print device names (#198) (5ca33c7)
- `inspect gpu` command: print env info (#202) (d332b77)
- download models using the CLI (#191) (b542b53)
- interactively select a model from CLI commands (#191) (b542b53)
- change the default log level to warn (#191) (b542b53)
- token biases (#196) (3ad4494)
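Since this release changes the default log level to warn (listed above), a possible way to opt back into more verbose output is to pass a log level when creating the llama instance; the `logLevel` option and the `LlamaLogLevel` enum are assumptions based on the v3 beta API, not confirmed by these notes:

```typescript
import {getLlama, LlamaLogLevel} from "node-llama-cpp";

// the default log level is now `warn`; request more verbose output explicitly
// (the `logLevel` option and the `LlamaLogLevel` enum are assumed names)
const llama = await getLlama({
    logLevel: LlamaLogLevel.debug
});

console.log(llama.gpu); // assumed property: the detected GPU type, if any
```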
Shipped with llama.cpp release b2665
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.15
3.0.0-beta.15 (2024-04-04)
Bug Fixes
- create a context with no parameters (#188) (6267778)
- improve chat wrappers tokenization (#182) (35e6f50)
- use the new `llama.cpp` CUDA flag (#182) (35e6f50)
- adapt to breaking `llama.cpp` changes (#183) (6b012a6)
Features
- automatically adapt to current free VRAM state (#182) (35e6f50)
- `inspect gguf` command (#182) (35e6f50)
- `inspect measure` command (#182) (35e6f50)
- `readGgufFileInfo` function (#182) (35e6f50) (see the sketch after this list)
- GGUF file metadata info on `LlamaModel` (#182) (35e6f50)
- `JinjaTemplateChatWrapper` (#182) (35e6f50)
- use the `tokenizer.chat_template` header from the `gguf` file when available - use it to find a better specialized chat wrapper or use `JinjaTemplateChatWrapper` with it as a fallback (#182) (35e6f50)
- simplify generation CLI commands: `chat`, `complete`, `infill` (#182) (35e6f50)
- Windows on Arm prebuilt binary (#181) (f3b7f81)
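A rough sketch of the new GGUF inspection and Jinja template features from this release; `readGgufFileInfo` and `JinjaTemplateChatWrapper` are named above, but the return shape and the `template` option name are assumptions:

```typescript
import {readGgufFileInfo, JinjaTemplateChatWrapper} from "node-llama-cpp";

// read GGUF file metadata without loading the full model
// (assumed: the result includes the parsed metadata, e.g. the
// `tokenizer.chat_template` header when the file contains one)
const ggufInfo = await readGgufFileInfo("path/to/model.gguf");
console.log(ggufInfo);

// wrap a raw Jinja chat template in a chat wrapper
// (the `template` option name is an assumption)
const chatWrapper = new JinjaTemplateChatWrapper({
    template: "{% for message in messages %}{{ message.content }}{% endfor %}"
});
```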
Shipped with llama.cpp release b2608
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)