Releases: withcatai/node-llama-cpp
v3.0.0-beta.45
3.0.0-beta.45 (2024-09-19)
Bug Fixes
- improve performance of parallel evaluation from multiple contexts (#309) (4b3ad61)
- Llama 3.1 chat wrapper standard chat history (#309) (4b3ad61)
- adapt to `llama.cpp` sampling refactor (#309) (4b3ad61)
- Llama 3 Instruct function calling (#309) (4b3ad61)
- don't preload prompt in the `chat` command when using `--printTimings` or `--meter` (#309) (4b3ad61)
- more stable Jinja template matching (#309) (4b3ad61)
Features
- `inspect estimate` command (#309) (4b3ad61)
- move `seed` option to the prompt level (#309) (4b3ad61) (see the sketch after this list)
- Functionary v3 support (#309) (4b3ad61)
- Mistral chat wrapper (#309) (4b3ad61)
- improve Llama 3.1 chat template detection (#309) (4b3ad61)
- change `autoDisposeSequence` default to `false` (#309) (4b3ad61)
- move `download`, `build` and `clear` commands to be subcommands of a `source` command (#309) (4b3ad61)
- simplify `TokenBias` (#309) (4b3ad61)
- better `threads` default value (#309) (4b3ad61)
- make `LlamaEmbedding` an object (#309) (4b3ad61) (see the sketch after the llama.cpp release note below)
- `HF_TOKEN` env var support for reading GGUF file metadata (#309) (4b3ad61)
- `TemplateChatWrapper`: custom history template for each message role (#309) (4b3ad61)
- more helpful `inspect gpu` command (#309) (4b3ad61)
- all tokenizer tokens iterator (#309) (4b3ad61)
- failed context creation automatic remedy (#309) (4b3ad61)
- abort generation support in CLI commands (#309) (4b3ad61)
- `--gpuLayers max` and `--contextSize max` flag support for `inspect estimate` command (#309) (4b3ad61)
- extract all prebuilt binaries to external modules (#309) (4b3ad61)
- updated docs (#309) (4b3ad61)
- combine model downloaders (#309) (4b3ad61)
- feat(electron example template): update badge, scroll anchoring, table support (#309) (4b3ad61)
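
The prompt-level `seed` option mentioned above can be set per call instead of per model or context. A minimal sketch of how that might look with this beta, assuming a local GGUF model path and that `prompt()` accepts a `seed` option:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical path; point this at a local GGUF model file
    modelPath: "models/model.gguf"
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// The seed is now passed per prompt rather than at the model/context level
// (per the "move seed option to the prompt level" item above)
const answer = await session.prompt("Hi there, how are you?", {
    seed: 1234
});
console.log(answer);
```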
Shipped with `llama.cpp` release `b3785`
To use the latest `llama.cpp` release available, run `npx -n node-llama-cpp source download --release latest`. (learn more)
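
Regarding the "make `LlamaEmbedding` an object" item in the feature list above, here is a minimal sketch of what reading an embedding might look like, assuming `createEmbeddingContext()` and `getEmbeddingFor()` are available and that the returned `LlamaEmbedding` object exposes its values on a `vector` property:

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical path; point this at a local GGUF model file
    modelPath: "models/model.gguf"
});

// Embeddings are created through a dedicated embedding context
const embeddingContext = await model.createEmbeddingContext();
const embedding = await embeddingContext.getEmbeddingFor("Hello world");

// With LlamaEmbedding being an object, the raw values are assumed to be
// exposed on a `vector` property instead of the result being a plain array
console.log(embedding.vector.length);
```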
v2.8.16
v3.0.0-beta.44
3.0.0-beta.44 (2024-08-10)
Bug Fixes
Shipped with `llama.cpp` release `b3543`
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.43
3.0.0-beta.43 (2024-08-09)
Bug Fixes
Shipped with `llama.cpp` release `b3560`
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.42
3.0.0-beta.42 (2024-08-07)
Bug Fixes
Shipped with `llama.cpp` release `b3541`
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v2.8.15
v3.0.0-beta.41
3.0.0-beta.41 (2024-08-02)
Bug Fixes
Shipped with `llama.cpp` release `b3504`
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.40
3.0.0-beta.40 (2024-07-30)
Bug Fixes
Features
Shipped with `llama.cpp` release `b3488`
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.39
3.0.0-beta.39 (2024-07-28)
Bug Fixes
- Gemma chat wrapper bug (#273) (e3e0994)
- GGUF metadata nested key conflicts (#273) (e3e0994)
- adapt to `llama.cpp` breaking changes (#273) (e3e0994)
- preserve function calling chunks (#273) (e3e0994)
- format JSON objects like models expect (#273) (e3e0994)
Features
- Llama 3.1 support (#273) (e3e0994)
- Phi-3 support (#273) (e3e0994)
- model metadata overrides (#273) (e3e0994)
- use LoRA on a context instead of on a model (#273) (e3e0994)
- `onTextChunk` option (#273) (e3e0994) (see the sketch below)
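
A minimal sketch of the new `onTextChunk` option, assuming it is accepted by `prompt()` and receives each generated text chunk as a string:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical path; point this at a local GGUF model file
    modelPath: "models/model.gguf"
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// `onTextChunk` is assumed to be called with each newly generated piece of
// text, which makes it easy to stream the response to the terminal
const answer = await session.prompt("Write a haiku about llamas", {
    onTextChunk(chunk: string) {
        process.stdout.write(chunk);
    }
});
console.log("\n---\n" + answer);
```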
Shipped with `llama.cpp` release `b3479`
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`. (learn more)