Merged

Commits (38 total; the diff below shows changes from 18 commits)
- 11b5404 fix: adapt to breaking `llama.cpp` changes (giladgd, May 11, 2025)
- 8b98cf0 fix: improve GPU backend loading error description (giladgd, May 11, 2025)
- 1e8111c chore: update template dependencies (giladgd, May 11, 2025)
- 2f9858a test: Qwen 3 template (giladgd, May 11, 2025)
- 4c6e2b1 feat: configure Hugging Face remote endpoint for resolving URIs (giladgd, May 11, 2025)
- d39d261 fix: race condition when reading extremely long gguf metadata (giladgd, May 11, 2025)
- e740078 docs: typo (giladgd, May 11, 2025)
- d6e852e fix: update gguf types (giladgd, May 11, 2025)
- 9ab3c6d fix: capture multi-token segment separators (giladgd, May 11, 2025)
- 656f2be docs: solutions to more CUDA issues (giladgd, May 11, 2025)
- 6926425 feat: stream function call parameters (giladgd, May 11, 2025)
- b369eaf docs: update the awesome list (giladgd, May 11, 2025)
- 72c30dc chore: update modules (giladgd, May 11, 2025)
- df05d70 docs: more clear default values for custom cmake options (giladgd, May 11, 2025)
- b3d510e chore: reorder Vitepress config keys (giladgd, May 11, 2025)
- 3233603 fix: update gguf types (giladgd, May 11, 2025)
- 96c78da docs: document new env vars (giladgd, May 11, 2025)
- f7063d8 chore: module versions (giladgd, May 12, 2025)
- 123e524 chore: update GitHub issue templates (giladgd, May 12, 2025)
- 53a5206 test: check recommended model URIs (giladgd, May 13, 2025)
- 2e1a7ce test: fix tests (giladgd, May 14, 2025)
- 9463ccc feat(`QwenChatWrapper`): support discouraging the generation of thoughts (giladgd, May 15, 2025)
- 631a7e7 test: fix tests (giladgd, May 15, 2025)
- a0cc198 feat: save and restore context sequence state (giladgd, May 15, 2025)
- 185b734 docs: save and restore context sequence state (giladgd, May 15, 2025)
- d36670c fix: adapt memory estimation to new added model architectures (giladgd, May 15, 2025)
- a68590a feat(`getLlama`): `dryRun` option (giladgd, May 16, 2025)
- 8c6134d feat: `getLlamaGpuTypes` to get the list of available GPU types for t… (giladgd, May 16, 2025)
- 71babfa fix: skip binary testing on certain problematic conditions (giladgd, May 16, 2025)
- 12cec69 docs: fix dead link (giladgd, May 16, 2025)
- de3a360 fix: Paperspace tests setup script nodejs version (giladgd, May 16, 2025)
- 8eff306 fix: Windows build (giladgd, May 17, 2025)
- f76e899 fix: types (giladgd, May 17, 2025)
- 0cbb572 test: fix tests (giladgd, May 17, 2025)
- 2c01084 fix: performance improvements (giladgd, May 17, 2025)
- 5d4c8c3 fix: remove unused files from the build dir (giladgd, May 17, 2025)
- 69d30cd fix: remove unused line (giladgd, May 17, 2025)
- 62c8020 fix: performance improvements (giladgd, May 17, 2025)
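Two of the commits above add new public API surface: `getLlamaGpuTypes` and a `dryRun` option for `getLlama`. A minimal sketch of how they could be used together, assuming `getLlamaGpuTypes` takes no arguments and `dryRun` is a boolean flag; both shapes are inferred from the commit titles rather than verified against the final API:

```typescript
import {getLlama, getLlamaGpuTypes} from "node-llama-cpp";

// list the GPU types that can be used on the current machine
const gpuTypes = await getLlamaGpuTypes();
console.log("Available GPU types:", gpuTypes);

// resolve a binary without fully initializing a backend (assumed `dryRun` behavior)
await getLlama({dryRun: true});
```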
`.vitepress/config.ts` (3 additions, 3 deletions)
@@ -470,8 +470,6 @@ export default defineConfig({
}
},
sidebar: {
"/api/": getApiReferenceSidebar(),

"/guide/": [{
text: "Guide",
base: "/guide",
@@ -550,7 +548,9 @@ export default defineConfig({
]
}
]
}]
}],

"/api/": getApiReferenceSidebar()
},
socialLinks: [
{icon: "npm", link: "https://www.npmjs.com/package/node-llama-cpp"},
`docs/cli/pull.md` (1 addition, 1 deletion)
@@ -20,7 +20,7 @@ If a file already exists and its size matches the expected size, it will not be

The supported URI schemes are:
- **HTTP:** `https://`, `http://`
- **Hugging Face:** `hf:<user>/<model>:<quant>` (`#<quant>` is optional, [but recommended](../guide/downloading-models.md#hf-scheme-specify-quant))
- **Hugging Face:** `hf:<user>/<model>:<quant>` (`:<quant>` is optional, [but recommended](../guide/downloading-models.md#hf-scheme-specify-quant))
- **Hugging Face:** `hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional)

Learn more about using model URIs in the [Downloading Models guide](../guide/downloading-models.md#model-uris).
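The same `hf:` URIs accepted by the `pull` command can also be resolved programmatically. A small sketch using `resolveModelFile`, assuming it accepts a URI and a destination directory; the model URI below is purely an illustrative placeholder:

```typescript
import path from "path";
import {fileURLToPath} from "url";
import {getLlama, resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// downloads the model if it isn't already present and returns its local path
const modelPath = await resolveModelFile(
    "hf:<user>/<model>:<quant>", // illustrative placeholder URI
    path.join(__dirname, "models")
);

const llama = await getLlama();
const model = await llama.loadModel({modelPath});
```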
`docs/guide/CUDA.md` (27 additions, 0 deletions)
@@ -114,6 +114,33 @@ set NODE_LLAMA_CPP_CMAKE_OPTION_CMAKE_GENERATOR_TOOLSET=%CUDA_PATH%

Then run the build command again to check whether setting the `CMAKE_GENERATOR_TOOLSET` cmake option fixed the issue.

### Fix the `forward compatibility was attempted on non supported HW` Error {#fix-cuda-forward-compatibility}
This error usually happens when the CUDA version you have installed on your machine is older than the CUDA version used in the prebuilt binaries supplied by `node-llama-cpp`.

To resolve this issue, you can either [update your CUDA installation](https://developer.nvidia.com/cuda-downloads) to the latest version (recommended) or [build `node-llama-cpp` on your machine](#building) against the CUDA version you have installed.

### Fix the `Binary GPU type mismatch. Expected: cuda, got: false` Error {#fix-cuda-gpu-type-mismatch}
This error usually happens when you have multiple conflicting CUDA versions installed on your machine.

To fix it, uninstall older CUDA versions and restart your machine (important).

:::: details Check which CUDA libraries are picked up by `node-llama-cpp`'s prebuilt binaries on your machine

Run this command inside of your project:

::: code-group
```shell [Linux]
ldd ./node_modules/@node-llama-cpp/linux-x64-cuda/bins/linux-x64-cuda/libggml-cuda.so
```

```cmd [Windows]
"C:\Program Files\Git\usr\bin\ldd.exe" node_modules\@node-llama-cpp\win-x64-cuda\bins\win-x64-cuda\ggml-cuda.dll
```
:::

::::


## Using `node-llama-cpp` With CUDA
It's recommended to use [`getLlama`](../api/functions/getLlama) without specifying a GPU type,
so it'll detect the available GPU types and use the best one automatically.
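To illustrate that recommendation, a minimal sketch of automatic GPU detection; the possible `gpu` values in the comment are the commonly documented ones and are not taken from this PR:

```typescript
import {getLlama} from "node-llama-cpp";

// let node-llama-cpp detect the available GPU types and pick the best one
const llama = await getLlama();

// the selected backend, e.g. "cuda", "vulkan", "metal", or false for CPU-only
console.log("GPU type:", llama.gpu);
```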
`docs/guide/awesome.md` (17 additions, 2 deletions)
@@ -2,17 +2,32 @@
description: Awesome projects that use node-llama-cpp
---
# Awesome `node-llama-cpp`
Awesome projects that use `node-llama-cpp`.
:sunglasses: Awesome projects that use `node-llama-cpp`.

<script setup lang="ts">
import DataBadge from "../../.vitepress/components/DataBadge/DataBadge.vue";
</script>

## Open Source
* [CatAI](https://github.com/withcatai/catai) - a simplified AI assistant API for Node.js, with REST API support
<br /><DataBadge title="License" content="MIT"/>

* [Manzoni](https://manzoni.app/) ([GitHub](https://github.com/gems-platforms/manzoni-app)) - a text editor running local LLMs
<br /><DataBadge title="License" content="AGPL-3.0"/>


## Proprietary
> List your project here!
* [BashBuddy](https://bashbuddy.run) ([GitHub](https://github.com/wosherco/bashbuddy)) - write bash commands with natural language
<br /><DataBadge title="Partially open source" content="Source available" href="https://github.com/wosherco/bashbuddy/blob/main/LICENSE.md"/>

* [nutshell](https://withnutshell.com) - Private AI meeting notes processed completely on your device



<br />

---

> To add a project to this list, [open a PR](https://github.com/withcatai/node-llama-cpp/edit/master/docs/guide/awesome.md).
>
> To have a project listed here, it should clearly state that it uses `node-llama-cpp`.
`docs/guide/cmakeOptions.data.ts` (6 additions, 0 deletions)
@@ -90,6 +90,12 @@ function parseCmakeOptions(cmakeListsTxt: string, optionFilter: ((key: string) =
}
} else if (option.defaultValue === "${BUILD_SHARED_LIBS_DEFAULT}")
option.defaultValue = htmlEscapeWithCodeMarkdown("`OFF` on MinGW, `ON` otherwise");
else if (option.defaultValue === "${GGML_CUDA_GRAPHS_DEFAULT}")
option.defaultValue = htmlEscapeWithCodeMarkdown("`ON`");
else if (option.defaultValue === "${GGML_NATIVE_DEFAULT}")
option.defaultValue = htmlEscapeWithCodeMarkdown("`OFF` when building for a different architecture,\n`ON` otherwise");
else if (option.key === "LLAMA_CURL")
option.defaultValue = htmlEscapeWithCodeMarkdown("`OFF`");
else
option.defaultValue = htmlEscapeWithCodeMarkdown(
option.defaultValue != null
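These defaults populate the custom CMake options table in the docs. For overriding one of them at build time, a rough sketch, assuming `getLlama` still accepts a `cmakeOptions` record and a `build: "forceRebuild"` flag as in earlier releases; the specific option and value are only an example:

```typescript
import {getLlama} from "node-llama-cpp";

// rebuild the addon locally with a custom CMake option instead of using a prebuilt binary
const llama = await getLlama({
    gpu: "cuda",
    build: "forceRebuild",
    cmakeOptions: {
        GGML_CUDA_GRAPHS: "OFF" // documented above as defaulting to `ON`
    }
});
```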
`docs/guide/downloading-models.md` (1 addition, 1 deletion)
@@ -76,7 +76,7 @@ You can reference models using a URI instead of their full download URL when usi
When downloading a model from a URI, the model files will be prefixed with a corresponding adaptation of the URI.

To reference a model from Hugging Face, you can use one of these schemes:
* `hf:<user>/<model>:<quant>` (`#<quant>` is optional, [but recommended](#hf-scheme-specify-quant))
* `hf:<user>/<model>:<quant>` (`:<quant>` is optional, [but recommended](#hf-scheme-specify-quant))
* `hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional)

Here are example usages of the Hugging Face URI scheme:
`docs/guide/index.md` (1 addition, 0 deletions)
@@ -316,4 +316,5 @@ Explore the [API reference](../api/functions/getLlama.md) to learn more about th
and use the search bar (press <kbd class="doc-kbd">/</kbd>) to find documentation for a specific topic or API.

Check out the [roadmap](https://github.com/orgs/withcatai/projects/1) to see what's coming next,<br/>
visit the [awesome list](./awesome.md) to find great projects that use `node-llama-cpp`,<br/>
and consider [sponsoring `node-llama-cpp`](https://github.com/sponsors/giladgd) to accelerate the development of new features.
`llama/CMakeLists.txt` (4 additions, 1 deletion)
@@ -1,4 +1,4 @@
cmake_minimum_required(VERSION 3.14)
cmake_minimum_required(VERSION 3.19)

if (NLC_CURRENT_PLATFORM STREQUAL "win-x64" OR NLC_CURRENT_PLATFORM STREQUAL "win-arm64")
set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
@@ -70,6 +70,9 @@ add_subdirectory("llama.cpp")
include_directories("llama.cpp")
include_directories("./llama.cpp/common")

# This is needed to use methods in "llama-grammar.h" and "unicode.h"
target_include_directories(llama PUBLIC "./llama.cpp/src")

unset(GPU_INFO_HEADERS)
unset(GPU_INFO_SOURCES)
unset(GPU_INFO_EXTRA_LIBS)
`llama/addon/AddonContext.cpp` (0 additions, 1 deletion)
@@ -2,7 +2,6 @@
#include <algorithm>
#include <cmath>
#include "common/common.h"
#include "llama-grammar.h"
#include "llama.h"

#include "addonGlobals.h"
`llama/addon/AddonSampler.cpp` (0 additions, 1 deletion)
@@ -1,6 +1,5 @@
#include <cmath>
#include "common/common.h"
#include "llama-grammar.h"
#include "llama.h"

#include "AddonGrammarEvaluationState.h"