* feat: chat session response prefix (see the sketch after this list)
* feat: improve context shift strategy
* feat: use RAM and swap sizes in memory usage estimations
* feat(`inspect gguf` command): print a single key flag
* feat: faster building from source
* fix: Electron crash with some models on macOS when not using Metal
* fix: adapt to `llama.cpp` breaking changes
* fix: improve CPU compatibility score
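For the chat session response prefix, usage could look like the following sketch; this assumes the option is exposed as a `responsePrefix` field on `LlamaChatSession.prompt()`, and the model path is a placeholder:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// Sketch of the response prefix feature; assumes the option is exposed
// as `responsePrefix` on `prompt()`. The model path is a placeholder.
const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// The model's response is forced to start with the given prefix,
// nudging it toward completing a JSON object.
const answer = await session.prompt("List three fruits as JSON", {
    responsePrefix: "{\"fruits\": ["
});
console.log(answer);
```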
**`.github/ISSUE_TEMPLATE/bug-report.yml`** (+4 −6)

```diff
@@ -35,11 +35,10 @@ body:
     id: steps
     attributes:
       label: Steps to reproduce
-      description: >-
+      description: |-
         Your bug can be investigated much faster if your code can be run without any dependencies other than `node-llama-cpp`.
         Issues without reproduction steps or code examples may be closed as not actionable.
-        Please try to provide a Minimal, Complete, and Verifiable example ([link](http://stackoverflow.com/help/mcve)).
-        Please include a link to the model file you used if possible.
+        Please try to provide a Minimal, Complete, and Verifiable example ([link](http://stackoverflow.com/help/mcve)), including a link to the model file you used if possible.
         Also, please enable debug logs by using `getLlama({debug: true})` to get more information.
       placeholder: >-
         Please try to provide a Minimal, Complete, and Verifiable example.
@@ -50,10 +49,9 @@ body:
     id: env
     attributes:
       label: My Environment
-      description: >-
+      description: |-
         Please include the result of the command `npx --yes node-llama-cpp inspect gpu`.
-        Please also add any other relevant dependencies to this table at the end.
-        For example: Electron, Bun, Webpack.
+        Please also add any other relevant dependencies to this table at the end. For example: Electron, Bun, Webpack.
```
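As the template notes, debug logs come from a documented `getLlama` option:

```typescript
import {getLlama} from "node-llama-cpp";

// Enable debug logs (llama.cpp build and runtime output)
// to gather the extra information the bug template asks for.
const llama = await getLlama({debug: true});
```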
**`docs/guide/electron.md`** (+24 −0)

```diff
@@ -37,3 +37,27 @@ so that `node-llama-cpp` can find them.
 Cross packaging from one platform to another is not supported, since binaries for other platforms are not downloaded to your machine when you run `npm install`.
 
 Packaging an `arm64` app on an `x64` machine is supported, but packaging an `x64` app on an `arm64` machine is not.
+
+## Bundling
+When bundling your code for Electron using [Electron Vite](https://electron-vite.org) or Webpack,
+ensure that `node-llama-cpp` is not bundled, and is instead treated as an external module.
+
+Marking `node-llama-cpp` as an external module will prevent its code from being bundled with your application code,
+and instead, it'll be loaded from the `node_modules` directory at runtime (which should be packed into a `.asar` archive).
+
+The file structure of `node-llama-cpp` is crucial for it to function correctly,
+so bundling it will break its functionality.
+Moreover, since `node-llama-cpp` includes prebuilt binaries (and also local builds from source),
+those files must be retained in their original structure for it to work.
+
+Electron has [its own bundling solution called ASAR](https://www.electronjs.org/docs/latest/tutorial/asar-archives) that is designed to work with node modules.
+ASAR retains the original file structure of node modules by packing all the files into a single `.asar` archive file that Electron will read from at runtime like it would from the file system.
+This method ensures node modules work as intended in Electron applications, even though they are bundled into a single file.
+
+Using ASAR is the recommended way to bundle `node-llama-cpp` in your Electron app.
+
+If you're using the scaffolded Electron app, this is already taken care of.
+
+::: tip NOTE
+We recommend using [Electron Vite](https://electron-vite.org) over Webpack for your Electron app due to Vite's speed and Webpack's lack of proper ESM support in the output bundle, which complicates the bundling process.
+:::
```
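With Electron Vite, marking the module as external can be done in the main-process build config. A minimal sketch follows; the exact config shape depends on your project, and Rollup's `external` option is a standard Vite/Rollup feature rather than anything specific to `node-llama-cpp`:

```typescript
// electron.vite.config.ts — a minimal sketch; adapt to your project layout.
import {defineConfig} from "electron-vite";

export default defineConfig({
    main: {
        build: {
            rollupOptions: {
                // Keep node-llama-cpp out of the bundle so it is loaded
                // from node_modules (inside the .asar archive) at runtime.
                external: ["node-llama-cpp"]
            }
        }
    }
});
```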
````diff
 Now, just use `node-llama-cpp` as you normally would.
+
+## Intel AMX {#intel-amx}
+> Intel AMX (Advanced Matrix Extensions) is a dedicated hardware block found on Intel Xeon processors
+> that helps optimize and accelerate matrix multiplication operations.
+>
+> It's available on the 4th Gen and newer Intel Xeon processors.
+
+Intel AMX can improve CPU inference performance [by 2x and up to even 14x](https://github.com/ggerganov/llama.cpp/pull/7707) on supported CPUs (under specific conditions).
+
+If you're using a 4th Gen or newer Intel Xeon processor,
+you might want to [build `llama.cpp` from source](./building-from-source.md) to utilize these hardware-specific optimizations.
+
+To do this, run this command inside your project, on the machine you run your project on:
+```shell
+npx --no node-llama-cpp source download
+```
+
+Alternatively, you can force `node-llama-cpp` to not use its prebuilt binaries
+and instead build from source when calling [`getLlama`](../api/functions/getLlama.md) for the first time on a Xeon CPU:
````
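A sketch of what such a call could look like, assuming `getLlama`'s `build` option accepts `"forceRebuild"` and using a simple CPU-model check to detect a Xeon processor:

```typescript
import os from "os";
import {getLlama} from "node-llama-cpp";

// A sketch: detect a Xeon CPU and force building from source instead of
// using the prebuilt binaries. Assumes `build: "forceRebuild"` is supported.
const isXeon = os.cpus().some((cpu) => cpu.model.toLowerCase().includes("xeon"));

const llama = await getLlama({
    build: isXeon ? "forceRebuild" : "auto"
});
```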