`docs/guide/low-level-api.md`
…and you can pass no sampling options to avoid making any adjustments to the probabilities.
It's best to avoid getting the full probabilities list unless you really need it,
as passing it to the JavaScript side can be slow.

### Context Shift {#context-shift}
When the context sequence is full and you want to evaluate more tokens onto it,
some tokens will have to be removed to make room for new ones to be added.

Ideally, you'd want to do that on your logic level, so you can control which content to keep and which to remove.
> All the high-level APIs of `node-llama-cpp` [automatically do that](./chat-context-shift.md).

If you don't do that, `node-llama-cpp` will automatically remove the oldest tokens from the context sequence state to make room for new ones.

You can customize the context shift strategy `node-llama-cpp` uses for the context sequence by configuring the [`contextShift`](../api/classes/LlamaContext.md#parameters) option when calling [`.getSequence(...)`](../api/classes/LlamaContext.md#getsequence),
or by passing a customized [`contextShift`](../api/type-aliases/SequenceEvaluateOptions#contextshift) option to the evaluation method you use.
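
For example, here's a minimal sketch of configuring a custom context shift size when getting a sequence. The `size` field and the model path used here are illustrative assumptions; see the linked API reference for the exact option shape:
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical model path, used only for illustration
    modelPath: path.join(__dirname, "models", "my-model.gguf")
});
const context = await model.createContext();

const sequence = context.getSequence({
    contextShift: {
        // assumed field: roughly how many tokens to free up when the sequence is full
        size: 256
    }
});
```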
## Simple Evaluation {#simple-evaluation}
You can evaluate the given input tokens onto a context sequence using [`.evaluate(...)`](../api/classes/LlamaContextSequence.md#evaluate)
and generate the next token for the last input token.

On each iteration of the returned iterator, the generated token is then added to the context sequence state and the next token is generated for it, and so on.

When using [`.evaluate(...)`](../api/classes/LlamaContextSequence.md#evaluate), the configured [token predictor](./token-prediction.md) is used to speed up the generation process.

> If you want to adjust the token probabilities when generating output, consider using [token bias](./token-bias.md) instead.
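
For example, a minimal sketch of evaluating a prompt and collecting a bounded amount of output (the model path and sampling values here are placeholders):
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, Token} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical model path, used only for illustration
    modelPath: path.join(__dirname, "models", "my-model.gguf")
});
const context = await model.createContext();
const sequence = context.getSequence();

const inputTokens = model.tokenize("The best way to learn something new is");
const generatedTokens: Token[] = [];

// each iteration evaluates the pending tokens and yields the next generated token,
// which is then appended to the context sequence state
for await (const token of sequence.evaluate(inputTokens, {temperature: 0.8})) {
    generatedTokens.push(token);

    if (generatedTokens.length >= 32)
        break;
}

console.log(model.detokenize(generatedTokens));
```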
### With Metadata {#evaluation-with-metadata}
You can use [`.evaluateWithMetadata(...)`](../api/classes/LlamaContextSequence.md#evaluatewithmetadata) to evaluate tokens onto the context sequence state like [`.evaluate(...)`](#simple-evaluation), but with metadata emitted for each token.
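
For example, a rough sketch that continues from the setup in the previous example. The `confidence` metadata field and the shape of the emitted items are assumptions here; see the linked API reference for the exact fields that can be requested:
```typescript
// continuing from the `model` and `sequence` set up in the previous example
const promptTokens = model.tokenize("The best way to learn something new is");
let generated = 0;

// `confidence` is an assumed metadata field; each emitted item is expected to
// contain the generated token together with the requested metadata
for await (const item of sequence.evaluateWithMetadata(promptTokens, {confidence: true})) {
    console.log(model.detokenize([item.token]), item);

    if (++generated >= 8)
        break;
}
```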
To manually control for which of the input tokens to generate output,
you can use [`.controlledEvaluate(...)`](../api/classes/LlamaContextSequence.md#controlledevaluate).
```typescript
import {fileURLToPath} from "url";
// ...

const lastToken = evaluateInput.pop() as Token;
if (lastToken != null)
    evaluateInput.push([lastToken, {
        generateNext: {
            token: true,
            probabilities: true,
            options: {
                temperature: 0.8
            }
            // ...
```

…as it may lead to unexpected results.

### Erase State Ranges {#erase-state-ranges}
To erase a range of tokens from the context sequence state,
you can use [`.eraseContextTokenRanges(...)`](../api/classes/LlamaContextSequence.md#erasecontexttokenranges).
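
For example, a minimal sketch that continues from the setup in the earlier examples. The `{start, end}` range shape is an assumption here; see the linked API reference for the exact range type:
```typescript
// continuing from the `sequence` set up in the earlier examples;
// the `{start, end}` range shape is an assumption - check the API reference for the exact type

// erase the first 64 tokens of the state (for example, a stale prefix)
await sequence.eraseContextTokenRanges([{start: 0, end: 64}]);

console.log(sequence.contextTokens.length); // the state now holds fewer tokens
```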