
Commit 7828013 ("update docs")
1 parent 74dc729


examples/server/README.md

Lines changed: 32 additions & 7 deletions
````diff
@@ -343,6 +343,10 @@ node index.js
 
 ### POST `/completion`: Given a `prompt`, it returns the predicted completion.
 
+> [!IMPORTANT]
+>
+> This endpoint is **not** OAI-compatible
+
 *Options:*
 
 `prompt`: Provide the prompt for this completion as a string or as an array of strings or numbers representing tokens. Internally, if `cache_prompt` is `true`, the prompt is compared to the previous completion and only the "unseen" suffix is evaluated. A `BOS` token is inserted at the start, if all of the following conditions are true:
````
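To make the documented request concrete, here is a minimal client sketch in Python using only the standard library. It assumes a server already running at `http://localhost:8080` (the default address and port); `prompt`, `n_predict`, and `cache_prompt` are the request options described in the README text above.

```python
# Minimal sketch: POST to the /completion endpoint documented above.
# Assumes a llama.cpp server is listening on localhost:8080 (the default).
import json
import urllib.request

payload = {
    "prompt": "Building a website can be done in 10 simple steps:",
    "n_predict": 64,       # number of tokens to predict
    "cache_prompt": True,  # reuse the matching prefix of the previous prompt
}

req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# `content` holds the completion text (see the response fields below).
print(result["content"])
```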
````diff
@@ -448,27 +452,48 @@ These words will not be included in the completion, so make sure to add them to
 
 - Note: When using streaming mode (`stream`), only `content` and `stop` will be returned until end of completion.
 
-- `completion_probabilities`: An array of token probabilities for each completion. The array's length is `n_predict`. Each item in the array has the following structure:
+- `completion_probabilities`: An array of token probabilities for each completion. The array's length is `n_predict`. Each item in the array has a nested array `top_logprobs`. It contains at **maximum** `n_probs` elements:
 
 ```json
 {
-  "content": "<the token selected by the model>",
-  "probs": [
+  "content": "<the generated completion text>",
+  ...
+  "completion_probabilities": [
     {
+      "id": <token id>,
       "prob": float,
-      "tok_str": "<most likely token>"
+      "token": "<most likely token>",
+      "bytes": [int, int, ...],
+      "top_logprobs": [
+        {
+          "id": <token id>,
+          "prob": float,
+          "token": "<token text>",
+          "bytes": [int, int, ...],
+        },
+        {
+          "id": <token id>,
+          "prob": float,
+          "token": "<token text>",
+          "bytes": [int, int, ...],
+        },
+        ...
+      ]
     },
     {
+      "id": <token id>,
       "prob": float,
-      "tok_str": "<second most likely token>"
+      "token": "<most likely token>",
+      "bytes": [int, int, ...],
+      "top_logprobs": [
+        ...
+      ]
     },
     ...
   ]
 },
 ```
 
-Notice that each `probs` is an array of length `n_probs`.
-
 - `content`: Completion result as a string (excluding `stopping_word` if any). In case of streaming mode, will contain the next token as a string.
 - `stop`: Boolean for use with `stream` to check whether the generation has stopped (Note: This is not related to stopping words array `stop` from input options)
 - `generation_settings`: The provided options above excluding `prompt` but including `n_ctx`, `model`. These options may differ from the original ones in some way (e.g. bad values filtered out, strings converted to tokens, etc.).
````
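A short sketch of consuming the structure above, again assuming the default local server: setting `n_probs` in the request asks for the top candidates per generated token, which is what populates `completion_probabilities` and each token's `top_logprobs` list (at most `n_probs` entries, per the docs).

```python
# Minimal sketch: request per-token probabilities and walk the
# `completion_probabilities` / `top_logprobs` structure documented above.
# Assumes a llama.cpp server on localhost:8080 (the default).
import json
import urllib.request

payload = {
    "prompt": "The capital of France is",
    "n_predict": 4,
    "n_probs": 3,  # ask for the 3 most likely candidates per token
}

req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# One entry per generated token; each entry carries its top-N alternatives.
for tok in result.get("completion_probabilities", []):
    alts = ", ".join(
        f"{alt['token']!r}: {alt['prob']:.3f}" for alt in tok["top_logprobs"]
    )
    print(f"{tok['token']!r} -> [{alts}]")
```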
