Commit 5023aa8

readme : remove unnecessary indentation
Indenting a line with four spaces makes Markdown treat that section as plain text.
1 parent 8257c56 commit 5023aa8

1 file changed

examples/server/README.md

Lines changed: 55 additions & 55 deletions
@@ -318,100 +318,100 @@ node index.js

### POST `/completion`: Given a `prompt`, it returns the predicted completion.

*Options:*

`prompt`: Provide the prompt for this completion as a string or as an array of strings or numbers representing tokens. Internally, if `cache_prompt` is `true`, the prompt is compared to the previous completion and only the "unseen" suffix is evaluated. A `BOS` token is inserted at the start if all of the following conditions are true:

- The prompt is a string or an array with the first element given as a string
- The model's `tokenizer.ggml.add_bos_token` metadata is `true`

These input shapes and data types are allowed for `prompt`:

- Single string: `"string"`
- Single sequence of tokens: `[12, 34, 56]`
- Mixed tokens and strings: `[12, 34, "string", 56, 78]`

Multiple prompts are also supported. In this case, the completion result will be an array.

- Only strings: `["string1", "string2"]`
- Strings and sequences of tokens: `["string1", [12, 34, 56]]`
- Mixed types: `[[12, 34, "string", 56, 78], [12, 34, 56], "string"]`
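
For illustration, a minimal request sketch that sends a single string prompt together with a couple of the options documented below (it assumes the server is reachable at `http://localhost:8080`; adjust the address for your setup):

```shell
# Sketch: basic /completion request with a plain string prompt.
# The http://localhost:8080 address is an assumption, not part of this document.
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "Building a website can be done in 10 simple steps:",
        "n_predict": 64,
        "temperature": 0.8
    }'
```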

`temperature`: Adjust the randomness of the generated text. Default: `0.8`

`dynatemp_range`: Dynamic temperature range. The final temperature will be in the range of `[temperature - dynatemp_range; temperature + dynatemp_range]`. Default: `0.0`, which is disabled.

`dynatemp_exponent`: Dynamic temperature exponent. Default: `1.0`

`top_k`: Limit the next token selection to the K most probable tokens. Default: `40`

`top_p`: Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P. Default: `0.95`

`min_p`: The minimum probability for a token to be considered, relative to the probability of the most likely token. Default: `0.05`

`n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When `0`, no tokens will be generated but the prompt is evaluated into the cache. Default: `-1`, where `-1` is infinity.

`n_indent`: Specify the minimum line indentation for the generated text in number of whitespace characters. Useful for code completion tasks. Default: `0`

`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. The number excludes the BOS token.
By default, this value is set to `0`, meaning no tokens are kept. Use `-1` to retain all tokens from the prompt.

`stream`: Allows receiving each predicted token in real time instead of waiting for the completion to finish. To enable this, set to `true`.

`stop`: Specify a JSON array of stopping strings.
These words will not be included in the completion, so make sure to add them to the prompt for the next iteration. Default: `[]`
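
As a sketch of how `stream` and `stop` combine (the `http://localhost:8080` address is again an assumption), the request below streams tokens back as they are produced and stops generation at the first blank line:

```shell
# Sketch: streamed completion that stops at the first blank line.
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "Q: Name three C compilers.\nA:",
        "stream": true,
        "stop": ["\n\n"]
    }'
```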

`typical_p`: Enable locally typical sampling with parameter p. Default: `1.0`, which is disabled.

`repeat_penalty`: Control the repetition of token sequences in the generated text. Default: `1.1`

`repeat_last_n`: Last n tokens to consider for penalizing repetition. Default: `64`, where `0` is disabled and `-1` is ctx-size.

`penalize_nl`: Penalize newline tokens when applying the repeat penalty. Default: `true`

`presence_penalty`: Repeat alpha presence penalty. Default: `0.0`, which is disabled.

`frequency_penalty`: Repeat alpha frequency penalty. Default: `0.0`, which is disabled.

`dry_multiplier`: Set the DRY (Don't Repeat Yourself) repetition penalty multiplier. Default: `0.0`, which is disabled.

`dry_base`: Set the DRY repetition penalty base value. Default: `1.75`

`dry_allowed_length`: Tokens that extend repetition beyond this receive an exponentially increasing penalty: `multiplier * base ^ (length of repeating sequence before token - allowed length)`. For example, with a multiplier of `0.8`, a base of `1.75` and an allowed length of `2`, a token that would extend a 4-token repetition is penalized by `0.8 * 1.75^(4 - 2) = 2.45`. Default: `2`

`dry_penalty_last_n`: How many tokens to scan for repetitions. Default: `-1`, where `0` is disabled and `-1` is context size.

`dry_sequence_breakers`: Specify an array of sequence breakers for DRY sampling. Only a JSON array of strings is accepted. Default: `['\n', ':', '"', '*']`

`mirostat`: Enable Mirostat sampling, controlling perplexity during text generation. Default: `0`, where `0` is disabled, `1` is Mirostat, and `2` is Mirostat 2.0.

`mirostat_tau`: Set the Mirostat target entropy, parameter tau. Default: `5.0`

`mirostat_eta`: Set the Mirostat learning rate, parameter eta. Default: `0.1`

`grammar`: Set grammar for grammar-based sampling. Default: no grammar

`json_schema`: Set a JSON schema for grammar-based sampling (e.g. `{"items": {"type": "string"}, "minItems": 10, "maxItems": 100}` for a list of strings, or `{}` for any JSON). See [tests](../../tests/test-json-schema-to-grammar.cpp) for supported features. Default: no JSON schema.
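
As a sketch, a request that constrains the output with `json_schema`, reusing the list-of-strings schema from the example above (server address assumed to be `http://localhost:8080`):

```shell
# Sketch: constrain the completion to a JSON list of strings via json_schema.
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "List some fruit names as JSON: ",
        "n_predict": 128,
        "json_schema": {"items": {"type": "string"}, "minItems": 10, "maxItems": 100}
    }'
```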

`seed`: Set the random number generator (RNG) seed. Default: `-1`, which is a random seed.

`ignore_eos`: Ignore end of stream token and continue generating. Default: `false`

`logit_bias`: Modify the likelihood of a token appearing in the generated text completion. For example, use `"logit_bias": [[15043,1.0]]` to increase the likelihood of the token 'Hello', or `"logit_bias": [[15043,-1.0]]` to decrease its likelihood. Setting the value to false, e.g. `"logit_bias": [[15043,false]]`, ensures that the token `Hello` is never produced. The tokens can also be represented as strings, e.g. `[["Hello, World!",-0.5]]` will reduce the likelihood of all the individual tokens that represent the string `Hello, World!`, just like the `presence_penalty` does. Default: `[]`
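
Put together, a request using `logit_bias` might look like the sketch below, reusing token id `15043` ('Hello') from the example above (server address assumed):

```shell
# Sketch: ensure the token 'Hello' (id 15043) is never produced.
# String entries such as ["Hello, World!", -0.5] use the same pair layout.
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "Greet the user:",
        "n_predict": 32,
        "logit_bias": [[15043, false]]
    }'
```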

`n_probs`: If greater than 0, the response also contains the probabilities of the top N tokens for each generated token, given the sampling settings. Note that for temperature < 0 the tokens are sampled greedily but token probabilities are still being calculated via a simple softmax of the logits without considering any other sampler settings. Default: `0`

`min_keep`: If greater than 0, force samplers to return N possible tokens at minimum. Default: `0`

`t_max_predict_ms`: Set a time limit in milliseconds for the prediction (a.k.a. text-generation) phase. The timeout will trigger if the generation takes more than the specified time (measured since the first token was generated) and if a new-line character has already been generated. Useful for FIM applications. Default: `0`, which is disabled.

`image_data`: An array of objects holding base64-encoded image `data` and the `id` used to reference it in `prompt`. You can determine the place of the image in the prompt as in the following example: `USER:[img-12]Describe the image in detail.\nASSISTANT:`. In this case, `[img-12]` will be replaced by the embeddings of the image with id `12` in the following `image_data` array: `{..., "image_data": [{"data": "<BASE64_STRING>", "id": 12}]}`. Use `image_data` only with multimodal models, e.g., LLaVA.
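
For multimodal models, a request combining the `[img-12]` placeholder with `image_data` could look like the sketch below (the base64 payload is elided; server address assumed):

```shell
# Sketch: multimodal prompt referencing the image with id 12 (base64 data elided).
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "USER:[img-12]Describe the image in detail.\nASSISTANT:",
        "image_data": [{"data": "<BASE64_STRING>", "id": 12}]
    }'
```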

`id_slot`: Assign the completion task to a specific slot. If `-1`, the task will be assigned to an idle slot. Default: `-1`

`cache_prompt`: Re-use KV cache from a previous request if possible. This way the common prefix does not have to be re-processed, only the suffix that differs between the requests. Because (depending on the backend) the logits are **not** guaranteed to be bit-for-bit identical for different batch sizes (prompt processing vs. token generation), enabling this option can cause nondeterministic results. Default: `false`
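
A sketch of combining `id_slot` and `cache_prompt` so that follow-up requests reuse the KV cache of the same slot (server address assumed):

```shell
# Sketch: pin the request to slot 0 and reuse its cached prompt prefix.
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "You are a helpful assistant.\nUser: Hello!\nAssistant:",
        "id_slot": 0,
        "cache_prompt": true
    }'
```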

`samplers`: The order the samplers should be applied in. An array of strings representing sampler type names. If a sampler is not set, it will not be used. If a sampler is specified more than once, it will be applied multiple times. Default: `["top_k", "typical_p", "top_p", "min_p", "temperature"]` - these are all the available values.
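
As a sketch, restricting the chain to `top_k` followed by `temperature` (any sampler omitted from the array is simply not applied; server address assumed):

```shell
# Sketch: apply only top_k and temperature, in that order.
curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{
        "prompt": "The capital of France is",
        "samplers": ["top_k", "temperature"],
        "top_k": 20,
        "temperature": 0.5
    }'
```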

**Response format**

@@ -454,13 +454,13 @@ Notice that each `probs` is an array of length `n_probs`.

### POST `/tokenize`: Tokenize a given text

*Options:*

`content`: (Required) The text to tokenize.

`add_special`: (Optional) Boolean indicating if special tokens, i.e. `BOS`, should be inserted. Default: `false`

`with_pieces`: (Optional) Boolean indicating whether to return token pieces along with IDs. Default: `false`
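
A sketch of a `/tokenize` request using the options above (server address assumed to be `http://localhost:8080`):

```shell
# Sketch: tokenize a string, insert special tokens, and return pieces with the ids.
curl --request POST \
    --url http://localhost:8080/tokenize \
    --header "Content-Type: application/json" \
    --data '{
        "content": "Hello, world!",
        "add_special": true,
        "with_pieces": true
    }'
```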

**Response:**
