You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: examples/server/README.md
+39Lines changed: 39 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -72,6 +72,7 @@ The project is under active development, and we are [looking for feedback and co
72
72
|`--grammar GRAMMAR`| BNF-like grammar to constrain generations (see samples in grammars/ dir) (default: '') |
73
73
|`--grammar-file FNAME`| file to read grammar from |
74
74
|`-j, --json-schema SCHEMA`| JSON schema to constrain generations (https://json-schema.org/), e.g. `{}` for any JSON object<br/>For schemas w/ external $refs, use --grammar + example/json_schema_to_grammar.py instead |
75
+
|`--jinja`| Enable (limited) Jinja templating engine, which is needed for tool use. |
75
76
|`--rope-scaling {none,linear,yarn}`| RoPE frequency scaling method, defaults to linear unless specified by the model |
76
77
|`--rope-scale N`| RoPE context scaling factor, expands context by a factor of N |
77
78
|`--rope-freq-base N`| RoPE base frequency, used by NTK-aware scaling (default: loaded from model) |
@@ -505,6 +506,8 @@ Given a ChatML-formatted json description in `messages`, it returns the predicte
505
506
506
507
The `response_format` parameter supports both plain JSON output (e.g. `{"type": "json_object"}`) and schema-constrained JSON (e.g. `{"type": "json_object", "schema": {"type": "string", "minLength": 10, "maxLength": 100}}` or `{"type": "json_schema", "schema": {"properties": { "name": { "title": "Name", "type": "string" }, "date": { "title": "Date", "type": "string" }, "participants": { "items": {"type: "string" }, "title": "Participants", "type": "string" } } } }`), similar to other OpenAI-inspired API providers.
507
508
509
+
The `tools` / `tool_choice` parameters are only supported if the server is started with `--jinja`. The template included in the GGUF may not support tools, in that case you may want to override it w/ `--chat-template-file ...`.
510
+
508
511
*Examples:*
509
512
510
513
You can use either Python `openai` library with appropriate checkpoints:
@@ -549,6 +552,42 @@ Given a ChatML-formatted json description in `messages`, it returns the predicte
0 commit comments