Commit 8f25531

Author: ochafik
tool-call: add basic usage example to server readme

1 parent 4706bdb commit 8f25531

File tree

1 file changed: +39 −0 lines changed

examples/server/README.md

Lines changed: 39 additions & 0 deletions
```diff
@@ -72,6 +72,7 @@ The project is under active development, and we are [looking for feedback and co
 | `--grammar GRAMMAR` | BNF-like grammar to constrain generations (see samples in grammars/ dir) (default: '') |
 | `--grammar-file FNAME` | file to read grammar from |
 | `-j, --json-schema SCHEMA` | JSON schema to constrain generations (https://json-schema.org/), e.g. `{}` for any JSON object<br/>For schemas w/ external $refs, use --grammar + example/json_schema_to_grammar.py instead |
+| `--jinja` | Enable (limited) Jinja templating engine, which is needed for tool use. |
 | `--rope-scaling {none,linear,yarn}` | RoPE frequency scaling method, defaults to linear unless specified by the model |
 | `--rope-scale N` | RoPE context scaling factor, expands context by a factor of N |
 | `--rope-freq-base N` | RoPE base frequency, used by NTK-aware scaling (default: loaded from model) |
```
```diff
@@ -505,6 +506,8 @@ Given a ChatML-formatted json description in `messages`, it returns the predicte
 
 The `response_format` parameter supports both plain JSON output (e.g. `{"type": "json_object"}`) and schema-constrained JSON (e.g. `{"type": "json_object", "schema": {"type": "string", "minLength": 10, "maxLength": 100}}` or `{"type": "json_schema", "schema": {"properties": { "name": { "title": "Name", "type": "string" }, "date": { "title": "Date", "type": "string" }, "participants": { "items": { "type": "string" }, "title": "Participants", "type": "array" } } } }`), similar to other OpenAI-inspired API providers.
 
+The `tools` / `tool_choice` parameters are only supported if the server is started with `--jinja`. The template included in the GGUF may not support tools; in that case, you may want to override it with `--chat-template-file ...`.
+
 *Examples:*
 
 You can use either the Python `openai` library with appropriate checkpoints:
```
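A reply constrained by the `participants` schema shown in the hunk above can be sanity-checked on the client side. The following is a minimal, stdlib-only sketch with a hand-rolled check (not a full JSON Schema validator); the sample reply is illustrative, and note that `participants` is intended as an array of strings per its `items` entry:

```python
import json

# Example schema from the `response_format` documentation above
# (`participants` is an array of strings, per its `items` entry).
schema = {
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "date": {"title": "Date", "type": "string"},
        "participants": {
            "items": {"type": "string"},
            "title": "Participants",
            "type": "array",
        },
    }
}

def check(obj: dict, schema: dict) -> bool:
    """Minimal hand-rolled check for the schema above (not a full validator)."""
    for key, spec in schema["properties"].items():
        if key not in obj:
            return False
        val = obj[key]
        if spec["type"] == "string" and not isinstance(val, str):
            return False
        if spec["type"] == "array":
            if not isinstance(val, list):
                return False
            # Assumes `items` constrains elements to strings, as above.
            if not all(isinstance(x, str) for x in val):
                return False
    return True

# Illustrative reply, not actual server output.
reply = json.loads('{"name": "Standup", "date": "2024-07-01", "participants": ["Ann", "Bob"]}')
print(check(reply, schema))  # → True
```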
````diff
@@ -549,6 +552,42 @@ Given a ChatML-formatted json description in `messages`, it returns the predicte
 }'
 ```
 
+... and even tool usage (needs `--jinja` flag):
+
+```shell
+llama-server --jinja -hfr lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF -hff Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf -fa
+
+curl http://localhost:8080/v1/chat/completions \
+  -d '{
+    "model": "gpt-3.5-turbo",
+    "tools": [
+      {
+        "type": "function",
+        "function": {
+          "name": "ipython",
+          "description": "Runs code in an ipython interpreter and returns the result of the execution after 60 seconds.",
+          "parameters": {
+            "type": "object",
+            "properties": {
+              "code": {
+                "type": "string",
+                "description": "The code to run in the ipython interpreter."
+              }
+            },
+            "required": ["code"]
+          }
+        }
+      }
+    ],
+    "messages": [
+      {
+        "role": "user",
+        "content": "Print a hello world message with python."
+      }
+    ]
+  }'
+```
+
 ### POST `/v1/embeddings`: OpenAI-compatible embeddings API
 
 *Options:*
````
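Once the server answers the tool-use request added in the diff above, a client still has to pull the call out of the response body. The following is a hedged sketch assuming the response follows the OpenAI chat-completions shape with `tool_calls` entries; the sample response below is illustrative, not actual llama-server output:

```python
import json

# Illustrative response in the OpenAI chat-completions shape -- not actual
# server output, just the structure a compatible client would expect.
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "type": "function",
                "function": {
                    "name": "ipython",
                    "arguments": "{\"code\": \"print('hello world')\"}",
                },
            }],
        }
    }]
}

def extract_tool_calls(response: dict) -> list:
    """Return (function name, parsed arguments) pairs from a chat completion."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # In the OpenAI shape, `arguments` is a JSON-encoded string.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls

for name, args in extract_tool_calls(sample_response):
    print(name, args["code"])  # → ipython print('hello world')
```

A real client would dispatch on the function name (here `ipython`), execute the tool, and append the result as a `"role": "tool"` message before asking the model to continue.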

0 commit comments