
Commit 792b44f

ExtReMLapin and Pierre F authored
server : add documentation for parallel_tool_calls param (ggml-org#15647)
Co-authored-by: Pierre F <[email protected]>
1 parent 8101786 commit 792b44f

File tree

2 files changed: +4 −0 lines


docs/function-calling.md

Lines changed: 2 additions & 0 deletions
@@ -21,6 +21,8 @@ Function calling is supported for all models (see https://github.com/ggml-org/ll
 - Use `--chat-template-file` to override the template when appropriate (see examples below)
 - Generic support may consume more tokens and be less efficient than a model's native format.

+- Multiple/parallel tool calling is supported on some models but disabled by default; enable it by passing `"parallel_tool_calls": true` in the completion endpoint payload.
+
 <details>
 <summary>Show some common templates and which format handler they use</summary>
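To make the added docs line concrete, here is a minimal sketch of a completion request that opts into parallel tool calls. The endpoint path, port, model name, and the `get_weather` tool are illustrative assumptions, not part of this commit.

```python
# Hedged sketch: send a chat-completion request to a locally running llama-server
# with "parallel_tool_calls" enabled. Port, model name, and the tool definition
# are placeholder assumptions for illustration only.
import requests

payload = {
    "model": "local-model",  # placeholder; the server uses whatever model it loaded
    "messages": [
        {"role": "user", "content": "What is the weather in Paris and in Tokyo?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # The flag documented by this commit: allow the model to emit several
    # tool calls in one response (disabled by default).
    "parallel_tool_calls": True,
}

resp = requests.post("http://localhost:8080/v1/chat/completions", json=payload)
resp.raise_for_status()
# With an OpenAI-compatible response shape, any tool calls appear here.
print(resp.json()["choices"][0]["message"].get("tool_calls"))
```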

tools/server/README.md

Lines changed: 2 additions & 0 deletions
@@ -1143,6 +1143,8 @@ The `response_format` parameter supports both plain JSON output (e.g. `{"type":
 
 `parse_tool_calls`: Whether to parse the generated tool call.
 
+`parallel_tool_calls`: Whether to enable parallel/multiple tool calls (only supported on some models; support is detected based on the jinja chat template).
+
 *Examples:*
 
 You can use either Python `openai` library with appropriate checkpoints:
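Since the README section referenced here points to the Python `openai` library, one possible way to forward the new field through that client is its `extra_body` argument. The base URL, API key, model name, and tool schema below are assumptions for illustration, not part of the commit.

```python
# Hedged sketch: reuse the OpenAI Python client against a local llama-server.
# base_url, api_key, model name, and the example tool are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

response = client.chat.completions.create(
    model="local-model",  # placeholder; llama-server serves the model it loaded
    messages=[{"role": "user", "content": "Look up the weather in Paris and Tokyo."}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Forward the raw payload field documented above; extra_body merges extra
    # JSON keys into the request body sent to the server.
    extra_body={"parallel_tool_calls": True},
)

# If the model emitted multiple tool calls, they are listed here.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```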
