
Commit 0007de8

proof guide.en-gb.md
1 parent 189fcc1 commit 0007de8

File tree

1 file changed: +45 −35 lines

  • pages/public_cloud/ai_machine_learning/endpoints_guide_06_function_calling


pages/public_cloud/ai_machine_learning/endpoints_guide_06_function_calling/guide.en-gb.md

Lines changed: 45 additions & 35 deletions
@@ -1,7 +1,7 @@
 ---
 title: AI Endpoints - Function Calling
 excerpt: Learn how to use Function Calling with OVHcloud AI Endpoints
-updated: 2025-08-01
+updated: 2025-08-06
 ---
 
 > [!primary]
@@ -15,8 +15,7 @@ updated: 2025-08-01
 
 **Function Calling**, also known as tool calling, is a feature that enables a large language model (LLM) to trigger user-defined functions (also named tools). These tools are defined by the developer and implement specific behaviors such as calling an API, fetching data or calculating values, which extends the capabilities of the LLM.
 
-The LLM will identify which tool(s) to call and the arguments to use.
-This feature can be used to develop assistants or agents, for instance.
+The LLM will identify which tool(s) to call and the arguments to use. This feature can be used to develop assistants or agents, for instance.
 
 ## Objective
 
@@ -27,51 +26,55 @@ Visit our [Catalog](https://endpoints.ai.cloud.ovh.net/catalog) to find out whic
 
 ## Requirements
 
-We use Python for the examples provided through this guide.
+We use Python for the examples provided in this guide.
 
 Make sure you have a [Python](https://www.python.org/) environment configured, and install the [openai client](https://pypi.org/project/openai/).
+
 ```sh
 pip install openai
 ```
 
 ### Authentication & rate limiting
 
-All the examples provided in this guide are using the anonymous authentication which makes it simpler to use but may cause rate limiting issues.
-If you wish to enable authentication using your own token, simply specify your API key within the requests.
-Follow the following instructions in the [AI Endpoints - Getting Started](/pages/public_cloud/ai_machine_learning/endpoints_guide_01_getting_started) for more information on authentication.
+All the examples provided in this guide use anonymous authentication, which is simpler to use but may cause rate limiting issues. If you wish to enable authentication using your own token, simply specify your API key within the requests.
+
+Follow the instructions in the [AI Endpoints - Getting Started](/pages/public_cloud/ai_machine_learning/endpoints_guide_01_getting_started) guide for more information on authentication.
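
For reference, a minimal sketch of an authenticated client setup with the openai SDK (the endpoint URL matches the cURL example later in this guide; the environment variable name is a placeholder, not from the guide):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://llama-3-1-8b-instruct.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1",
    api_key=os.environ.get("AI_ENDPOINTS_API_KEY", ""),  # empty key = anonymous, rate-limited
)
```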
 
 ## Function Calling overview
 
 The workflow to use function calling is described below:
+
 1. **Define tools**: tell the model what tools it can use, with a JSON schema for each tool.
 2. **Call the model with tools**: pass tools along with your system and user messages to the model, which may generate tool calls.
 3. **Process tool calls**: for each tool call returned by the model, execute the actual implementation of the tool in your code.
 4. **Call the model with tool responses**: send a new request to the model, with the conversation updated with the tool call results.
-4. **Final response**: process the final generated answer, which takes the tools results into account.
+5. **Final response**: process the final generated answer, which takes the tool results into account.
 
 This diagram illustrates the workflow:
 
 ![Function calling workflow](images/function_calling_workflow.png)
 
-## Example: a time-tracking assistant
+**Example: a time-tracking assistant**
 
 To illustrate the use of function calling and progressively introduce the important notions related to this feature, we are going to develop a time-tracking assistant, step by step.
 
 The assistant will be able to:
-* log time spent on a task
-* generate a time report
 
-Each task has a name, category and total duration in minutes. Categories are a fixed list of strings, for example "Code" or "Meetings".
-A time report can be generated for a category of task.
+* log time spent on a task
+* generate a time report
+
+Each task has a name, a category and a total duration in minutes. Categories are a fixed list of strings, for example "Code" or "Meetings". A time report can be generated for a category of tasks.
 
 The user will be able to interact with the assistant to log time and get information about how time was spent.
 
 ### Define tools
 
-Our time-tracking assistant will use two tools :
+Our time-tracking assistant will use two tools:
+
 * `log_work`: log time spent on a task. Takes the name of the task, the category, the duration and the unit (minutes or hours).
 For example, to log 2 hours on documentation writing, you would call `log_work("User guide", "Documentation", 2, "hours")`.
-* `time_report`: get JSON data about all tasks of a given category, and the total duration, in a given time unit (minutes or hours).
+
+* `time_report`: get JSON data about all tasks of a given category, and the total duration in a given time unit (minutes or hours).
 For example, to get the breakdown of time spent on coding tasks, in hours, you would call `time_report("Code", "hours")`.
 
 To get the model to use those tools, first we have to declare them with JSON schemas, in a `tools` list that we will pass to the Chat Completion API.
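
The full `TOOLS` list is elided from this diff (the next hunk shows only its opening line). A minimal sketch of what the `log_work` declaration could look like, following the OpenAI Chat Completions tool schema; the parameter names match the arguments shown in the outputs later in this guide, but the descriptions are illustrative:

```python
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "log_work",
            "description": "Log time spent on a task.",
            "parameters": {
                "type": "object",
                "properties": {
                    "task_name": {"type": "string", "description": "Name of the task"},
                    "task_category": {"type": "string", "description": "Category of the task"},
                    "duration": {"type": "number", "description": "Time spent on the task"},
                    "unit": {"type": "string", "enum": ["minutes", "hours"]},
                },
                "required": ["task_name", "task_category", "duration", "unit"],
            },
        },
    },
    # ... the `time_report` tool is declared the same way
]
```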
@@ -143,8 +146,7 @@ TOOLS = [
 
 ### Generate tool calls
 
-With our tools ready, we can now try to call the model and see if it understands our tools definition.
-We use the OpenAI Python SDK to call the ``/v1/chat/completions`` route on the endpoint, passing the tools definition in the `tools` parameter.
+With our tools ready, we can now try to call the model and see if it understands our tool definitions. We use the OpenAI Python SDK to call the ``/v1/chat/completions`` route on the endpoint, passing the tool definitions in the `tools` parameter.
 
 Let's send a simple user message: `log 1 hour team meeting` and see what the model answers.
 
@@ -181,6 +183,7 @@ print(assistant_response.to_json())
 ```
 
 Output:
+
 ```json
 {
   "role": "assistant",
@@ -202,6 +205,7 @@ We see that the model correctly identified that it needed to call the `log_work`
 The `tool_calls` list contains the tool calls the model generated in response to our user message.
 The `name` and `arguments` fields specify which tool to call and which parameters to pass to the function.
 The `id` is a unique identifier for this tool call, that we will need later on.
+
 You can have multiple tool calls in this list.
 
 Under the hood, the model has recognized that the user's intent was related to the set of tools provided, and generated a sequence of specific tokens that were post-processed to create a tool call object.
@@ -210,9 +214,9 @@ We add this message to the conversation so that the model is aware of this tool
 
 ### Process tool calls
 
-Now that we see that the model is able to generate tool calls, we need to code the Python implementation of the tools, so that we can process the tool calls the LLM will generate and actually start to log time!
-Each task is stored in a dict, with the name as the key.
-Categories are a fixed list.
+Now that we have seen that the model is able to generate tool calls, we need to code the Python implementation of the tools, so that we can process the tool calls generated by the LLM and actually start logging time!
+
+Each task is stored in a dict, with the name as the key; categories are a fixed list.
 
 We define the two functions, `log_work` and `time_report`, in the Python code below:
 
@@ -279,8 +283,7 @@ def to_minutes(duration: float, unit: str) -> float:
     raise ValueError("Invalid unit. Must be 'minutes', or 'hours'.")
 ```
 
-Now, let's see how we can process tool calls generated by the model.
-
+Now, let's see how we can process tool calls generated by the model:
 
 ```python
 # this map tells us which Python function to call for a given tool
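 # (the dispatch code is elided from this diff; a minimal sketch, assuming the
 #  tool implementations above)
 NAME_TO_FUNCTION = {"log_work": log_work, "time_report": time_report}
 # for each generated tool call: parse the JSON arguments, look up the Python
 # function in this map, and execute it with those arguments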
@@ -307,6 +310,7 @@ if assistant_response.tool_calls:
 ```
 
 Output:
+
 ```
 < 1 tool(s) to call
 > Execute tool log_work with arguments {'task_name': 'team meeting', 'task_category': 'Meetings', 'duration': 1, 'unit': 'hours'}
@@ -317,9 +321,10 @@ We see that we successfully created a task called "team meeting", in the "Meetin
 
 ### Send tool call results and get the final response
 
-Now that we have executed our tool calls, we have to send the result back to the model, so that it can generate a new response that takes this new information into account, to tell the user the task has been created successfully or to give the time report for instance.
+Now that we have executed our tool calls, we need to send the results back to the model so that it can generate a new response taking this new information into account, for example to inform the user that the task has been successfully created or to provide the time report.
 
 All we have to do is add the tool results as new `tool` messages into the conversation, so we'll update our code:
+
 ```python
 if assistant_response.tool_calls:
     print(f"<\t{len(assistant_response.tool_calls)} tool(s) to call")
@@ -360,6 +365,7 @@ print(f"<\t\tAssistant final answer:\n{response.choices[0].message.content}")
 ```
 
 Output:
+
 ```
 < 1 tool(s) to call
 > Execute tool log_work with arguments {'task_name': 'team meeting', 'task_category': 'Meetings', 'duration': 1, 'unit': 'hours'}
@@ -378,8 +384,9 @@ The assistant has generated a response acknowledging the creation of the task.
 ### Add a system prompt
 
 To make our assistant more robust and powerful, it can be useful to add a system prompt that:
-* explains what is expected from the model
-* provides useful information to the model, such as the current existing tasks and categories
+
+* explains what is expected from the model.
+* provides useful information to the model, such as the current existing tasks and categories.
 
 ```python
 SYSTEM_PROMPT = \
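 # (the prompt string is elided from this diff; per the surrounding text, it
 #  explains the assistant's job and injects the current tasks and categories,
 #  e.g. through placeholders filled in before each call)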
@@ -409,10 +416,11 @@ With this system prompt, the model will be able to use data about existing tasks
 ### Putting it all together
 
 Now we can combine all the notions we've seen so far to create a `query` method that will:
-* call the model with the formatted system prompt and user message
-* process tool calls
-* call the model a second time with the tool results
-* output the final answer
+
+* call the model with the formatted system prompt and user message
+* process tool calls
+* call the model a second time with the tool results
+* output the final answer
 
 ```python
 def query(user_prompt: str):
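     # (body elided from this diff; it chains the steps listed above:
     #  call the model with the formatted system prompt, user message and TOOLS,
     #  execute each generated tool call, append the results as "tool" messages,
     #  then call the model again and print the final answer)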
@@ -486,6 +494,7 @@ query("on which task did I spent most hours coding?")
 ```
 
 Output:
+
 ```
 > Querying assistant with user prompt: Spent 2 hours coding on Feature A
 < 1 tool(s) to call
@@ -585,6 +594,7 @@ Based on the tracking data, the task on which you spent the most hours coding is
 Mission accomplished!
 
 ## Tips and best practices
+
 This section contains additional tips that may improve the performance of Function Calling queries.
 
 ### Tool choice
@@ -605,6 +615,7 @@ Here are the available values for this parameter and the impact on the output.
 It is possible to use Function Calling in streaming mode, by setting `stream` to `true` in your request.
 
 Let's see an example with cURL and the LLaMa 3.1 8B model:
+
 ```bash
 curl -X 'POST' \
   'https://llama-3-1-8b-instruct.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1/chat/completions' \
@@ -661,6 +672,7 @@ curl -X 'POST'
 ```
 
 You will get tool call deltas in the server-sent events chunks, with this format:
+
 ```
 data: {...,"choices":[{"index":0,"delta":{"role":"assistant","content":""}}],...,"object":"chat.completion.chunk"}
 data: {...,"choices":[{"index":0,"delta":{"role":"assistant","tool_calls":[{"index":0,"id":"chatcmpl-tool-e41bfee4ae1346bbbb4336061037e2b5","type":"function","function":{"name":"get_current_weather","arguments":""}}]}}],...,"object":"chat.completion.chunk"}
@@ -699,19 +711,18 @@ final_tool_calls = [v for (k, v) in sorted(final_tool_calls_dict.items())]
 
 ### Parallel tool calls
 
-Some models are able to generate multiple tool calls in one round (see the time-tracking tutorial above for an example).
-To control this behavior, the OpenAI specification allows to pass a `parallel_tool_calls` boolean parameter.
+Some models are able to generate multiple tool calls in one round (see the time-tracking tutorial above for an example). To control this behavior, the OpenAI specification allows passing a `parallel_tool_calls` boolean parameter.
 
-If `false`, the model can only generate one tool call at most.
-This case is currently not supported by AI Endpoints.
+If `false`, the model can generate one tool call at most. This case is currently not supported by AI Endpoints.
 
 If you need your system to process only one tool call at a time, or if the model you are using doesn't support multiple tool calls, we suggest you pick the first one, process it, and call the model again, as sketched below.
 
-Please note that LLaMa models don't support multiple tool calls between users and assistants messages.
+Please note that LLaMa models do not support multiple tool calls between user and assistant messages.
 
 ### Prompting & additional parameters
 
 Some additional considerations regarding prompts and model parameters:
+
 - Most models tend to perform better when using lower temperature for function calling.
 - The use of a system prompt is recommended, to ground the model into using the tools at its disposal. Whether a system prompt is defined or not, a description of the tools will usually be included in the tokens sent to the model (see the model chat template for more details).
 - If you know in advance that your model needs to call tools, use the `tool_choice=required` parameter to make sure it generates at least one tool call.
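A sketch applying these parameters with the openai SDK (reusing the `client`, `messages` and `TOOLS` names assumed in the earlier sketches; the model name is a placeholder):

```python
response = client.chat.completions.create(
    model="Meta-Llama-3_1-8B-Instruct",  # placeholder model name
    messages=messages,
    tools=TOOLS,
    temperature=0.2,         # lower temperature tends to help function calling
    tool_choice="required",  # force the model to generate at least one tool call
)
```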
@@ -735,4 +746,3 @@ If you need training or technical assistance to implement our solutions, contact
 Please send us your questions, feedback and suggestions to improve the service:
 
 - On the OVHcloud [Discord server](https://discord.gg/ovhcloud)
-