
Commit 2f2ad90

Add SLM Engine support function calling (#1582)
This pull request introduces significant enhancements to the SLM Engine, focusing on adding function-calling support and improving the build process for ONNX Runtime-GenAI. The most notable changes include the implementation of function-calling capabilities, updates to the build scripts to handle fallback modes, and the addition of new structures and methods to support function tools in the C++ codebase.

### Function-Calling Support

* Added a new section in `README.md` documenting function-calling capabilities, including key features, an example request, and the structured JSON output for function calls (`examples/slm_engine/README.md`).
* Introduced new structs (`FunctionParameter`, `FunctionTool`, `FunctionCallOptions`, `FunctionCall`, and `FunctionCallResult`) to represent function tools, parameters, and results in the SLM Engine (`examples/slm_engine/src/cpp/slm_engine.h`).
* Added methods to parse tools from JSON, format input with tools, create a grammar for function calling, and handle function call results (`examples/slm_engine/src/cpp/slm_engine.h`).
* Updated `InputDecoder` to handle tools and added a new `TOOL` role to support function-calling scenarios (`examples/slm_engine/src/cpp/input_decoder.cpp`, `examples/slm_engine/src/cpp/input_decoder.h`).

### Build Process Improvements

* Enhanced the ONNX Runtime-GenAI build to support guidance mode for function calling, with a fallback mechanism if the guidance build fails (`examples/slm_engine/build_scripts/build_deps.py`).
* Updated the build scripts with descriptive logging for function-calling support (`examples/slm_engine/build_scripts/build_deps.py`).

### Server Capabilities Update

* Modified the SLM server to include "function_calling" as a capability in its response, reflecting the new feature (`examples/slm_engine/src/cpp/slm_server.cpp`).

These changes collectively enhance the SLM Engine's functionality, making it more adaptable for use cases requiring intelligent function invocation and robust build processes.
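The new C++ structs are only named in the summary above; as a rough illustration of the shape of data they represent, a Python sketch of equivalent records might look like this (all field names below are assumptions for illustration — only the struct names come from the PR, and the actual members live in `slm_engine.h`):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical Python mirror of the structs added in slm_engine.h.
# Field names are illustrative guesses; only the struct names come from the PR.

@dataclass
class FunctionParameter:
    name: str
    type: str
    description: str

@dataclass
class FunctionTool:
    name: str
    description: str
    parameters: List[FunctionParameter] = field(default_factory=list)

@dataclass
class FunctionCall:
    name: str
    arguments: dict

tool = FunctionTool(
    name="booking_hotels",
    description="booking hotel",
    parameters=[FunctionParameter("destination", "string", "The name of the city")],
)
print(tool.name)  # booking_hotels
```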
1 parent 7adf3b7 commit 2f2ad90

File tree

13 files changed (+1552, −81 lines)


examples/slm_engine/README.md

Lines changed: 78 additions & 0 deletions

The hunk at `@@ -320,6 +320,84 @@` (immediately after the REST API `</details>` block) adds the following section:

### Function Calling Support

The SLM Engine supports function calling, enabling the model to intelligently select and invoke predefined functions based on user requests. This feature allows developers to extend the model's capabilities by providing custom tools and functions that the AI can use to perform specific tasks.

#### Key Features:

- **Tool Definition**: Define custom functions with parameters and descriptions
- **Intelligent Function Selection**: The model automatically determines which function to call based on user input
- **Structured Output**: Returns function calls in a standardized JSON format

#### Example Function Calling Request

The following example demonstrates how to use function calling with the SLM Engine for booking flights and hotels:

```bash
curl -X POST http://localhost:8000/completions -H "Content-Type: application/json" --data '{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant with these tools."
    },
    {
      "role": "user",
      "content": "book flight ticket from Beijing to Paris (using airport code) from 2025-12-04 to 2025-12-10, then book hotel from 2025-12-04 to 2025-12-10 in Paris"
    }
  ],
  "tools": [
    {
      "name": "booking_flight_tickets",
      "description": "booking flights",
      "parameters": {
        "origin_airport_code": {
          "description": "The departure airport code",
          "type": "string"
        },
        "destination_airport_code": {
          "description": "The destination airport code",
          "type": "string"
        },
        "departure_date": {
          "description": "The date of the outbound flight",
          "type": "string"
        },
        "return_date": {
          "description": "The date of the return flight",
          "type": "string"
        }
      }
    },
    {
      "name": "booking_hotels",
      "description": "booking hotels",
      "parameters": {
        "destination": {
          "description": "The name of the city",
          "type": "string"
        },
        "check_in_date": {
          "description": "The check-in date",
          "type": "string"
        },
        "checkout_date": {
          "description": "The check-out date",
          "type": "string"
        }
      }
    }
  ],
  "temperature": 0.00001,
  "max_tokens": 4096,
  "top_p": 1.0,
  "do_sample": false
}'
```

The model will analyze the user's request and generate appropriate function calls with the correct parameters, enabling seamless integration with external APIs and services.

***Note*** - Function calling is currently supported only for the Phi, Llama, and Qwen3 models.
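The same request can also be assembled programmatically. Below is a minimal Python sketch that builds the identical payload (it assumes the same local server endpoint as the curl example; `post_completion` is an illustrative helper, not part of the SLM Engine):

```python
import json
import urllib.request

# Build the same function-calling payload as the curl example above.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant with these tools."},
        {"role": "user",
         "content": "book flight ticket from Beijing to Paris (using airport code) "
                    "from 2025-12-04 to 2025-12-10, then book hotel from 2025-12-04 "
                    "to 2025-12-10 in Paris"},
    ],
    "tools": [
        {
            "name": "booking_flight_tickets",
            "description": "booking flights",
            "parameters": {
                "origin_airport_code": {"description": "The departure airport code", "type": "string"},
                "destination_airport_code": {"description": "The destination airport code", "type": "string"},
                "departure_date": {"description": "The date of the outbound flight", "type": "string"},
                "return_date": {"description": "The date of the return flight", "type": "string"},
            },
        },
        {
            "name": "booking_hotels",
            "description": "booking hotels",
            "parameters": {
                "destination": {"description": "The name of the city", "type": "string"},
                "check_in_date": {"description": "The check-in date", "type": "string"},
                "checkout_date": {"description": "The check-out date", "type": "string"},
            },
        },
    ],
    "temperature": 0.00001,
    "max_tokens": 4096,
    "top_p": 1.0,
    "do_sample": False,
}

def post_completion(url="http://localhost:8000/completions"):
    # Requires a running SLM server; illustrative helper only.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```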
### C++ Application using the SLMEngine

The SLMEngine is designed to be used from another C++ application running on the Edge. Integrating the SLMEngine into another C++ project using cmake is illustrated below.
examples/slm_engine/build_scripts/build.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -70,7 +70,7 @@ def main():
         cmake_generator,
         TOPLEVEL_DIR,
         f"-DARTIFACTS_DIR={artifacts_dir}",
-        f"-DCMAKE_BUILD_TYPE={args.build_type}",
+        f"-DCMAKE_BUILD_TYPE={args.build_type}"
     ]

     # We keep the build directory prefix as same as that's returned by the
```

examples/slm_engine/build_scripts/build_deps.py

Lines changed: 20 additions & 2 deletions

```diff
@@ -212,6 +212,7 @@ def build_ort(args, build_dir, artifacts_dir):
         "--parallel",
         "--config",
         args.build_type,
+        # "--use_guidance",
     ]
     if args.android:
         cmd_args.extend(
@@ -330,7 +331,7 @@ def build_ort_genai(args, artifacts_dir, ort_home):
         raise Exception("Failed to update submodules")

     # Now build the ORT-GenAI library
-    print(f"{MAGENTA}Building ONNX Runtime-GenAI{CLEAR}")
+    print(f"{MAGENTA}Building ONNX Runtime-GenAI with Guidance Support for Function Calling{CLEAR}")
     # Prepare the command arguments
     cmd_args = [
         "--skip_wheel",
@@ -340,6 +341,10 @@ def build_ort_genai(args, artifacts_dir, ort_home):
         args.build_type,
         "--cmake_extra_defines",
         "ENABLE_PYTHON=OFF",
+        # "USE_GUIDANCE=ON",
+        # "--use_guidance",  # Enable guidance support for constrained JSON generation
+        # Note: If Python linking issues occur, comment out --use_guidance above
+        # Function calling will work in both guidance and fallback modes
     ]
     if ort_home is None:
         raise Exception(
@@ -368,7 +373,20 @@ def build_ort_genai(args, artifacts_dir, ort_home):
     python_executable = sys.executable
     result = subprocess.call([python_executable, "build.py"] + cmd_args)
     if result != 0:
-        raise Exception(f"{RED}Failed to build ORT-GenAI{CLEAR}")
+        # If guidance build fails, try fallback mode
+        print(f"{RED}Guidance build failed. Attempting fallback mode without guidance...{CLEAR}")
+        # Remove --use_guidance from cmd_args
+        if "--use_guidance" in cmd_args:
+            cmd_args.remove("--use_guidance")
+
+        print(f"{MAGENTA}Running build.py with fallback args: {cmd_args}{CLEAR}")
+        result = subprocess.call([python_executable, "build.py"] + cmd_args)
+        if result != 0:
+            raise Exception(f"{RED}Failed to build ORT-GenAI in both guidance and fallback modes{CLEAR}")
+        else:
+            print(f"{MAGENTA}Successfully built ORT-GenAI in fallback mode{CLEAR}")
+    else:
+        print(f"{MAGENTA}Successfully built ORT-GenAI with guidance support{CLEAR}")

     # Now install the ORT-GenAI library
     build_dir_name = f"build/{get_platform_dirname(args)}/{args.build_type}"
```
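The fallback added above is a plain retry-without-flag pattern; it can be sketched generically as follows (`build_with_fallback` and its `run_build` hook are illustrative names for this sketch, not functions in `build_deps.py`):

```python
import subprocess
import sys

def build_with_fallback(cmd_args, run_build=None):
    """Run a build once; if it fails, retry without --use_guidance,
    mirroring the fallback logic added to build_deps.py."""
    if run_build is None:
        run_build = lambda args: subprocess.call([sys.executable, "build.py"] + args)
    if run_build(cmd_args) == 0:
        return "guidance"
    # Drop the guidance flag and retry once in fallback mode.
    fallback_args = [a for a in cmd_args if a != "--use_guidance"]
    if run_build(fallback_args) != 0:
        raise RuntimeError("build failed in both guidance and fallback modes")
    return "fallback"

# Exercise the logic with a stub that fails only when --use_guidance is present:
mode = build_with_fallback(
    ["--use_guidance", "--config", "Release"],
    run_build=lambda args: 1 if "--use_guidance" in args else 0,
)
print(mode)  # fallback
```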
Lines changed: 1 addition & 1 deletion

```diff
@@ -1 +1 @@
-2.0.0
+3.0.0
```

examples/slm_engine/src/cpp/input_decoder.cpp

Lines changed: 6 additions & 0 deletions

```diff
@@ -90,6 +90,12 @@ class OpenAIInputDecoder : public InputDecoder {
                << CLEAR << endl;
        }
      }
+
+    // Handle tools parameter for function calling
+    if (json_msg.contains("tools")) {
+      decoded_params.ToolsJson = json_msg["tools"].dump();
+      decoded_params.HasTools = true;
+    }
   } catch (json::parse_error& err) {
     cout << RED << "Error in JSON At: " << err.what() << CLEAR << endl;
     return false;
```
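The decoder change above keeps the raw `tools` array and sets a flag; the same step in Python terms looks roughly like this (a sketch — `decode_tools` is an illustrative name, and the dict keys simply mirror the C++ members `ToolsJson` and `HasTools`):

```python
import json

def decode_tools(json_msg):
    # Mirror of the tools-handling branch added to OpenAIInputDecoder:
    # store the raw tools JSON and flag its presence.
    params = {"ToolsJson": "", "HasTools": False}
    if "tools" in json_msg:
        params["ToolsJson"] = json.dumps(json_msg["tools"])
        params["HasTools"] = True
    return params

msg = json.loads('{"messages": [], "tools": [{"name": "booking_hotels"}]}')
print(decode_tools(msg)["HasTools"])  # True
```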

examples/slm_engine/src/cpp/input_decoder.h

Lines changed: 15 additions & 1 deletion

```diff
@@ -24,14 +24,17 @@ class InputDecoder {
   struct InputParams {
     enum class Role { SYSTEM,
                       USER,
-                      ASSISTANT };
+                      ASSISTANT,
+                      TOOL };

     // Utility function to convert string to Role
     static Role ToRole(const std::string& role) {
       if (role == "system") {
         return Role::SYSTEM;
       } else if (role == "user") {
         return Role::USER;
+      } else if (role == "tool") {
+        return Role::TOOL;
       } else {
         return Role::ASSISTANT;
       }
@@ -51,11 +54,16 @@ class InputDecoder {
     float TopP;
     uint32_t TopK;

+    // Function calling support
+    std::string ToolsJson;  // Raw tools JSON string from input
+    bool HasTools;
+
     explicit InputParams() {
       MaxGeneratedTokens = 512;
       Temperature = 0.00000000000001f;
       TopK = 50;
       TopP = 1.0f;
+      HasTools = false;
     }

     std::string get_messages() {
@@ -68,6 +76,9 @@ class InputDecoder {
         case Role::USER:
           output << "{\"role\": \"user\", ";
           break;
+        case Role::TOOL:
+          output << "{\"role\": \"tool\", ";
+          break;
         case Role::ASSISTANT:
           output << "{\"role\": \"assistant\", ";
           break;
@@ -89,6 +100,9 @@ class InputDecoder {
         case Role::USER:
           output << "USER";
           break;
+        case Role::TOOL:
+          output << "TOOL";
+          break;
         case Role::ASSISTANT:
           output << "ASSISTANT";
           break;
```
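The extended role mapping above (with `tool` added and unknown roles still defaulting to `ASSISTANT`) can be sketched in Python as follows (`to_role` is an illustrative stand-in for the C++ `ToRole`):

```python
def to_role(role):
    # Mirrors InputDecoder::InputParams::ToRole after this change:
    # "tool" is now recognized; anything unknown falls back to ASSISTANT.
    mapping = {"system": "SYSTEM", "user": "USER", "tool": "TOOL"}
    return mapping.get(role, "ASSISTANT")

print(to_role("tool"))     # TOOL
print(to_role("unknown"))  # ASSISTANT
```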
