Commit e484944

Deepseek R1 function calls (more formats) (#652)
* Implement function calling / tools for ik_llama.cpp for Kimi K2

* Implement basic tool choice

* Backport llama.cpp tool calls support

* Enhance function calls with improved chat parser and string utilities

- Add new chat.h/chat.cpp and chat-parser.h/chat-parser.cpp for better chat handling
- Improve function calls parsing with fallback to llama.cpp builder pattern
- Add string utility functions (starts_with, ends_with, find_partial_stop)
- Update README with function calls testing instructions
- Enhance Kimi K2 parser and function calls documentation
- Add comprehensive test suite for function calls
- Update CMakeLists.txt and Makefile for new components

* Enhance function calling with unified streaming and parser improvements

- Fix streaming content cleanup to prevent function syntax in output
- Unify content extraction patterns with llama.cpp approach
- Improve Kimi K2 parser robustness and partial content handling
- Add comprehensive test coverage for function call scenarios
- Optimize chat message parsing and diff computation

* Replace hardcoded values in kimi_k2_parser.hpp with named constants

- Add compile-time constants for all token format markers
- Add compile-time constants for XML format markers
- Add compile-time constants for simple format patterns
- Replace all hardcoded string literals with named constants
- Use compile-time length calculation to avoid manual counting
- Improve maintainability and reduce magic numbers throughout parser

* Fix duplicate common_chat_parse definition

- Remove duplicate implementation from chat-parser.cpp
- Keep single implementation in chat.cpp following llama.cpp patterns
- Resolves linker error: multiple definition of common_chat_parse

* Fix JSON assertion failure in function call parsing

- Add proper validation that 'function' field is an object before accessing nested keys
- Handle missing 'arguments' field gracefully with default "{}"
- Prevents crash when parsing malformed tool call JSON structures

* Add comprehensive Qwen3 XML tool calling support with unit tests

- Implement Qwen3 XML parser with <tool_call>{"name": "func", "arguments": {...}}</tool_call> format
- Add model detection and routing for Qwen3 vs Kimi-K2 formats
- Create 8 comprehensive unit tests covering parsing, streaming, error handling
- Fix token format cleaning bug in kimi_k2_parser.hpp processing order
- Remove progressive parsing code and related utilities
- Add tool injection support for Qwen3 format in server utils

* Add DeepSeek R1 function calling support with comprehensive unit tests

- Implement complete DeepSeek R1 tool call parsing in common_chat_parser.cpp
- Add DeepSeek R1 model detection and tool injection in deepseek_r1_tools.hpp
- Update function_calls.hpp with DeepSeek R1 integration and content extraction
- Update documentation to reflect support for Kimi-K2, Qwen3, and DeepSeek R1 models
- Add comprehensive unit tests for DeepSeek R1 reasoning, tool calls, and integration
- Port exact implementation patterns from original llama.cpp for compatibility

Key features:
- Native DeepSeek R1 format: <|tool▁calls▁begin|>function<|tool▁sep|>name```json{}```<|tool▁call▁end|><|tool▁calls▁end|>
- Reasoning content extraction from <think>...</think> tags
- Multiple tool calls support with separate call blocks
- Model detection for deepseek-r1, deepseek_r1 naming patterns
- Integration with incremental parsing and streaming support

* Add partial parsing support for JSON and regex

- json-partial.h/cpp: JSON partial parsing functionality
- regex-partial.h/cpp: Regex partial parsing functionality

* Add format_chat integration tests for Qwen3 tool injection

- Add test_qwen3_format_chat_integration() to validate tool injection pipeline
- Test tool injection conditions and system message enhancement
- Verify JSON formatting and anti-preamble instructions
- Add comprehensive test documentation

Tests confirm tool injection works correctly - conversational preamble
issue is not in ik_llama.cpp but likely in UI configuration.

* Fix Qwen3 tool call parsing - pass model name to parser

Server was not passing model name to parse_chat_message_incremental(),
causing Qwen3 to fall back to Kimi-K2 parser and return tool calls
as content instead of proper tool_calls array.

* Fix non-streaming path to use model-specific parsing

Non-streaming responses were hardcoded to use Kimi-K2 format,
causing Qwen3 XML tool calls to be returned as content instead
of proper tool_calls array. Now uses same model detection as
streaming path for consistency.

* Update Qwen3 function call handling in server and tests

- Enhanced server function call detection and response formatting
- Improved test coverage for Qwen3 tool call scenarios
- Refined XML parsing for better tool execution support

* Add DeepSeek-R1 function call parsing support

Implements comprehensive parsing for all 4 DeepSeek-R1 function call formats:
- Format 1: Standard function call syntax (already supported)
- Format 2: Alternative function call patterns (already supported)
- Format 3: Tools array format - function\n```json\n{"tools": [...]}
- Format 4: XML wrapped format - <tool_call>function</think>Name\n```json\n{...}```</tool_call>

Key changes:
- Added parse_deepseek_r1_tools_array() following original parse_prefixed_json_tool_call_array pattern
- Added parse_deepseek_r1_xml_wrapped() following Hermes-2-Pro XML wrapper patterns
- Integrated both parsers into exception handling chain for robust fallback
- Added comprehensive TDD test coverage for all formats
- Anonymized all confidential information while preserving functionality

Resolves tool_calls_count=0 issue where DeepSeek-R1 models generated
valid tool calls but server failed to parse them correctly.

* Update function_calls.md documentation for DeepSeek-R1 Format 4

- Added Format 4 (XML wrapped) documentation with examples
- Updated implementation notes with correct parser order (3→4→1→2)
- Marked all DeepSeek-R1 formats as working (July 2025 update)
- Updated test status for Format 3 and 4 as passing
- Added parse_deepseek_r1_xml_wrapped() function reference
- Corrected implementation file line numbers

* Fix merge conflict in test-function-calls.cpp

- Removed incomplete merge conflict marker from line 3027
- Ensured all tests compile and pass successfully
- All DeepSeek-R1 formats (1-4) working correctly
- All streaming and content cleaning tests passing
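
Reviewer note: the DeepSeek-R1 output shapes listed in the commit message can be told apart with plain substring checks, tried in the 3→4→1→2 order the message mentions. The following is a minimal sketch under stated assumptions, not the code in this commit. `deepseek_r1_shape` and `classify_deepseek_r1_output` are hypothetical names, the marker strings are copied as printed in the message, and Formats 1 and 2 are folded into one case because the message does not spell out how they differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration; none of this is part of the commit.
enum class deepseek_r1_shape { tools_array, xml_wrapped, native_markers, plain_content };

static bool contains(const std::string & text, const std::string & needle) {
    return text.find(needle) != std::string::npos;
}

deepseek_r1_shape classify_deepseek_r1_output(const std::string & text) {
    // Format 3: a bare "function" header followed by JSON that carries a "tools" key.
    const std::string simple_prefix = "function\n```json\n";
    const auto prefix_at = text.find(simple_prefix);
    if (prefix_at != std::string::npos &&
        text.find("\"tools\"", prefix_at + simple_prefix.size()) != std::string::npos) {
        return deepseek_r1_shape::tools_array;
    }
    // Format 4: <tool_call> wrapper with "function</think>" before the function name.
    if (contains(text, "<tool_call>") && contains(text, "function</think>")) {
        return deepseek_r1_shape::xml_wrapped;
    }
    // Formats 1 and 2: native begin marker (spelled as printed in the commit message).
    if (contains(text, "<|tool▁calls▁begin|>")) {
        return deepseek_r1_shape::native_markers;
    }
    return deepseek_r1_shape::plain_content;
}

// Usage example.
void classify_example() {
    assert(classify_deepseek_r1_output("function\n```json\n{\"tools\": []}\n```")
           == deepseek_r1_shape::tools_array);
    assert(classify_deepseek_r1_output("Just a normal answer.")
           == deepseek_r1_shape::plain_content);
}
```

Substring checks are used here instead of regular expressions so the `|` characters in the markers need no escaping. The real parsers also have to extract names and arguments, which this sketch does not attempt.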
1 parent 47c3dc7 commit e484944

7 files changed: +694 -124 lines changed

common/chat-parser.cpp

Lines changed: 3 additions & 82 deletions
@@ -208,90 +208,11 @@ void common_chat_msg_parser::parse_generic_format() {
 }

 void common_chat_msg_parser::parse_deepseek_r1_format() {
-    // DeepSeek R1 format supports <think> tags for reasoning content
-    try_parse_reasoning("<think>", "</think>");
-
-    if (!syntax_.enable_tool_calls) {
-        add_content(consume_rest());
-        return;
-    }
-
-    // DeepSeek R1 tool call patterns from original llama.cpp
-    static const common_regex tool_calls_begin("(?:<|tool▁calls▁begin|>|<|tool_calls_begin|>|<|tool calls begin|>|<|tool\\\\_calls\\\\_begin|>|<|tool▁calls|>)");
-    static const common_regex tool_calls_end("<|tool▁calls▁end|>");
-    static const common_regex function_regex("(?:<|tool▁call▁begin|>)?function<|tool▁sep|>([^\n]+)\n```json\n");
-    static const common_regex close_regex("```[\\s\\r\\n]*<|tool▁call▁end|>");
-
-    parse_deepseek_r1_tool_calls(tool_calls_begin, function_regex, close_regex, tool_calls_end);
+    // Delegate to the main chat.cpp function which has the corrected implementation
+    // This follows the original llama.cpp pattern where chat-parser delegates to chat.cpp
+    common_chat_parse_deepseek_r1(*this);
 }

-void common_chat_msg_parser::parse_deepseek_r1_tool_calls(
-    const common_regex & tool_calls_begin,
-    const common_regex & function_regex,
-    const common_regex & close_regex,
-    const common_regex & tool_calls_end) {
-
-    // Helper function to wrap code as JSON arguments (ported from original llama.cpp)
-    auto wrap_code_as_arguments = [this](const std::string & code) -> std::string {
-        std::string arguments;
-        if (is_partial_) {
-            arguments = (json {{"code", code + healing_marker_}}).dump();
-            auto idx = arguments.find(healing_marker_);
-            if (idx != std::string::npos) {
-                arguments.resize(idx);
-            }
-        } else {
-            arguments = (json {{"code", code}}).dump();
-        }
-        return arguments;
-    };
-
-    auto parse_tool_calls = [&]() {
-        size_t from = std::string::npos;
-        while (true) {
-            auto res = try_find_regex(function_regex, from);
-            if (res) {
-                // Extract function name from regex group 1
-                std::string name = str(res->groups[1]);
-                from = std::string::npos;
-
-                if (name.empty()) {
-                    from = res->groups[0].begin + 1;
-                    continue;
-                }
-
-                auto maybe_raw_python = name == "python";
-                if (input_[pos_] == '{' || !maybe_raw_python) {
-                    if (auto arguments = try_consume_json_with_dumped_args({{}})) {
-                        if (!add_tool_call(name, "", arguments->value) || arguments->is_partial) {
-                            throw common_chat_msg_partial_exception("incomplete tool call");
-                        }
-                        try_consume_regex(close_regex);
-                    }
-                    continue;
-                }
-                if (maybe_raw_python) {
-                    auto arguments = wrap_code_as_arguments(consume_rest());
-                    if (!add_tool_call(name, "", arguments)) {
-                        throw common_chat_msg_partial_exception("incomplete tool call");
-                    }
-                    return;
-                }
-                throw common_chat_msg_partial_exception("incomplete tool call");
-            }
-            break;
-        }
-        try_consume_regex(tool_calls_end);
-        consume_spaces();
-        add_content(consume_rest());
-    };
-
-    if (auto res = try_find_regex(tool_calls_begin)) {
-        parse_tool_calls();
-    } else {
-        add_content(consume_rest());
-    }
-}

 void common_chat_msg_parser::finish() {
     // Any final processing can go here
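
Reviewer note: the `wrap_code_as_arguments` lambda removed above truncates partial output at a healing marker. It appends the marker to the code, serializes, then cuts the string at the marker so the closing `"}` is dropped. Below is a minimal sketch of that idea under stated assumptions, not the project's code. `json_escape` and `wrap_code_sketch` are hypothetical names, and the hand-written escaper stands in for the JSON library's `dump()`. It handles only a few escapes and assumes the marker has no characters that need escaping.

```cpp
#include <cassert>
#include <string>

// Minimal JSON string escaping; a stand-in for a JSON library (assumption).
// Control characters other than \n, \r and \t are not handled in this sketch.
static std::string json_escape(const std::string & s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// Hypothetical helper: wraps raw code as {"code":"..."}; for partial input the
// result is cut at the marker, leaving an unterminated JSON prefix.
std::string wrap_code_sketch(const std::string & code, bool is_partial, const std::string & healing_marker) {
    const std::string payload = is_partial ? code + healing_marker : code;
    std::string arguments = "{\"code\":\"" + json_escape(payload) + "\"}";
    if (is_partial) {
        const auto idx = arguments.find(healing_marker);
        if (idx != std::string::npos) {
            arguments.resize(idx); // drop the marker and the closing quote and brace
        }
    }
    return arguments;
}

// Usage example.
void healing_example() {
    assert(wrap_code_sketch("print(\"hi", true, "$MARK$") == "{\"code\":\"print(\\\"hi");
    assert(wrap_code_sketch("x = 1", false, "$MARK$") == "{\"code\":\"x = 1\"}");
}
```

Cutting at the marker means a partial result is always a prefix of the eventual complete arguments string, which suits streaming deltas. The sketch assumes the marker never occurs in the code itself.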

common/chat-parser.h

Lines changed: 0 additions & 7 deletions
@@ -113,13 +113,6 @@ class common_chat_msg_parser {
     void parse_deepseek_r1_format();
     void parse_generic_format();

-    // DeepSeek R1 specific tool call parsing
-    void parse_deepseek_r1_tool_calls(
-        const common_regex & tool_calls_begin,
-        const common_regex & function_regex,
-        const common_regex & close_regex,
-        const common_regex & tool_calls_end);
-

     // JSON parsing utilities (enhanced streaming support)
     struct json_parse_result {

common/chat.cpp

Lines changed: 244 additions & 14 deletions
@@ -104,7 +104,103 @@ static void common_chat_parse_generic(common_chat_msg_parser & builder) {
     }
 }

-static void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder) {
+// Helper function from original llama.cpp
+static std::string wrap_code_as_arguments(common_chat_msg_parser & builder, const std::string & code) {
+    std::string arguments;
+    if (builder.is_partial()) {
+        arguments = (json {{"code", code + builder.healing_marker()}}).dump();
+        auto idx = arguments.find(builder.healing_marker());
+        if (idx != std::string::npos) {
+            arguments.resize(idx);
+        }
+    } else {
+        arguments = (json {{"code", code}}).dump();
+    }
+    return arguments;
+}
+
+// Forward declaration
+static void parse_deepseek_r1_tools_array(common_chat_msg_parser & builder);
+static void parse_deepseek_r1_xml_wrapped(common_chat_msg_parser & builder);
+
+// Helper function from original llama.cpp for parsing JSON tool calls
+static void parse_json_tool_calls(
+    common_chat_msg_parser & builder,
+    const std::optional<common_regex> & block_open,
+    const std::optional<common_regex> & function_regex_start_only,
+    const std::optional<common_regex> & function_regex,
+    const common_regex & close_regex,
+    const std::optional<common_regex> & block_close,
+    bool allow_raw_python = false,
+    const std::function<std::string(const common_chat_msg_parser::find_regex_result & fres)> & get_function_name = nullptr) {
+
+    auto parse_tool_calls = [&]() {
+        size_t from = std::string::npos;
+        auto first = true;
+        while (true) {
+            auto res = function_regex_start_only && first
+                ? builder.try_consume_regex(*function_regex_start_only)
+                : function_regex
+                    ? builder.try_find_regex(*function_regex, from)
+                    : std::nullopt;
+            if (res) {
+                std::string name;
+                if (get_function_name) {
+                    name = get_function_name(*res);
+                } else {
+                    if (res->groups.size() < 2) {
+                        from = res->groups[0].begin + 1;
+                        continue;
+                    }
+                    name = builder.str(res->groups[1]);
+                }
+                first = false;
+                if (name.empty()) {
+                    // get_function_name signalled us that we should skip this match and treat it as content.
+                    from = res->groups[0].begin + 1;
+                    continue;
+                }
+                from = std::string::npos;
+
+                auto maybe_raw_python = name == "python" && allow_raw_python;
+                if (builder.input()[builder.pos()] == '{' || !maybe_raw_python) {
+                    if (auto arguments = builder.try_consume_json_with_dumped_args({{}})) {
+                        if (!builder.add_tool_call(name, "", arguments->value) || arguments->is_partial) {
+                            throw common_chat_msg_partial_exception("incomplete tool call");
+                        }
+                        builder.try_consume_regex(close_regex);
+                    }
+                    continue;
+                }
+                if (maybe_raw_python) {
+                    auto arguments = wrap_code_as_arguments(builder, builder.consume_rest());
+                    if (!builder.add_tool_call(name, "", arguments)) {
+                        throw common_chat_msg_partial_exception("incomplete tool call");
+                    }
+                    return;
+                }
+                throw common_chat_msg_partial_exception("incomplete tool call");
+            }
+            break;
+        }
+        if (block_close) {
+            builder.try_consume_regex(*block_close);
+        }
+        builder.consume_spaces();
+        builder.add_content(builder.consume_rest());
+    };
+    if (block_open) {
+        if (auto res = builder.try_find_regex(*block_open)) {
+            parse_tool_calls();
+        } else {
+            builder.add_content(builder.consume_rest());
+        }
+    } else {
+        parse_tool_calls();
+    }
+}
+
+void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder) {
     builder.try_parse_reasoning("<think>", "</think>");
     if (!builder.syntax().enable_tool_calls) {
         builder.add_content(builder.consume_rest());
@@ -113,25 +209,159 @@ static void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder) {

     static const common_regex tool_calls_begin("(?:<|tool▁calls▁begin|>|<|tool_calls_begin|>|<|tool calls begin|>|<|tool\\\\_calls\\\\_begin|>|<|tool▁calls|>)");
     static const common_regex tool_calls_end("<|tool▁calls▁end|>");
+    // Primary regex for correct format with separator
     static const common_regex function_regex("(?:<|tool▁call▁begin|>)?function<|tool▁sep|>([^\n]+)\n```json\n");
+    // Fallback regex for format without separator (some models generate this)
+    static const common_regex function_regex_no_sep("(?:<|tool▁call▁begin|>)?function<([^>]+)>\n```json\n");
+    // Third regex for new format: just "function" with no markers
+    static const common_regex function_regex_simple("function\n```json\n");
     static const common_regex close_regex("```[\\s\\r\\n]*<|tool▁call▁end|>");
+    static const common_regex close_regex_simple("```"); // For simple format without end markers

-    // Simplified tool calls parsing for DEEPSEEK_R1
-    if (auto res = builder.try_find_regex(tool_calls_begin)) {
-        while (auto func_res = builder.try_find_regex(function_regex)) {
-            auto function_name = builder.str(func_res->groups[1]);
-            auto args_json = builder.try_consume_json();
-            if (args_json) {
-                builder.add_tool_call(function_name, "", args_json->json.dump());
-                builder.try_consume_regex(close_regex);
-            } else {
-                throw common_chat_msg_partial_exception("incomplete tool call JSON");
+    // Check for the new tools array format first (no DeepSeek markers)
+    auto original_pos = builder.pos();
+
+    // First, try the tools array format for content like "function\n```json\n{"tools": [...]}"
+    if (builder.try_find_regex(function_regex_simple)) {
+        builder.move_to(original_pos);
+        try {
+            parse_deepseek_r1_tools_array(builder);
+            return; // Success, we're done
+        } catch (const common_chat_msg_partial_exception&) {
+            // Fall through to try standard DeepSeek patterns
+        }
+    }
+
+    // If tools array format didn't work, try XML-wrapped format
+    builder.move_to(original_pos);
+    try {
+        parse_deepseek_r1_xml_wrapped(builder);
+        return; // Success, we're done
+    } catch (const common_chat_msg_partial_exception&) {
+        // Fall through to try standard DeepSeek patterns
+    }
+
+    // If XML wrapper format didn't work, try standard DeepSeek patterns
+    builder.move_to(original_pos);
+    try {
+        parse_json_tool_calls(
+            builder,
+            /* block_open= */ tool_calls_begin,
+            /* function_regex_start_only= */ std::nullopt,
+            function_regex,
+            close_regex,
+            tool_calls_end);
+    } catch (const common_chat_msg_partial_exception&) {
+        // If primary regex fails and we're not in partial mode, try fallback regex
+        if (!builder.is_partial()) {
+            builder.move_to(original_pos);
+            try {
+                parse_json_tool_calls(
+                    builder,
+                    /* block_open= */ tool_calls_begin,
+                    /* function_regex_start_only= */ std::nullopt,
+                    function_regex_no_sep,
+                    close_regex,
+                    tool_calls_end);
+            } catch (const common_chat_msg_partial_exception&) {
+                // Try the simple format without markers as final fallback
+                builder.move_to(original_pos);
+                parse_json_tool_calls(
+                    builder,
+                    /* block_open= */ std::nullopt,
+                    /* function_regex_start_only= */ std::nullopt,
+                    function_regex_simple,
+                    close_regex_simple,
+                    std::nullopt);
             }
+        } else {
+            throw; // Re-throw for partial mode
         }
-        builder.try_consume_regex(tool_calls_end);
-        builder.add_content(builder.consume_rest());
+    }
+}
+
+// Parse DeepSeek R1 tools array format following original llama.cpp parse_prefixed_json_tool_call_array pattern
+static void parse_deepseek_r1_tools_array(common_chat_msg_parser & builder) {
+    static const common_regex prefix("function\n```json\n");
+
+
+    if (auto res = builder.try_find_regex(prefix)) {
+        // Parse JSON and manually process tools array to convert arguments to strings
+        auto json_result = builder.try_consume_json();
+        if (!json_result) {
+            throw common_chat_msg_partial_exception("invalid JSON");
+        }
+
+
+        // DeepSeek R1 format has "tools" array, manually process each tool
+        if (json_result->json.contains("tools") && json_result->json.at("tools").is_array()) {
+
+            // Manually create tool calls array with string arguments (following original pattern)
+            json tools_with_dumped_args = json::array();
+            for (const auto& tool : json_result->json.at("tools")) {
+                if (tool.contains("name") && tool.contains("arguments")) {
+                    json formatted_tool;
+                    formatted_tool["name"] = tool.at("name");
+                    // Convert arguments object to string (this is what consume_json_with_dumped_args does)
+                    formatted_tool["arguments"] = tool.at("arguments").dump();
+                    tools_with_dumped_args.push_back(formatted_tool);
+                }
+            }
+
+
+            if (!builder.add_tool_calls(tools_with_dumped_args) || !json_result->healing_marker.marker.empty()) {
+                throw common_chat_msg_partial_exception("incomplete tool call array");
+            }
+        } else {
+            throw common_chat_msg_partial_exception("tools key not found or not array");
+        }
+
+        // Consume closing ```
+        builder.try_consume_regex(common_regex("```"));
     } else {
-        builder.add_content(builder.consume_rest());
+        throw common_chat_msg_partial_exception("function prefix not found");
+    }
+}
+
+// Parse DeepSeek R1 XML-wrapped format following original Hermes-2-Pro pattern
+static void parse_deepseek_r1_xml_wrapped(common_chat_msg_parser & builder) {
+
+    // Pattern for: <tool_call>\nfunction</think>FunctionName\n```json\n{...}\n```\n</tool_call>
+    static const common_regex xml_pattern(
+        "<tool_call>\\s*" // Opening XML tag
+        "function</think>([^\\n]+)" // Function name after "function</think>"
+        "\\s*```json\\s*" // JSON block start
+    );
+
+    if (auto res = builder.try_find_regex(xml_pattern)) {
+
+        // Extract function name from capture group
+        std::string function_name = builder.str(res->groups[1]);
+
+        // Parse JSON arguments
+        auto json_result = builder.try_consume_json();
+        if (!json_result) {
+            throw common_chat_msg_partial_exception("invalid JSON in XML wrapper");
+        }
+
+
+        // Create single tool call following original pattern
+        json tool_call;
+        tool_call["name"] = function_name;
+        tool_call["arguments"] = json_result->json.dump(); // Convert to string
+
+        json tool_calls_array = json::array();
+        tool_calls_array.push_back(tool_call);
+
+
+        if (!builder.add_tool_calls(tool_calls_array) || !json_result->healing_marker.marker.empty()) {
+            throw common_chat_msg_partial_exception("incomplete XML wrapped tool call");
+        }
+
+        // Consume closing ```\n</tool_call>
+        builder.try_consume_regex(common_regex("```\\s*</tool_call>"));
+    } else {
+        throw common_chat_msg_partial_exception("XML wrapper pattern not found");
     }
 }
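
Reviewer note: the new `common_chat_parse_deepseek_r1` saves `original_pos`, tries one parser, and on `common_chat_msg_partial_exception` rewinds with `move_to` before trying the next. The same control flow can be sketched as a generic "try in order, rewind between attempts" helper. This is an illustration under stated assumptions, not a proposed replacement. `cursor`, `parse_error`, `expect` and `try_in_order` are hypothetical names, and the partial-mode rethrow in the real function is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the builder and its partial-parse exception.
struct parse_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct cursor {
    std::string input;
    std::size_t pos;
};

using parser_fn = std::function<void(cursor &)>;

// Consumes `literal` at the current position or throws parse_error.
void expect(cursor & c, const std::string & literal) {
    if (c.input.compare(c.pos, literal.size(), literal) != 0) {
        throw parse_error("expected " + literal);
    }
    c.pos += literal.size();
}

// Runs the parsers in order and returns the index of the first that succeeds.
// The position is reset before every attempt; the last failure propagates.
std::size_t try_in_order(cursor & c, const std::vector<parser_fn> & parsers) {
    const std::size_t original_pos = c.pos;
    for (std::size_t i = 0; i < parsers.size(); ++i) {
        c.pos = original_pos; // an earlier attempt may have consumed input
        try {
            parsers[i](c);
            return i;
        } catch (const parse_error &) {
            if (i + 1 == parsers.size()) {
                throw;
            }
        }
    }
    throw parse_error("no parsers supplied");
}

// Usage example.
void fallback_example() {
    cursor c{"ab!", 0};
    const std::vector<parser_fn> parsers = {
        [](cursor & k) { expect(k, "a"); expect(k, "x"); }, // consumes "a", then fails
        [](cursor & k) { expect(k, "ab"); },                // succeeds after the rewind
    };
    assert(try_in_order(c, parsers) == 1);
    assert(c.pos == 2);
}
```

The sketch rewinds only the read position. Anything a failed attempt has already recorded elsewhere would need separate handling.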

common/chat.h

Lines changed: 3 additions & 0 deletions
@@ -162,3 +162,6 @@ common_chat_msg common_chat_parse(const std::string & input, bool is_partial, co
 // Forward declare parser class
 class common_chat_msg_parser;

+// Format-specific parsing functions (accessible from chat-parser)
+void common_chat_parse_deepseek_r1(common_chat_msg_parser & builder);
+
