Issue description
Sometimes the <tool_call>...</tool_call> block just gets returned as plain text in the response instead of the tool call actually being executed (using Qwen3-14B).
Expected Behavior
Tool call is executed.
Actual Behavior
Tool call is just returned in the response like so:
<tool_call>
{"name": "createTicket", "arguments": {"subject": "...", "description": "...", "assignedTo": "...."}}
</tool_call>
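A possible stopgap on the consumer side is to detect the leaked block and parse it manually, then route the parsed arguments to the same ticket-creation logic the handler wraps. This is just a sketch, assuming the leak always matches Qwen's <tool_call>{json}</tool_call> shape; parseLeakedToolCall is a hypothetical helper, not a node-llama-cpp API:

// Hypothetical helper (a workaround sketch, not a node-llama-cpp API):
// pull a leaked tool call out of the response text so it can be
// dispatched manually.
function parseLeakedToolCall(response) {
    const match = response.match(/<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/);
    if (match == null)
        return null;

    try {
        // Expected shape: {"name": "...", "arguments": {...}}
        return JSON.parse(match[1]);
    } catch {
        return null; // the leaked text wasn't valid JSON
    }
}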
Steps to reproduce
import { defineChatSessionFunction, getLlama, LlamaChatSession, resolveModelFile } from 'node-llama-cpp';
import fs from 'node:fs/promises';

const modelName = 'Qwen3-14B';
const modelPath = await resolveModelFile(`hf:unsloth/${modelName}-GGUF:Q4_K_M`);
const llama = await getLlama();
const model = await llama.loadModel({ modelPath });
const context = await model.createContext();
const systemPrompt = await fs.readFile('./systemPrompt.txt', 'utf8');
const session = new LlamaChatSession({ contextSequence: context.getSequence(), systemPrompt });

const createTicket = defineChatSessionFunction({
    description: 'Creates a new ticket in the project management system',
    params: {
        type: 'object',
        properties: {
            subject: {
                type: 'string',
                description: 'Clear and descriptive subject for the ticket'
            },
            description: {
                type: 'string',
                description: 'Detailed description of what needs to be done'
            },
            assignedTo: {
                enum: ['tech', 'design'],
                description: 'Team to assign the ticket to (tech or design)'
            }
        },
        required: ['subject', 'description', 'assignedTo']
    },
    handler: async params => {
        // Creates the ticket
    }
});

const userInstructions = await fs.readFile('./userInstructions.txt', 'utf8');
const response = await session.prompt(userInstructions, { functions: { createTicket } });
console.log('AI Response:', response);

My Environment
| Dependency | Version |
|---|---|
| Operating System | macOS 24.4.0 |
| CPU | Apple M2 Max |
| Node.js version | 22.13.1 |
| Typescript version | TS not specifically installed, using node's --experimental-strip-types |
| node-llama-cpp version | 3.9.0 |
`npx --yes node-llama-cpp inspect gpu` output:
OS: macOS 24.4.0 (arm64)
Node: 22.13.1 (arm64)
node-llama-cpp: 3.9.0
Metal: available
Metal device: Apple M2 Max
Metal used VRAM: 0% (64KB/21.33GB)
Metal free VRAM: 99.99% (21.33GB/21.33GB)
Metal unified memory: 21.33GB (100%)
CPU model: Apple M2 Max
Math cores: 8
Used RAM: 64.54% (20.65GB/32GB)
Free RAM: 35.45% (11.35GB/32GB)
Used swap: 0% (0B/0B)
Max swap size: dynamic
mmap: supported
Additional Context
I believe this has to do with cases where the <tool_call> tag is right at the start of the response. Subsequent tool calls appear to work correctly.
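A quick diagnostic for that hypothesis, using the response variable from the repro above (just a sketch to log whether the leak correlates with the tag being the first thing the model emits):

// Does the leak only happen when <tool_call> opens the response?
const leakedAtStart = response.trimStart().startsWith('<tool_call>');
console.log('tool call leaked at start of response:', leakedAtStart);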
Relevant Features Used
- Metal support
- CUDA support
- Vulkan support
- Grammar
- Function calling
Are you willing to resolve this issue by submitting a Pull Request?
Yes, I have the time, but I don't know how to start. I would need guidance.