bug: tool call sometimes not parsed correctly? #471

@Ricki-BumbleDev

Issue description

Sometimes the raw <tool_call>...</tool_call> block is returned verbatim in the response text instead of the tool call actually being executed (using Qwen3-14B).

Expected Behavior

Tool call is executed.

Actual Behavior

Tool call is just returned in the response like so:

<tool_call>
{"name": "createTicket", "arguments": {"subject": "...", "description": "...", "assignedTo": "...."}}
</tool_call>

Steps to reproduce

import { defineChatSessionFunction, getLlama, LlamaChatSession, resolveModelFile } from 'node-llama-cpp';
import fs from 'node:fs/promises';

const modelName = 'Qwen3-14B';
const modelPath = await resolveModelFile(`hf:unsloth/${modelName}-GGUF:Q4_K_M`);
const llama = await getLlama();
const model = await llama.loadModel({ modelPath });
const context = await model.createContext();
const systemPrompt = await fs.readFile('./systemPrompt.txt', 'utf8');
const session = new LlamaChatSession({ contextSequence: context.getSequence(), systemPrompt });

const createTicket = defineChatSessionFunction({
  description: 'Creates a new ticket in the project management system',
  params: {
    type: 'object',
    properties: {
      subject: {
        type: 'string',
        description: 'Clear and descriptive subject for the ticket'
      },
      description: {
        type: 'string',
        description: 'Detailed description of what needs to be done'
      },
      assignedTo: {
        enum: ['tech', 'design'],
        description: 'Team to assign the ticket to (tech or design)'
      }
    },
    required: ['subject', 'description', 'assignedTo']
  },
  handler: async params => {
    // Creates the ticket (implementation omitted)
  }
});

const userInstructions = await fs.readFile('./userInstructions.txt', 'utf8');
const response = await session.prompt(userInstructions, { functions: { createTicket } });
console.log('AI Response:', response);

My Environment

| Dependency | Version |
| --- | --- |
| Operating System | macOS 24.4.0 |
| CPU | Apple M2 Max |
| Node.js version | 22.13.1 |
| TypeScript version | TS not specifically installed, using Node's `--experimental-strip-types` |
| node-llama-cpp version | 3.9.0 |

npx --yes node-llama-cpp inspect gpu output:

OS: macOS 24.4.0 (arm64)
Node: 22.13.1 (arm64)
node-llama-cpp: 3.9.0

Metal: available

Metal device: Apple M2 Max
Metal used VRAM: 0% (64KB/21.33GB)
Metal free VRAM: 99.99% (21.33GB/21.33GB)
Metal unified memory: 21.33GB (100%)

CPU model: Apple M2 Max
Math cores: 8
Used RAM: 64.54% (20.65GB/32GB)
Free RAM: 35.45% (11.35GB/32GB)
Used swap: 0% (0B/0B)
Max swap size: dynamic
mmap: supported

Additional Context

I believe this has to do with the <tool_call> instruction appearing right at the start of the response. It looks like subsequent tool calls work.
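Until the parsing issue is fixed, one possible workaround is to detect a leftover <tool_call> block in the returned text and parse it manually. This is only a hedged sketch, not part of node-llama-cpp's API; the `extractToolCall` helper below is hypothetical and assumes the JSON payload shape shown in the response above:

```typescript
// Hypothetical helper: extracts a leftover <tool_call> JSON payload from a
// raw model response that was not parsed into a function call.
function extractToolCall(
  response: string
): { name: string; arguments: Record<string, unknown> } | null {
  // Match the first <tool_call>...</tool_call> span, trimming whitespace
  // around the JSON body.
  const match = response.match(/<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/);
  if (match == null) return null;

  try {
    const parsed = JSON.parse(match[1]);
    if (
      typeof parsed.name === "string" &&
      typeof parsed.arguments === "object" &&
      parsed.arguments !== null
    ) {
      return parsed;
    }
  } catch {
    // Not valid JSON between the tags; treat the response as plain text.
  }
  return null;
}

// Example with a response shaped like the one in "Actual Behavior":
const raw =
  '<tool_call>\n{"name": "createTicket", "arguments": {"subject": "Fix login", "description": "Login fails", "assignedTo": "tech"}}\n</tool_call>';
const call = extractToolCall(raw);
console.log(call?.name); // "createTicket"
```

The caller could then dispatch to the matching function (e.g. `createTicket`) when `extractToolCall` returns non-null, and fall back to treating the response as plain text otherwise.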

Relevant Features Used

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.
