Phi-4 Mini Support on Transformers.js v4 #1460

@ibelem

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

I've found ONNX models for Phi-4-mini available at:

https://huggingface.co/onnx-community/Phi-4-mini-instruct-ONNX-MHA/
https://huggingface.co/onnx-community/Phi-4-mini-instruct-ONNX-GQA/

However, when attempting to use these models with Transformers.js, I'm encountering errors. Could you please help identify if I'm missing something? Thank you! @xenova

Error Log:

Transformers.js v4 (self-built before the official release, commit 125a8fd) + ORT Web WebGPU 1.24.0-dev.20251104-75d35474d5:

```
ort.webgpu.bundle.min.mjs:10
Uncaught Error: Can't create a session. ERROR_CODE: 6, ERROR_MESSAGE: Data of TensorProto ( tensor name: ) should be stored in model_q4f16.onnx_data_1, but it is not regular file.
    at async g (onnx.js:153:11)
    at async models.js:359:25
    at async Promise.all (index 0)
    at async z (models.js:349:19)
    at async Promise.all (index 0)
```
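
My reading of the error: ORT Web loads `model_q4f16.onnx` itself but cannot find the companion external-data shard `model_q4f16.onnx_data_1`. Assuming the numeric values in `use_external_data_format` are shard counts, and assuming the `<model>_data`, `<model>_data_1`, ... naming implied by the message above (my assumption, not documented behavior), the files the runtime would look for can be sketched as:

```js
// Hedged sketch: derive the external-data shard names the runtime would look
// for. Assumes the numeric use_external_data_format value is a shard count and
// the "<model>_data", "<model>_data_1", ... naming seen in the error above.
function externalDataFileNames(modelFile, count) {
  const names = [];
  for (let i = 0; i < count; i++) {
    names.push(i === 0 ? `${modelFile}_data` : `${modelFile}_data_${i}`);
  }
  return names;
}

// Under these assumptions, "model_q4f16.onnx": 2 would require both
// model_q4f16.onnx_data and model_q4f16.onnx_data_1 to be fetchable.
```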

Test Cases:

  1. https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4gqa.html

I added `use_external_data_format` in the JS code, but the `"model_q4f16.onnx": 2` entry doesn't seem to help:

```js
const generator = await pipeline(
  "text-generation",
  "onnx-community/Phi-4-mini-instruct-ONNX-GQA",
  {
    device: 'webgpu',
    dtype: 'q4f16',
    use_external_data_format: {
      "model.onnx": 1,
      "model_fp16.onnx": 1,
      "model_q4.onnx": 3,
      "model_q4f16.onnx": 2
    },
    session_options: {
      logSeverityLevel: 0
    },
  }
);
```
  2. https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4mha.html

```js
const generator = await pipeline(
  "text-generation",
  "onnx-community/Phi-4-mini-instruct-ONNX-MHA",
  {
    device: 'webgpu',
    dtype: 'q4f16',
    session_options: {
      logSeverityLevel: 0
    },
  }
);
```

Here I didn't add `use_external_data_format` in the JS config; I expected Transformers.js to read it from the model's config.json:

```json
"transformers.js_config": {
  "dtype": "q4f16",
  "kv_cache_dtype": {
    "q4f16": "float16",
    "fp16": "float16"
  },
  "use_external_data_format": {
    "model.onnx": 1,
    "model_fp16.onnx": 1,
    "model_q4.onnx": 1,
    "model_q4f16.onnx": 2
  }
}
```
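
To make my expectation concrete, here is a hedged sketch of how I assume a loader would consume that block. `resolveForDtype` is a hypothetical helper, not a real Transformers.js function, and the dtype-to-filename mapping (e.g. `"q4f16"` → `model_q4f16.onnx`) is my assumption based on the file names in the repo:

```js
// Hypothetical helper: resolve per-dtype settings from "transformers.js_config".
// kv_cache_dtype may be a plain string or a per-dtype map, as in the JSON above.
function resolveForDtype(tjsConfig, dtype) {
  const kvCacheDtype =
    typeof tjsConfig.kv_cache_dtype === "object"
      ? tjsConfig.kv_cache_dtype[dtype]
      : tjsConfig.kv_cache_dtype;
  // Assumed mapping: "fp32" -> model.onnx, otherwise model_<dtype>.onnx.
  const fileName = dtype === "fp32" ? "model.onnx" : `model_${dtype}.onnx`;
  const externalDataCount =
    (tjsConfig.use_external_data_format || {})[fileName] || 0;
  return { kvCacheDtype, externalDataCount };
}
```

Under this reading, `dtype: 'q4f16'` should yield a float16 KV cache and two external-data shards without any extra options in the `pipeline()` call.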

Reproduction

  1. Visit https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4gqa.html and check the logs in the browser developer tools (tested on Windows)
  2. Visit https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4mha.html and check the logs in the browser developer tools (tested on Windows)
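
For anyone triaging this, a hedged sketch (not part of Transformers.js) of a preflight check: probe the Hub for the model file and the shards the error message implies, so a missing shard fails fast with a concrete URL. The `resolve/main/onnx/` URL pattern is my assumption about the repo layout, and `headFn` is injected so the logic can run without network access:

```js
// Hedged sketch: verify the model file and its assumed external-data shards
// are reachable before creating the pipeline. headFn(url) -> Promise<boolean>.
async function preflightExternalData(repo, modelFile, count, headFn) {
  // Assumed Hub layout: https://huggingface.co/<repo>/resolve/main/onnx/<file>
  const base = `https://huggingface.co/${repo}/resolve/main/onnx`;
  const files = [modelFile];
  for (let i = 0; i < count; i++) {
    files.push(i === 0 ? `${modelFile}_data` : `${modelFile}_data_${i}`);
  }
  const missing = [];
  for (const f of files) {
    const url = `${base}/${f}`;
    if (!(await headFn(url))) missing.push(url);
  }
  return missing; // empty array: every expected file answered the probe
}
```

In a browser, `headFn` could be `(url) => fetch(url, { method: 'HEAD' }).then((r) => r.ok)`.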

Labels: bug (Something isn't working)