Phi-4 Mini Support on Transformers.js v4 #1460

@ibelem

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

I've found ONNX models for Phi-4-mini available at:

https://huggingface.co/onnx-community/Phi-4-mini-instruct-ONNX-MHA/
https://huggingface.co/onnx-community/Phi-4-mini-instruct-ONNX-GQA/

However, when attempting to use these models with Transformers.js, I'm encountering errors. Could you please help identify if I'm missing something? Thank you! @xenova

Error Log:

Transformers.js v4 (self-built before the official release, commit 125a8fd) + ORT Web WebGPU 1.24.0-dev.20251104-75d35474d5:

```
ort.webgpu.bundle.min.mjs:10
Uncaught Error: Can't create a session. ERROR_CODE: 6, ERROR_MESSAGE: Data of TensorProto ( tensor name: ) should be stored in model_q4f16.onnx_data_1, but it is not regular file.
    at async g (onnx.js:153:11)
    at async models.js:359:25
    at async Promise.all (index 0)
    at async z (models.js:349:19)
    at async Promise.all (index 0)
```
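
My reading of the error: ORT Web loads `model_q4f16.onnx` itself but cannot find the companion external-data shard `model_q4f16.onnx_data_1`. Assuming the numeric values in `use_external_data_format` are shard counts, and assuming the `<model>_data`, `<model>_data_1`, ... naming implied by the message above (my assumption, not documented behavior), the files the runtime would look for can be sketched as:

```js
// Hedged sketch: derive the external-data shard names the runtime would look
// for. Assumes the numeric use_external_data_format value is a shard count and
// the "<model>_data", "<model>_data_1", ... naming seen in the error above.
function externalDataFileNames(modelFile, count) {
  const names = [];
  for (let i = 0; i < count; i++) {
    names.push(i === 0 ? `${modelFile}_data` : `${modelFile}_data_${i}`);
  }
  return names;
}

// Under these assumptions, "model_q4f16.onnx": 2 would require both
// model_q4f16.onnx_data and model_q4f16.onnx_data_1 to be fetchable.
```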

Test Cases:

  1. https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4gqa.html

I added `use_external_data_format` in the JS code, but the `"model_q4f16.onnx": 2` entry doesn't seem to help:

```js
const generator = await pipeline(
  "text-generation",
  "onnx-community/Phi-4-mini-instruct-ONNX-GQA",
  {
    device: 'webgpu',
    dtype: 'q4f16',
    use_external_data_format: {
      "model.onnx": 1,
      "model_fp16.onnx": 1,
      "model_q4.onnx": 3,
      "model_q4f16.onnx": 2
    },
    session_options: {
      logSeverityLevel: 0
    },
  }
);
```
  2. https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4mha.html

```js
const generator = await pipeline(
  "text-generation",
  "onnx-community/Phi-4-mini-instruct-ONNX-MHA",
  {
    device: 'webgpu',
    dtype: 'q4f16',
    session_options: {
      logSeverityLevel: 0
    },
  }
);
```

Here I didn't add `use_external_data_format` in the JS config; I expected Transformers.js to read it from the model's config.json:

```json
"transformers.js_config": {
  "dtype": "q4f16",
  "kv_cache_dtype": {
    "q4f16": "float16",
    "fp16": "float16"
  },
  "use_external_data_format": {
    "model.onnx": 1,
    "model_fp16.onnx": 1,
    "model_q4.onnx": 1,
    "model_q4f16.onnx": 2
  }
}
```
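
To make my expectation concrete, here is a hedged sketch of how I assume a loader would consume that block. `resolveForDtype` is a hypothetical helper, not a real Transformers.js function, and the dtype-to-filename mapping (e.g. `"q4f16"` → `model_q4f16.onnx`) is my assumption based on the file names in the repo:

```js
// Hypothetical helper: resolve per-dtype settings from "transformers.js_config".
// kv_cache_dtype may be a plain string or a per-dtype map, as in the JSON above.
function resolveForDtype(tjsConfig, dtype) {
  const kvCacheDtype =
    typeof tjsConfig.kv_cache_dtype === "object"
      ? tjsConfig.kv_cache_dtype[dtype]
      : tjsConfig.kv_cache_dtype;
  // Assumed mapping: "fp32" -> model.onnx, otherwise model_<dtype>.onnx.
  const fileName = dtype === "fp32" ? "model.onnx" : `model_${dtype}.onnx`;
  const externalDataCount =
    (tjsConfig.use_external_data_format || {})[fileName] || 0;
  return { kvCacheDtype, externalDataCount };
}
```

Under this reading, `dtype: 'q4f16'` should yield a float16 KV cache and two external-data shards without any extra options in the `pipeline()` call.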

Reproduction

  1. Visit https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4gqa.html and check the logs in the browser developer tools (tested on Windows)
  2. Visit https://ibelem.github.io/transformersjs-webgpu-phi-test/t4-p4mha.html and check the logs in the browser developer tools (tested on Windows)
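
For anyone triaging this, a hedged sketch (not part of Transformers.js) of a preflight check: probe the Hub for the model file and the shards the error message implies, so a missing shard fails fast with a concrete URL. The `resolve/main/onnx/` URL pattern is my assumption about the repo layout, and `headFn` is injected so the logic can run without network access:

```js
// Hedged sketch: verify the model file and its assumed external-data shards
// are reachable before creating the pipeline. headFn(url) -> Promise<boolean>.
async function preflightExternalData(repo, modelFile, count, headFn) {
  // Assumed Hub layout: https://huggingface.co/<repo>/resolve/main/onnx/<file>
  const base = `https://huggingface.co/${repo}/resolve/main/onnx`;
  const files = [modelFile];
  for (let i = 0; i < count; i++) {
    files.push(i === 0 ? `${modelFile}_data` : `${modelFile}_data_${i}`);
  }
  const missing = [];
  for (const f of files) {
    const url = `${base}/${f}`;
    if (!(await headFn(url))) missing.push(url);
  }
  return missing; // empty array: every expected file answered the probe
}
```

In a browser, `headFn` could be `(url) => fetch(url, { method: 'HEAD' }).then((r) => r.ok)`.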

Labels: bug (Something isn't working)