
How to use half precision ONNX models? #1447

@richarddd

Description


Question

Hi,

I just exported a detection model to fp16 using optimum (--dtype fp16).

This is my pipeline:

const buffer = await fs.readFile("image3.jpg");
const blob = new Blob([buffer]);
const image = await RawImage.fromBlob(blob);

const model = await AutoModel.from_pretrained(
  "./onnx_llama",
  { dtype: "fp16", device: "cpu" }
);
const processor = await AutoProcessor.from_pretrained("./onnx_llama");

const { pixel_values, reshaped_input_sizes } = await processor(image);
const { output0 } = await model({ pixel_values });

Running this results in:
An error occurred during model execution: "Error: Unexpected input data type. Actual: (tensor(float)), expected: (tensor(float16))".

Which makes sense. However, when I try to convert the input to fp16 "manually":

const fp16data = Float16Array.from(pixel_values.data); // also tried float32ArrayToUint16Array(pixel_values.data)
const tensor = new Tensor("float16", fp16data, pixel_values.dims);
const { output0 } = await model({ pixel_values: tensor });

I get:
Tensor.data must be a typed array (4) for float16 tensors, but got typed array (0).

What's going on here? I tried converting pixel_values.data to a Uint16Array manually, but that has no effect, since it gets converted to a Float16Array in the Tensor constructor anyway.
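For context, in runtimes without native Float16Array support, one workaround is to pack the raw IEEE 754 half-precision bit patterns into a Uint16Array yourself. Below is a minimal sketch of what a float32-to-fp16 bit conversion (round-to-nearest) could look like; the helper names are only illustrative and are not part of transformers.js:

```javascript
// Scratch buffers for reinterpreting a float32's bits as an int32.
const f32 = new Float32Array(1);
const i32 = new Int32Array(f32.buffer);

// Convert one float32 value to its IEEE 754 half-precision bit pattern.
// Handles normals, subnormals, infinities, and NaN; rounds to nearest.
function toHalfBits(val) {
  f32[0] = val;
  const x = i32[0];
  let bits = (x >> 16) & 0x8000;            // sign bit
  let m = (x >> 12) & 0x07ff;               // mantissa (with round bit)
  const e = (x >> 23) & 0xff;               // biased float32 exponent
  if (e < 103) return bits;                 // too small: flush to signed zero
  if (e > 142) {                            // overflow: +/-Infinity or NaN
    bits |= 0x7c00;
    bits |= e === 255 && (x & 0x007fffff) ? 0x0200 : 0; // keep NaN as NaN
    return bits;
  }
  if (e < 113) {                            // subnormal half
    m |= 0x0800;
    bits |= (m >> (114 - e)) + ((m >> (113 - e)) & 1);
    return bits;
  }
  bits |= ((e - 112) << 10) | (m >> 1);
  bits += m & 1;                            // round to nearest
  return bits;
}

// Pack a whole Float32Array into raw fp16 bits.
function float32ArrayToUint16Array(src) {
  const out = new Uint16Array(src.length);
  for (let i = 0; i < src.length; i++) out[i] = toHalfBits(src[i]);
  return out;
}
```

Whether the Tensor constructor accepts a Uint16Array of raw bits for a "float16" tensor depends on the library version, so this only sidesteps the missing-Float16Array problem, not the type check described above.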

Help is much appreciated!

Thanks
