Error when quantizing onnx model #30

@de1star

Hi,
I get the following error when I try to quantize my ONNX model:

Traceback (most recent call last):
  File "quant.py", line 4, in <module>
    quantize(
  File "/usr/local/lib/python3.8/dist-packages/modelopt/onnx/quantization/quantize.py", line 207, in quantize
    onnx_model = quantize_func(
  File "/usr/local/lib/python3.8/dist-packages/modelopt/onnx/quantization/int8.py", line 186, in quantize
    quantize_static(
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quantize.py", line 513, in quantize_static
    calibrator.collect_data(calibration_data_reader)
  File "/usr/local/lib/python3.8/dist-packages/modelopt/onnx/quantization/ort_patching.py", line 271, in _collect_data_histogram_calibrator
    calibrator.intermediate_outputs.append(calibrator.infer_session.run(None, inputs))
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : CopyTensorAsync is not implemented

I installed onnxruntime-gpu for CUDA 12.x following the instructions at https://onnxruntime.ai/docs/install/.
Could you help with that?
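
In case it helps with diagnosis, the environment could be inspected with a small script like the sketch below. It is not a fix and does not assume any particular cause. The helper names `installed_ort_packages` and `available_providers` are made up for illustration. The only onnxruntime call used is `onnxruntime.get_available_providers()`.

```python
import importlib.metadata as md


def installed_ort_packages():
    """Return {distribution name: version} for onnxruntime packages found here."""
    found = {}
    # Checking both names is an assumption about what may be relevant:
    # the CPU and GPU wheels share the same import name, `onnxruntime`.
    for name in ("onnxruntime", "onnxruntime-gpu"):
        try:
            found[name] = md.version(name)
        except md.PackageNotFoundError:
            pass
    return found


def available_providers():
    """Return the execution providers onnxruntime reports, or None if it is not importable."""
    try:
        import onnxruntime as ort
    except ImportError:
        return None
    return ort.get_available_providers()


if __name__ == "__main__":
    print("installed packages:", installed_ort_packages())
    print("available providers:", available_providers())
```

Posting the two printed lines alongside the traceback would show which onnxruntime distributions are installed and whether `CUDAExecutionProvider` is among the providers reported in this environment.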
