
[QUARK-ERROR]: CPU version of custom ops library compilation failed:'gbk' codec can't decode byte 0x92 in position 7760: illegal multibyte sequence #17

@yohnyang

Description

Requesting help: while quantizing yolov8n.onnx with Quark 0.10, I hit the following issue:

[QUARK-INFO]: Checking custom ops library ...

[QUARK-WARNING]: The custom ops library D:\program\Anaconda\envs\AMD_Quark\lib\site-packages\quark\onnx\operators\custom_ops\lib\custom_ops.dll does NOT exist.

[QUARK-INFO]: Start compiling CPU version of custom ops library.

[QUARK-ERROR]: CPU version of custom ops library compilation failed:'gbk' codec can't decode byte 0x92 in position 7760: illegal multibyte sequence

[QUARK-WARNING]: Custom ops library compilation failed: CPU version of custom ops library compilation failed:'gbk' codec can't decode byte 0x92 in position 7760: illegal multibyte sequence.

[QUARK-INFO]: Checked custom ops library.

[QUARK-WARNING]: The custom ops library D:\program\Anaconda\envs\AMD_Quark\lib\site-packages\quark\onnx\operators\custom_ops\lib\custom_ops.dll does NOT exist.

[QUARK-WARNING]: Config has been replaced by QConfig. The old API will be removed in the next release.

[QUARK-WARNING]: The algorithm API is algo_config in QConfig. The old API will be removed in the next release.
[QUARK_INFO]: Time information:
2025-12-26 14:26:42.234468
[QUARK_INFO]: OS and CPU information:
system --- Windows
node --- DESKTOP-AK5QEFP
release --- 10
version --- 10.0.19045
machine --- AMD64
processor --- Intel64 Family 6 Model 183 Stepping 1, GenuineIntel
[QUARK_INFO]: Tools version information:
python --- 3.10.19
onnx --- 1.19.0
onnxruntime --- 1.22.1
quark.onnx --- 0.10+b8ad5c1
[QUARK_INFO]: Quantized Configuration information:
model_input --- yolov8n.onnx
model_output --- yolov8n_quantized.onnx
calibration_data_reader --- <main.ImageDataReader object at 0x00000287C75B7A90>
calibration_data_path --- None
quant_format --- QDQ
input_nodes --- []
output_nodes --- []
op_types_to_quantize --- []
extra_op_types_to_quantize --- []
per_channel --- False
reduce_range --- False
activation_type --- QInt8
weight_type --- QInt8
nodes_to_quantize --- []
nodes_to_exclude --- []
subgraphs_to_exclude --- []
optimize_model --- True
use_external_data_format --- False
calibrate_method --- CalibrationMethod.MinMax
execution_providers --- ['CPUExecutionProvider']
enable_npu_cnn --- False
enable_npu_transformer --- False
specific_tensor_precision --- False
debug_mode --- False
convert_fp16_to_fp32 --- False
convert_nchw_to_nhwc --- False
include_cle --- True
include_sq --- False
include_rotation --- False
include_fast_ft --- False
extra_options --- {'ActivationSymmetric': True, 'AlignSlice': False, 'FoldRelu': True, 'AlignConcat': True}

[QUARK-INFO]: The input ONNX model can create InferenceSession successfully

[QUARK-INFO]: Removed initializers from input

[QUARK-INFO]: Simplified model sucessfully

[QUARK-INFO]: Duplicate the shared bias initializers in the model for separate quantization use across different nodes!

[QUARK-INFO]: Loading model...

[QUARK-INFO]: The input ONNX model can run inference successfully

[QUARK-INFO]: Start CrossLayerEqualization...

[QUARK-INFO]: CrossLayerEqualization pattern num: 0

[QUARK-INFO]: Total CrossLayerEqualization steps: 1

[QUARK-INFO]: CrossLayerEqualization Done.

[QUARK-INFO]: optimize the model for better hardware compatibility.

[QUARK-INFO]: Found Split node Split_12. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_30. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_55. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_80. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_108. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_125. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_143. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_161. Replacing with Slice.

[QUARK-INFO]: Found Split node Split_221. Replacing with Slice.

[QUARK-WARNING]: The opset version is 12 < 17. Skipping fusing layer normalization.

[QUARK-WARNING]: The opset version is 12 < 20. Skipping fusing Gelu.

[QUARK-INFO]: Start running calibration on 300 samples with extra options {}...

[QUARK-INFO]: Data collection of CalibrationMethod.MinMax in progress. Runtime will depend on your model and data size.

[QUARK-INFO]: The calibration has been finished. It took 15.6s to complete.
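For reference, the MinMax calibration reported above boils down to tracking each tensor's observed value range; with `ActivationSymmetric: True` (from the `extra_options` in the config dump), the INT8 scale comes from the largest absolute value. A minimal plain-Python sketch of that idea (the function name is illustrative, not Quark's API):

```python
def minmax_symmetric_int8(values):
    """Sketch of MinMax calibration with symmetric INT8 activations:
    the scale maps the largest observed magnitude onto [-127, 127]."""
    absmax = max(abs(v) for v in values)
    scale = absmax / 127.0 if absmax else 1.0
    # Quantize: divide by scale, round, clamp to the symmetric INT8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, quantized

scale, q = minmax_symmetric_int8([-1.0, 0.4, 0.1])
# q == [-127, 51, 13]
```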

[QUARK-INFO]: Remove QuantizeLinear & DequantizeLinear on certain operations(such as conv-relu).

[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints

[QUARK-INFO]: Have aligned Concat node Concat_20 inputs

[QUARK-INFO]: Have aligned Concat node Concat_45 inputs

[QUARK-INFO]: Have aligned Concat node Concat_70 inputs

[QUARK-INFO]: Have aligned Concat node Concat_88 inputs

[QUARK-INFO]: Have aligned Concat node Concat_104 inputs

[QUARK-INFO]: Have aligned Concat node Concat_115 inputs

[QUARK-INFO]: Have aligned Concat node Concat_121 inputs

[QUARK-INFO]: Have aligned Concat node Concat_132 inputs

[QUARK-INFO]: Have aligned Concat node Concat_139 inputs

[QUARK-INFO]: Have aligned Concat node Concat_186 inputs

[QUARK-INFO]: Have aligned Concat node Concat_150 inputs

[QUARK-INFO]: Have aligned Concat node Concat_157 inputs

[QUARK-INFO]: Have aligned Concat node Concat_201 inputs

[QUARK-INFO]: Have aligned Concat node Concat_168 inputs

[QUARK-INFO]: Have aligned Concat node Concat_216 inputs

[QUARK-INFO]: Have aligned Concat node Concat_220 inputs

[QUARK-INFO]: Have aligned Concat node Concat_250 inputs

[QUARK-INFO]: Have aligned Concat node Concat_254 inputs

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_13.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_31.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_38.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_49.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_56.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_63.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_74.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_81.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_109.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_126.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_144.

[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_162.
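The bias adjustments above enforce a standard QDQ compiler constraint: the INT32 bias scale must equal the input activation scale times the weight scale, so the quantized bias can be added directly into the integer accumulator. A minimal sketch of that constraint (names are illustrative, not Quark's API):

```python
def quantize_bias_int32(bias, act_scale, weight_scale):
    """The constraint from the log: bias_scale == act_scale * weight_scale,
    with the bias stored as INT32 in that scale."""
    bias_scale = act_scale * weight_scale
    q_bias = [round(b / bias_scale) for b in bias]
    return q_bias, bias_scale

q_bias, s = quantize_bias_int32([0.02, -0.01], act_scale=0.02, weight_scale=0.005)
# s == 1e-4, q_bias == [200, -100]
```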

[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints

[QUARK-INFO]: Have aligned Concat node Concat_104 inputs

[QUARK-INFO]: Have aligned Concat node Concat_186 inputs

[QUARK-INFO]: Have aligned Concat node Concat_157 inputs

[QUARK-INFO]: Have aligned Concat node Concat_201 inputs

[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints

[QUARK-INFO]: Have aligned Concat node Concat_104 inputs

[QUARK-INFO]: Have aligned Concat node Concat_157 inputs

[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints

[QUARK-INFO]: Have aligned Concat node Concat_104 inputs

[QUARK-INFO]: Have aligned Concat node Concat_157 inputs

[QUARK-WARNING]: The number of adjustments has reached the limit. Please check the model

[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints

[QUARK-INFO]: Have aligned Concat node Concat_104 inputs

[QUARK-INFO]: Have aligned Concat node Concat_157 inputs

[QUARK-INFO]: Converted 618/618 custom QDQs to contributed QDQs

[QUARK-INFO]: The operation types and their corresponding quantities of the input float model is shown in the table below.

[QUARK-INFO]: The quantized information for all operation types is shown in the table below.
┌──────────────────────┬────────────────────────┐
│ Op Type │ Float Model │
├──────────────────────┼────────────────────────┤
│ Identity │ 3 │
│ Conv │ 64 │
│ Sigmoid │ 58 │
│ Mul │ 60 │
│ Split │ 9 │
│ Add │ 9 │
│ Concat │ 19 │
│ MaxPool │ 3 │
│ Constant │ 12 │
│ Resize │ 2 │
│ Reshape │ 5 │
│ Transpose │ 2 │
│ Softmax │ 1 │
│ Shape │ 1 │
│ Gather │ 1 │
│ Div │ 2 │
│ Slice │ 2 │
│ Sub │ 2 │
├──────────────────────┼────────────────────────┤
│ Quantized model path │ yolov8n_quantized.onnx │
└──────────────────────┴────────────────────────┘
┌───────────┬────────────┬──────────┬───────────┐
│ Op Type │ Activation │ Weights │ Bias │
├───────────┼────────────┼──────────┼───────────┤
│ Conv │ INT8(64) │ INT8(64) │ INT32(63) │
│ Sigmoid │ INT8(58) │ │ │
│ Mul │ INT8(58) │ │ │
│ Slice │ INT8(20) │ │ │
│ Add │ INT8(8) │ │ │
│ Concat │ INT8(19) │ │ │
│ MaxPool │ INT8(3) │ │ │
│ Resize │ INT8(2) │ │ │
│ Reshape │ INT8(5) │ │ │
│ Transpose │ INT8(2) │ │ │
│ Softmax │ INT8(1) │ │ │
│ Sub │ INT8(2) │ │ │
│ Div │ INT8(1) │ │ │
└───────────┴────────────┴──────────┴───────────┘

[QUARK-INFO]: The discrepancy between the operation types in the quantized model and the float model is due to the application of graph optimization.

Process finished with exit code 0
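For what it's worth, the actual failure here is the decode error at the top, not the quantization itself (which completed): on a Chinese-locale Windows, Python decodes the custom-op compiler's output with the default `gbk` codec, and byte `0x92` (a Windows-1252 right single quote, common in MSVC messages) followed by a non-GBK trail byte is an illegal multibyte sequence. A minimal reproduction, plus one possible workaround via CPython's UTF-8 mode (PEP 540) — whether Quark's compile step honors it is an assumption, and the script name is illustrative:

```python
# Reproduce the decode failure: 0x92 starts a GBK two-byte sequence,
# but the following space (0x20) is not a valid GBK trail byte.
try:
    b"compiler output with a \x92 in it".decode("gbk")
except UnicodeDecodeError as exc:
    print(exc)  # 'gbk' codec can't decode byte 0x92 ... illegal multibyte sequence

# Possible workaround (assumption: the Quark compile step inherits Python's
# I/O encoding): enable CPython's UTF-8 mode before quantizing.
#   cmd:        set PYTHONUTF8=1
#   PowerShell: $env:PYTHONUTF8 = "1"
# or launch the script with:  python -X utf8 quantize_yolov8n.py
```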
