Asking for help: quantizing yolov8n.onnx with Quark 0.10 runs into a problem. The question is about the following log:
[QUARK-INFO]: Checking custom ops library ...
[QUARK-WARNING]: The custom ops library D:\program\Anaconda\envs\AMD_Quark\lib\site-packages\quark\onnx\operators\custom_ops\lib\custom_ops.dll does NOT exist.
[QUARK-INFO]: Start compiling CPU version of custom ops library.
[QUARK-ERROR]: CPU version of custom ops library compilation failed:'gbk' codec can't decode byte 0x92 in position 7760: illegal multibyte sequence
[QUARK-WARNING]: Custom ops library compilation failed: CPU version of custom ops library compilation failed:'gbk' codec can't decode byte 0x92 in position 7760: illegal multibyte sequence.
[QUARK-INFO]: Checked custom ops library.
[QUARK-WARNING]: The custom ops library D:\program\Anaconda\envs\AMD_Quark\lib\site-packages\quark\onnx\operators\custom_ops\lib\custom_ops.dll does NOT exist.
[QUARK-WARNING]: Config has been replaced by QConfig. The old API will be removed in the next release.
[QUARK-WARNING]: The algorithm API is algo_config in QConfig. The old API will be removed in the next release.
[QUARK_INFO]: Time information:
2025-12-26 14:26:42.234468
[QUARK_INFO]: OS and CPU information:
system --- Windows
node --- DESKTOP-AK5QEFP
release --- 10
version --- 10.0.19045
machine --- AMD64
processor --- Intel64 Family 6 Model 183 Stepping 1, GenuineIntel
[QUARK_INFO]: Tools version information:
python --- 3.10.19
onnx --- 1.19.0
onnxruntime --- 1.22.1
quark.onnx --- 0.10+b8ad5c1
[QUARK_INFO]: Quantized Configuration information:
model_input --- yolov8n.onnx
model_output --- yolov8n_quantized.onnx
calibration_data_reader --- <__main__.ImageDataReader object at 0x00000287C75B7A90>
calibration_data_path --- None
quant_format --- QDQ
input_nodes --- []
output_nodes --- []
op_types_to_quantize --- []
extra_op_types_to_quantize --- []
per_channel --- False
reduce_range --- False
activation_type --- QInt8
weight_type --- QInt8
nodes_to_quantize --- []
nodes_to_exclude --- []
subgraphs_to_exclude --- []
optimize_model --- True
use_external_data_format --- False
calibrate_method --- CalibrationMethod.MinMax
execution_providers --- ['CPUExecutionProvider']
enable_npu_cnn --- False
enable_npu_transformer --- False
specific_tensor_precision --- False
debug_mode --- False
convert_fp16_to_fp32 --- False
convert_nchw_to_nhwc --- False
include_cle --- True
include_sq --- False
include_rotation --- False
include_fast_ft --- False
extra_options --- {'ActivationSymmetric': True, 'AlignSlice': False, 'FoldRelu': True, 'AlignConcat': True}
[QUARK-INFO]: The input ONNX model can create InferenceSession successfully
[QUARK-INFO]: Removed initializers from input
[QUARK-INFO]: Simplified model sucessfully
[QUARK-INFO]: Duplicate the shared bias initializers in the model for separate quantization use across different nodes!
[QUARK-INFO]: Loading model...
[QUARK-INFO]: The input ONNX model can run inference successfully
[QUARK-INFO]: Start CrossLayerEqualization...
[QUARK-INFO]: CrossLayerEqualization pattern num: 0
[QUARK-INFO]: Total CrossLayerEqualization steps: 1
[QUARK-INFO]: CrossLayerEqualization Done.
[QUARK-INFO]: optimize the model for better hardware compatibility.
[QUARK-INFO]: Found Split node Split_12. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_30. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_55. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_80. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_108. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_125. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_143. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_161. Replacing with Slice.
[QUARK-INFO]: Found Split node Split_221. Replacing with Slice.
[QUARK-WARNING]: The opset version is 12 < 17. Skipping fusing layer normalization.
[QUARK-WARNING]: The opset version is 12 < 20. Skipping fusing Gelu.
[QUARK-INFO]: Start running calibration on 300 samples with extra options {}...
[QUARK-INFO]: Data collection of CalibrationMethod.MinMax in progress. Runtime will depend on your model and data size.
[QUARK-INFO]: The calibration has been finished. It took 15.6s to complete.
[QUARK-INFO]: Remove QuantizeLinear & DequantizeLinear on certain operations(such as conv-relu).
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
[QUARK-INFO]: Have aligned Concat node Concat_20 inputs
[QUARK-INFO]: Have aligned Concat node Concat_45 inputs
[QUARK-INFO]: Have aligned Concat node Concat_70 inputs
[QUARK-INFO]: Have aligned Concat node Concat_88 inputs
[QUARK-INFO]: Have aligned Concat node Concat_104 inputs
[QUARK-INFO]: Have aligned Concat node Concat_115 inputs
[QUARK-INFO]: Have aligned Concat node Concat_121 inputs
[QUARK-INFO]: Have aligned Concat node Concat_132 inputs
[QUARK-INFO]: Have aligned Concat node Concat_139 inputs
[QUARK-INFO]: Have aligned Concat node Concat_186 inputs
[QUARK-INFO]: Have aligned Concat node Concat_150 inputs
[QUARK-INFO]: Have aligned Concat node Concat_157 inputs
[QUARK-INFO]: Have aligned Concat node Concat_201 inputs
[QUARK-INFO]: Have aligned Concat node Concat_168 inputs
[QUARK-INFO]: Have aligned Concat node Concat_216 inputs
[QUARK-INFO]: Have aligned Concat node Concat_220 inputs
[QUARK-INFO]: Have aligned Concat node Concat_250 inputs
[QUARK-INFO]: Have aligned Concat node Concat_254 inputs
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_13.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_31.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_38.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_49.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_56.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_63.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_74.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_81.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_109.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_126.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_144.
[QUARK-INFO]: Have adjusted bias scale == activation scale * weights scale in QDQ of Conv_162.
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
[QUARK-INFO]: Have aligned Concat node Concat_104 inputs
[QUARK-INFO]: Have aligned Concat node Concat_186 inputs
[QUARK-INFO]: Have aligned Concat node Concat_157 inputs
[QUARK-INFO]: Have aligned Concat node Concat_201 inputs
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
[QUARK-INFO]: Have aligned Concat node Concat_104 inputs
[QUARK-INFO]: Have aligned Concat node Concat_157 inputs
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
[QUARK-INFO]: Have aligned Concat node Concat_104 inputs
[QUARK-INFO]: Have aligned Concat node Concat_157 inputs
[QUARK-WARNING]: The number of adjustments has reached the limit. Please check the model
[QUARK-INFO]: Adjust the quantize info to meet the compiler constraints
[QUARK-INFO]: Have aligned Concat node Concat_104 inputs
[QUARK-INFO]: Have aligned Concat node Concat_157 inputs
[QUARK-INFO]: Converted 618/618 custom QDQs to contributed QDQs
[QUARK-INFO]: The operation types and their corresponding quantities of the input float model is shown in the table below.
[QUARK-INFO]: The quantized information for all operation types is shown in the table below.
┌──────────────────────┬────────────────────────┐
│ Op Type │ Float Model │
├──────────────────────┼────────────────────────┤
│ Identity │ 3 │
│ Conv │ 64 │
│ Sigmoid │ 58 │
│ Mul │ 60 │
│ Split │ 9 │
│ Add │ 9 │
│ Concat │ 19 │
│ MaxPool │ 3 │
│ Constant │ 12 │
│ Resize │ 2 │
│ Reshape │ 5 │
│ Transpose │ 2 │
│ Softmax │ 1 │
│ Shape │ 1 │
│ Gather │ 1 │
│ Div │ 2 │
│ Slice │ 2 │
│ Sub │ 2 │
├──────────────────────┼────────────────────────┤
│ Quantized model path │ yolov8n_quantized.onnx │
└──────────────────────┴────────────────────────┘
┌───────────┬────────────┬──────────┬───────────┐
│ Op Type │ Activation │ Weights │ Bias │
├───────────┼────────────┼──────────┼───────────┤
│ Conv │ INT8(64) │ INT8(64) │ INT32(63) │
│ Sigmoid │ INT8(58) │ │ │
│ Mul │ INT8(58) │ │ │
│ Slice │ INT8(20) │ │ │
│ Add │ INT8(8) │ │ │
│ Concat │ INT8(19) │ │ │
│ MaxPool │ INT8(3) │ │ │
│ Resize │ INT8(2) │ │ │
│ Reshape │ INT8(5) │ │ │
│ Transpose │ INT8(2) │ │ │
│ Softmax │ INT8(1) │ │ │
│ Sub │ INT8(2) │ │ │
│ Div │ INT8(1) │ │ │
└───────────┴────────────┴──────────┴───────────┘
[QUARK-INFO]: The discrepancy between the operation types in the quantized model and the float model is due to the application of graph optimization.
Process finished with exit code 0
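For what it's worth, the `'gbk' codec can't decode byte 0x92` failure looks unrelated to quantization itself: while compiling the custom-ops library, Python appears to decode the compiler's output with the Windows locale codec (GBK on a Chinese-locale system), and byte 0x92 (a cp1252 "smart quote") followed by certain bytes is not a valid GBK sequence. This is an assumption about the root cause; the bytes below are illustrative, not taken from the actual build output:

```python
# Sketch of the reported failure, assuming the build step decodes compiler
# output with the locale codec (GBK). As a GBK lead byte, 0x92 needs a trail
# byte in 0x40-0xFE, so a following space makes it an illegal sequence.
compiler_output = b"warning: \x92 deprecated API"  # illustrative bytes

try:
    compiler_output.decode("gbk")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)  # illegal multibyte sequence

# A tolerant decode keeps the rest of the message instead of crashing:
print(compiler_output.decode("gbk", errors="replace"))
```

As a possible workaround, launching Python in UTF-8 mode (e.g. `set PYTHONUTF8=1` in the shell before running the quantization script) makes subprocess and file reads default to UTF-8 instead of GBK, which often avoids this class of error on Windows. Note the library still fell back gracefully here: the custom-ops DLL is only warned about as missing, and the run finished with exit code 0.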