
Using QDQ ONNX to export an engine file (fp16 and int8), speed is not faster than using ONNX without QDQ to export an fp16 engine #4554


Description

@DDDaar

Hello! I used mtq.quantize to quantize the RFDETR model and exported a QDQ ONNX file. However, after converting it to a TensorRT engine, the engines built with --int8 and --fp16, although smaller in size than the original fp16 engine file, have almost the same inference speed. Could you please suggest any solutions? Thank you.
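For reference, the flow was roughly as follows. This is a minimal sketch, assuming NVIDIA's TensorRT Model Optimizer (the modelopt package, whose mtq.quantize API the issue names); `model`, `calib_loader`, the file name, and the input shape are placeholders, not taken from the report:

```python
import torch
import modelopt.torch.quantization as mtq

def forward_loop(model):
    # Feed a few calibration batches so the quantizers can collect ranges.
    for images in calib_loader:  # placeholder calibration dataloader
        model(images)

# Insert Q/DQ nodes using the default INT8 post-training-quantization config.
model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)

# Export the quantized model; the Q/DQ ops are baked into the ONNX graph.
dummy = torch.randn(1, 3, 640, 640)  # illustrative input shape
torch.onnx.export(model, dummy, "rfdetr_qdq.onnx", opset_version=17)
```

The engine would then be built with the flags mentioned above, along the lines of:

```
trtexec --onnx=rfdetr_qdq.onnx --int8 --fp16 --saveEngine=rfdetr_int8.engine
```

One way to see where the time goes is trtexec's per-layer reporting, e.g. `--dumpProfile --separateProfileRun` for per-layer timings and `--dumpLayerInfo --profilingVerbosity=detailed` for the precision each layer actually runs in; if most layers fall back to fp16, the int8 engine will not be faster.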

Labels

Module:Performance (General performance issues), triaged (Issue has been triaged by maintainers)
