When using Triton to serve a Paddle model exported to ONNX, GPU memory usage grows very large and inference is very slow.

Environment:
- Docker image: nvcr.io/nvidia/tritonserver:23.06-py3
- tritonclient: 2.58.0

config.pbtxt:

```
name: "paddleocr_v5"
backend: "onnxruntime"
default_model_filename: "model.onnx"
```
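With dynamic input shapes (common for OCR models), the ONNX Runtime CUDA memory arena can keep growing as it caches allocations for every new shape it sees. A possible mitigation, sketched below, is to constrain the arena via the CUDA execution accelerator parameters and enable arena shrinkage in the model config. This is an untested suggestion; the option names are taken from the Triton `onnxruntime_backend` documentation, and the `gpu_mem_limit` value here is an arbitrary example (4 GiB) that would need tuning for the actual model:

```
name: "paddleocr_v5"
backend: "onnxruntime"
default_model_filename: "model.onnx"

optimization {
  execution_accelerators {
    gpu_execution_accelerator: [
      {
        name: "cuda"
        # Cap the CUDA EP memory arena (bytes); example value, tune as needed.
        parameters { key: "gpu_mem_limit" value: "4294967296" }
        # Grow the arena only by what each request needs instead of doubling.
        parameters { key: "arena_extend_strategy" value: "kSameAsRequested" }
      }
    ]
  }
}

# Release unused arena memory back after each request on GPU 0.
parameters {
  key: "memory.enable_memory_arena_shrinkage"
  value: { string_value: "gpu:0" }
}
```

Arena shrinkage trades some per-request latency for a bounded memory footprint, so it is worth measuring throughput before and after. If memory still grows, padding inputs to a small set of fixed shapes on the client side can also reduce arena fragmentation.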