Description
I noticed that in the official SDXL demo the VAE is not compiled. However, when I converted the VAE to plan format myself, the resulting engine file was only 98.22 MB. Yet after deserializing the engine and querying device_memory_size, it reported 12079599616 bytes (≈ 12.08 GB), which means I don't have enough VRAM to load and run the entire model.
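For reference, this is roughly how I took the measurement (a minimal sketch; the plan path vae.plan is a placeholder for the converted engine):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

# Deserialize the VAE engine from the plan file produced by the conversion.
# "vae.plan" is a placeholder path for the converted engine.
with open("vae.plan", "rb") as f:
    runtime = trt.Runtime(TRT_LOGGER)
    engine = runtime.deserialize_cuda_engine(f.read())

# device_memory_size reports the device scratch memory an execution context
# will request at inference time, independent of the serialized plan size.
print(engine.device_memory_size)  # prints 12079599616 here
```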
Environment
TensorRT Version: 9.2/9.3
NVIDIA GPU: RTX 4080
NVIDIA Driver Version: 535.161.07
CUDA Version: 12.2
CUDNN Version:
Operating System: Ubuntu 22.04.3 LTS
Python Version (if applicable): 3.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 2.1.0
Baremetal or Container (if so, version):
Relevant Files
Model link: https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?: Yes
Can this model run on other frameworks? For example, run the ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
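For anyone reproducing this, the ONNX Runtime cross-check suggested by the template would look something like the following (vae.onnx is a placeholder name for the exported VAE model):

```
polygraphy run vae.onnx --onnxrt
```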