Description
I noticed that in the official SDXL demo the VAE is not compiled. However, when I converted the VAE to plan format myself, the resulting engine file was only 98.22 MB. Yet after deserializing the engine and querying device_memory_size, it reported 12079599616 bytes (≈ 12.08 GB), which means I don't have enough VRAM to load and run the entire model.
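For reference, this is roughly how I took the measurement (a minimal sketch; the plan path vae.plan is a placeholder for the converted engine):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

# Deserialize the VAE engine from the plan file produced by the conversion.
# "vae.plan" is a placeholder path for the converted engine.
with open("vae.plan", "rb") as f:
    runtime = trt.Runtime(TRT_LOGGER)
    engine = runtime.deserialize_cuda_engine(f.read())

# device_memory_size reports the device scratch memory an execution context
# will request at inference time, independent of the serialized plan size.
print(engine.device_memory_size)  # prints 12079599616 here
```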
Environment
TensorRT Version: 9.2/9.3
NVIDIA GPU: RTX 4080
NVIDIA Driver Version: 535.161.07
CUDA Version: 12.2
CUDNN Version:
Operating System: Ubuntu 22.04.3 LTS
Python Version (if applicable): 3.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 2.1.0
Baremetal or Container (if so, version):
Relevant Files
Model link: https://huggingface.co/stabilityai/stable-diffusion-xl-1.0-tensorrt
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?: Yes
Can this model run on other frameworks? For example, run the ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
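For anyone reproducing this, the ONNX Runtime cross-check suggested by the template would look something like the following (vae.onnx is a placeholder name for the exported VAE model):

```
polygraphy run vae.onnx --onnxrt
```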