failure of SigLIP2 FP32 to FP16 #4373

@yijun02

Description

I am trying to convert a SigLIP2 model to a TensorRT engine with FP16 enabled, but the cosine similarity between the ONNX output and the TensorRT output is only 0.6463.

I used the following code to export the image encoder to ONNX:

import torch
import torch.nn as nn
import torch.nn.functional as F
from open_clip import create_model_from_pretrained
import subprocess
from urllib.request import urlopen
from PIL import Image
import numpy as np

model_path = "model"

# load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = create_model_from_pretrained('hf-hub:timm/ViT-B-16-SigLIP2-256', device=device)
model.eval()
# export image encoder
class ImageEncoder(nn.Module):
    def __init__(self, model) -> None:
        super().__init__()
        self.model = model
    @torch.no_grad()
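    def forward(self, image):
        # Input: float32 NHWC pixels in [0, 255]; scale to [-1, 1] inside the graph.
        image = (image - 127.5) / 127.5
        # NHWC -> NCHW before calling the encoder.
        image = image.permute(0, 3, 1, 2)
        # Call the wrapped module rather than the global `model`.
        image_features = self.model.encode_image(image)
        return image_features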

image_encoder = ImageEncoder(model)
dummy_img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
dummy_img = np.array(dummy_img.resize((256, 256)).convert('RGB')).astype(np.float32)
dummy_img = torch.from_numpy(dummy_img).unsqueeze(0).to(device)

torch.onnx.export(image_encoder,
                  (dummy_img,),
                  f"{model_path}/img_en_ori.onnx",
                  export_params=True,
                  opset_version=16,
                  do_constant_folding=True,
                  input_names = ['img'],
                  output_names = ['image_feature'])

subprocess.run(["onnxsim", f"{model_path}/img_en_ori.onnx", f"{model_path}/img_en_ori.onnx"], check=True)
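
The wrapper above folds the pixel normalization into the exported graph as `(image - 127.5) / 127.5`. A minimal sketch, assuming the model's reference preprocessing scales pixels to [0, 1] and then normalizes with mean 0.5 and std 0.5 per channel (an assumption, not confirmed in this issue), shows that the two forms agree numerically, so this step is unlikely to be the source of the mismatch:

```python
import numpy as np

# All possible 8-bit pixel values, as float32 like the exported model input.
pixels = np.arange(256, dtype=np.float32)

# Normalization as written in the ImageEncoder wrapper.
in_graph = (pixels - 127.5) / 127.5

# Assumed reference pipeline: scale to [0, 1], then (x - mean) / std
# with mean = std = 0.5 (assumption about the model's preprocess config).
reference = (pixels / 255.0 - 0.5) / 0.5

max_abs_diff = float(np.max(np.abs(in_graph - reference)))
print(f"max abs difference: {max_abs_diff:.2e}")
```

This only compares the normalization arithmetic; resizing and interpolation differences between `preprocess` and the manual `resize((256, 256))` are not covered.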

I then used the following command to build the FP16 TensorRT engine:

/usr/src/tensorrt/bin/trtexec --onnx=model/img_en_ori.onnx --saveEngine=model/img_en_ori.engine --fp16
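
The issue does not show how the cosine similarity was computed. A minimal sketch of one way to compare two output tensors is given below; the helper name `cosine_similarity`, the 768-dimensional feature size, and the stand-in arrays are all assumptions for illustration, and in practice `onnx_out` and `trt_out` would be the outputs of the ONNX model and the TensorRT engine for the same input image.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity of two arrays, flattened and accumulated in float64."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against division by zero when one vector is all zeros.
    return float(np.dot(a, b) / max(denom, eps))


# Stand-in data (hypothetical): a random "ONNX" feature vector and a copy
# rounded to FP16 and back, which only models rounding of the final output.
rng = np.random.default_rng(0)
onnx_out = rng.standard_normal((1, 768)).astype(np.float32)
trt_out = onnx_out.astype(np.float16).astype(np.float32)

sim = cosine_similarity(onnx_out, trt_out)
print(f"cosine similarity: {sim:.6f}")
```

Rounding only the final output to FP16 leaves the similarity very close to 1, so a value as low as 0.6463 suggests the error arises inside the network in intermediate FP16 computations rather than from output precision alone.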

Environment

AGX with dustynv/l4t-pytorch:r36.4.0

NX with dustynv/l4t-pytorch:2.2-r35.4.1

ubuntu 22.04, RTX 3090 with nvcr.io/nvidia/pytorch:25.01-py3


Labels: Module:Accuracy (output mismatch between TensorRT and other frameworks); triaged (issue has been triaged by maintainers)
