[Bug]: Exported end2end ONNX model produces poor COCO validation results #146

@Fredrik00

Description

Before Reporting

  • I have pulled the latest code of the main branch and run it again, and the bug still exists.

  • I have read the README carefully and no error occurred during the installation process. (Otherwise, we recommend asking a question using the Question template.)

Search before reporting

  • I have searched the DAMO-YOLO issues and found no similar bugs.

OS

Ubuntu 24.04

Device

RTX 4090

CUDA version

12.5

TensorRT version

No response

Python version

3.10

PyTorch version

1.13.1

torchvision version

0.14.1

Describe the bug

I have been attempting to train and export a tinynasL25_S model on the COCO dataset, but I am getting very poor results from the exported model. I exported the model end2end, which, if I am interpreting the code correctly, should cap the output at 100 detections after NMS, yet in many cases I still get 1000+ detections. Detections do, however, appear to be filtered by a minimum confidence score of 0.05.
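
For reference, this is the post-NMS filtering behaviour I expected the end2end graph to bake in. This is a minimal NumPy sketch with array names of my own choosing, not the repo's actual export code:

```python
import numpy as np

def filter_detections(boxes, scores, score_thr=0.05, max_dets=100):
    """Keep detections scoring at least score_thr, then the top max_dets by score."""
    keep = scores >= score_thr
    boxes, scores = boxes[keep], scores[keep]
    order = np.argsort(-scores)[:max_dets]  # highest scores first, capped at max_dets
    return boxes[order], scores[order]
```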

After 60 epochs of training on the COCO dataset I get an evaluation score of:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.365
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.514
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.394
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.403
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.488
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.549
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.613
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.424
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.673
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.771

I export this model using:
python tools/converter.py -f configs/damoyolo_tinynasL25_S.py -c workdirs/damoyolo_tinynasL25_S/epoch_60_ckpt.pth --batch_size 1 --img_size 640 --end2end --ort
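
To sanity-check the exported graph itself, I inspect its inputs and outputs with onnxruntime. A quick sketch (the .onnx path is from my run, and the output names/shapes are whatever the converter produced, so I print them rather than assume them):

```python
import numpy as np
import onnxruntime as ort

# Path is where my converter run wrote the file; adjust to your own output.
sess = ort.InferenceSession("deploy/damoyolo_tinynasL25_S.onnx",
                            providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)                   # expect something like [1, 3, 640, 640]

dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
outs = sess.run(None, {inp.name: dummy})
for meta, out in zip(sess.get_outputs(), outs):
    print(meta.name, out.shape)              # does any output dim still exceed 100?
```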

But when evaluating the exported model on COCO I get the following results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.008
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.017
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.003
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.047

I have debugged the pre-processing and post-processing steps I added for running the ONNX model, and they look consistent with the demo script. I use the same COCO validation scripts for YOLOX and RT-DETR models with no issues there, so it looks to me as though something is wrong with the export script.
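
For completeness, this is the letterbox-style preprocessing I feed the ONNX model, which I believe matches tools/demo.py. The pad value and the absence of mean/std normalisation are assumptions on my part, copied from my own harness:

```python
import cv2
import numpy as np

def preprocess(img_bgr, size=640):
    """Letterbox-resize to size x size and convert to NCHW float32."""
    h, w = img_bgr.shape[:2]
    r = min(size / h, size / w)                              # scale, aspect ratio preserved
    resized = cv2.resize(img_bgr, (int(w * r), int(h * r)))
    padded = np.full((size, size, 3), 114, dtype=np.uint8)   # pad value 114: my assumption
    padded[:resized.shape[0], :resized.shape[1]] = resized
    x = padded.transpose(2, 0, 1)[None].astype(np.float32)   # 1 x 3 x 640 x 640
    return x, r                                              # keep r to map boxes back
```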

To Reproduce

  1. Train model using configs/damoyolo_tinynasL25_S.py
  2. Export model to ONNX using: python tools/converter.py -f configs/damoyolo_tinynasL25_S.py -c workdirs/damoyolo_tinynasL25_S/epoch_60_ckpt.pth --batch_size 1 --img_size 640 --end2end --ort
  3. Evaluate the ONNX model on the COCO validation set (a sketch of this evaluation step follows below)
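
For step 3, I score the exported model's detections with the standard pycocotools flow. The paths and the predictions JSON here are from my setup:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("datasets/coco/annotations/instances_val2017.json")  # my local path
# onnx_predictions.json: [{"image_id", "category_id", "bbox", "score"}, ...]
coco_dt = coco_gt.loadRes("onnx_predictions.json")
ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()  # prints an AP/AR table like the ones above
```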

Hyper-parameters/Configs

No response

Logs

No response

Screenshots

No response

Additional

No response
