RF-DETR 1.1.0 vs 1.2.1: same train parameters don't work after updating #301

@MateoPetitet

Description

Search before asking

  • I have searched the RF-DETR issues and found no similar bug report.

Bug

Hello!

I'm about to start a new training run for my custom model, but I run into a problem with the latest version (1.2.1) that I didn't have with the older 1.1.0. Here are my parameters; note that I'm running on CPU and system RAM for now, as I want to be sure everything works before renting a server:

model.train(dataset_dir=data_dir,
            device='cpu',
            num_workers=1,
            epochs=num_epochs, 
            batch_size=1, 
            grad_accum_steps=16,
            lr=learn, 
            output_dir=output_path, 
            resolution=1008, 
            weight_decay=1e-8, 
            checkpoint_interval=cp_interval, 
            resume=cp_resume,
            early_stopping=True,
            early_stopping_patience=15,
            num_queries=300, 
            num_select=300, 
            dec_layers=6
            )

I don't have any problem training with these parameters on the old version (I managed to start an epoch, even if it was very slow on CPU), but the latest version raises the following error: RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
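For what it's worth, from the PyTorch docs this error appears when a tensor created under torch.inference_mode() is later used in an operation that autograd needs to save for the backward pass. Here is a minimal sketch of just the PyTorch behavior (nothing RF-DETR-specific; the names are made up for illustration):

import torch

# 'w' is created under inference_mode, so it is an inference tensor
# that autograd is not allowed to save for backward.
with torch.inference_mode():
    w = torch.randn(3, 3)

x = torch.randn(3, requires_grad=True)
try:
    y = x @ w  # autograd tries to save 'w' for backward and fails here
    y.sum().backward()
except RuntimeError as e:
    print(e)  # "Inference tensors cannot be saved for backward. ..."

# The workaround the message suggests: clone to get a normal tensor.
y = x @ w.clone()
y.sum().backward()  # works

If that is what's happening, my guess is that somewhere in 1.2.1 a weight or buffer gets created or loaded under inference mode and is then reused during training, but I haven't been able to pinpoint where.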

I don't know where it comes from, and the full error output doesn't really help me figure out how to fix it. Any idea what's happening?

Thank you for reading!

Environment

RF-DETR : 1.1.0 / 1.2.1
OS : ZorinOS 17.3 Core (equivalent to Ubuntu 22.04 LTS)
Python : 3.10
PyTorch : 2.6.0
CPU : Intel Core i5-11400H

Minimal Reproducible Example

# run this with rf-detr 1.1.0, then with 1.2.1, to see the difference

import argparse
import os

import torch
from rfdetr import RFDETRLarge

# minimal argument parsing so the snippet is self-contained
parser = argparse.ArgumentParser()
parser.add_argument('--path')
parser.add_argument('--output')
parser.add_argument('--epochs', type=int)
parser.add_argument('--learning_rate', type=float)
parser.add_argument('--checkpoint_interval', type=int)
args = parser.parse_args()

device = torch.device('cpu')
data_dir = os.path.join(args.path)
output_path = os.path.join(args.output)
model = RFDETRLarge(pretrained=True)
num_epochs = args.epochs
learn = args.learning_rate
cp_interval = args.checkpoint_interval
cp_resume = None  # None = start fresh; set to a checkpoint path to resume
model.train(dataset_dir=data_dir,
            device='cpu',
            num_workers=1,
            epochs=num_epochs,
            batch_size=1,
            grad_accum_steps=16,
            lr=learn,
            output_dir=output_path,
            resolution=1008,
            weight_decay=1e-8,
            checkpoint_interval=cp_interval,
            resume=cp_resume,
            early_stopping=True,
            early_stopping_patience=15,
            num_queries=300,
            num_select=300,
            dec_layers=6
            )
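To compare the two versions, the script can be run unchanged in two fresh virtual environments, e.g. pip install rfdetr==1.1.0 in one and pip install rfdetr==1.2.1 in the other, against the same dataset.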

Additional

No response

Are you willing to submit a PR?

  • Yes, I'd like to help by submitting a PR!
