Description
There have been discussions about how the num_classes inferred from the dataset is increased by 1 internally when the model head is created. However, I don't think this is actually happening in my case.
I have a custom dataset with 33 categories, with category_ids ranging from 1 to 33. When I launch training, the log reports Namespace(num_classes=33....
Given that increment, I would expect the model head to have 34 classes in total. After training, I run the following code for ONNX export:
```python
from rfdetr import RFDETRMedium

model = RFDETRMedium(num_classes=33, pretrain_weights='training_output/checkpoint.pth')
model.export()
```
The export runs with the following log:
```
Using a different number of positional encodings than DINOv2, which means we're not loading DINOv2 backbone weights. This is not a problem if finetuning a pretrained RF-DETR model.
Using patch size 16 instead of 14, which means we're not loading DINOv2 backbone weights. This is not a problem if finetuning a pretrained RF-DETR model.
Loading pretrain weights
num_classes mismatch: pretrain weights has 32 classes, but your model has 33 classes
reinitializing detection head with 32 classes
```
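To sanity-check what the checkpoint head actually contains, the state dict can be inspected directly. A minimal sketch, assuming the checkpoint nests its weights under a "model" key and that the classification head is named class_embed as in other DETR-style codebases (both are assumptions, I haven't verified RF-DETR's exact key names):

```python
import torch

# Load the training checkpoint on CPU so no GPU is needed for inspection.
ckpt = torch.load('training_output/checkpoint.pth', map_location='cpu')

# DETR-style training scripts usually nest weights under a "model" key
# (an assumption here; fall back to the top level otherwise).
state = ckpt.get('model', ckpt)

# Print the shape of anything that looks like a classification head.
# For DETR-style models, the first dimension of the class_embed weight
# is the number of output class slots.
for name, tensor in state.items():
    if 'class_embed' in name:
        print(name, tuple(tensor.shape))
```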
I have read that the num_classes reported for the checkpoint is decreased by 1, which would mean the checkpoint actually has 33 classes. The resulting ONNX model also has a labels output of shape [300, 33]. Why not 34?
When running inference, the predicted labels appear to be 1-based (so class_id 1 corresponds to category_id=1 in my annotations). To me this implies (see the sketch after this list):
- a dummy value at position 0 of the output tensor, and
- no predictions ever for category_id=33, which would sit at position 33; but in a 0-based tensor of length 33, that index is out of range (valid indices are 0-32).
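A minimal sketch of that second point, assuming the exported model's labels come from an argmax over the 33 class scores per query (hypothetical data below, not actual model output):

```python
import numpy as np

# Hypothetical class scores for 300 queries over 33 slots, standing in
# for the exported model's labels output of shape [300, 33].
scores = np.random.rand(300, 33)

# Predicted labels are argmax indices, so they can only ever be 0..32.
pred_labels = scores.argmax(axis=1)
assert pred_labels.min() >= 0 and pred_labels.max() <= 32

# If slot 0 is a dummy/background and labels are otherwise 1-based,
# category_id=33 would need index 33, which does not exist.
```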