Open
Description
I'm trying to come up with a design that uses DINOv3 features (from the smallest distilled transformer) and produces relatively precise segmentation masks. Does anyone have recommendations? The two obvious options both have drawbacks:
- Linear probing, or any head that works with intermediate layers, operates on features at 1/16 of the input resolution, which results in coarse masks
- The Mask2Former approach is quite heavy, which defeats the purpose of using a lightweight distilled model
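
To illustrate the first point, here is a rough sketch of why a per-patch linear probe gives blocky masks. It does not load DINOv3: random vectors stand in for the backbone's patch features, the patch size of 16 is assumed from the 1/16 figure, and `linear_probe`, `upsample_nearest`, and `make_demo` are made-up helper names. Real pipelines usually upsample the logits bilinearly rather than repeating labels, which smooths the edges but adds no detail.

```python
import random

PATCH = 16  # assumed patch size, matching the 1/16 feature resolution


def linear_probe(feats, weight, bias):
    """Classify each patch independently: feats[h][w] is a D-dim vector."""
    grid = []
    for row in feats:
        labels = []
        for f in row:
            logits = [
                sum(w_i * x_i for w_i, x_i in zip(w, f)) + b
                for w, b in zip(weight, bias)
            ]
            # argmax over classes
            labels.append(max(range(len(logits)), key=logits.__getitem__))
        grid.append(labels)
    return grid


def upsample_nearest(grid, factor):
    """Repeat each patch label over a factor x factor block of pixels."""
    out = []
    for row in grid:
        wide = [label for label in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def make_demo(height=64, width=96, dim=8, num_classes=3, seed=0):
    """Random stand-ins for backbone features and probe weights."""
    rng = random.Random(seed)
    gh, gw = height // PATCH, width // PATCH
    feats = [
        [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(gw)]
        for _ in range(gh)
    ]
    weight = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_classes)]
    bias = [0.0] * num_classes
    return feats, weight, bias


feats, weight, bias = make_demo()
coarse = linear_probe(feats, weight, bias)   # one label per 16x16 patch
mask = upsample_nearest(coarse, PATCH)       # back to input resolution

print(len(coarse), len(coarse[0]))  # 4 6
print(len(mask), len(mask[0]))      # 64 96
```

For a 64x96 input the probe only makes 4x6 = 24 independent decisions, so every 16x16 pixel block in the final mask shares one label. That is the coarseness I'd like to avoid without adding a heavy decoder.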
I'd appreciate any advice.