Adapting BiRefNet for Multi-Class (7+ Classes) High-Resolution Semantic Segmentation #295
Description
First and foremost, I want to express my sincere gratitude for your outstanding work on BiRefNet.
I am working on a multi-class semantic segmentation task (7 classes now, potentially more in future experiments) with high-resolution inputs (e.g., 1920×1920 or 2048×2048). I would like to fine-tune BiRefNet for this scenario by replacing the original binary foreground/background prediction with multi-class prediction. I would greatly appreciate your guidance on the key modifications needed, such as:
Changes to the output head (e.g., adjusting channel numbers from 1 to N for N classes, replacing sigmoid with softmax, etc.);
Adjustments to loss functions (e.g., switching from BCE/Dice loss to cross-entropy + multi-class Dice loss);
Modifications to the model’s backbone/feature fusion modules (if any) to better handle multi-class features;
Key hyperparameter tweaks (learning rate, batch size, data augmentation, etc.) for multi-class high-resolution training.
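To make the first two items concrete, here is a minimal PyTorch sketch of an N-channel output head and a cross-entropy + soft multi-class Dice loss. It is an illustration only, not BiRefNet's actual code: `MultiClassHead`, `ce_dice_loss`, the 64-channel feature map, and the `dice_weight` parameter are hypothetical names and values, and where the head would attach inside BiRefNet's decoder is an assumption to be confirmed against the repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiClassHead(nn.Module):
    """Hypothetical stand-in for a 1-channel binary output layer."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        # A 1x1 conv maps decoder features to one logit map per class.
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Return raw logits; softmax is applied inside the loss, not here.
        return self.classifier(features)


def ce_dice_loss(logits, target, dice_weight=1.0, eps=1e-6):
    """Cross-entropy plus soft multi-class Dice.

    logits: (B, N, H, W) raw scores; target: (B, H, W) int64 class indices.
    """
    num_classes = logits.shape[1]
    ce = F.cross_entropy(logits, target)

    probs = logits.softmax(dim=1)
    # one_hot yields (B, H, W, N); move the class axis to dim 1.
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).to(probs.dtype)

    # Per-class Dice, aggregated over the batch and spatial dimensions.
    dims = (0, 2, 3)
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = (2 * intersection + eps) / (cardinality + eps)

    return ce + dice_weight * (1 - dice.mean())


if __name__ == "__main__":
    torch.manual_seed(0)
    head = MultiClassHead(in_channels=64, num_classes=7)
    features = torch.randn(2, 64, 32, 32)      # placeholder decoder features
    target = torch.randint(0, 7, (2, 32, 32))  # placeholder label map
    logits = head(features)
    loss = ce_dice_loss(logits, target)
    loss.backward()
    print(logits.shape, loss.item())
```

At inference, the per-pixel class would be `logits.argmax(dim=1)`. Keeping softmax out of the head lets `F.cross_entropy` work on raw logits, which is the numerically stable path.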
If adapting BiRefNet for multi-class semantic segmentation is relatively complex (e.g., requires major structural changes), could you also recommend other state-of-the-art networks that excel at high-resolution multi-class semantic segmentation (with strong detail preservation)?