Adapting BiRefNet for Multi-Class (7+ Classes) High-Resolution Semantic Segmentation

First and foremost, I want to express my sincere gratitude for your outstanding work on BiRefNet.

I am currently working on a multi-class semantic segmentation task (7 classes, potentially more in future experiments) with high-resolution inputs (e.g., 1920×1920, 2048×2048). I would like to fine-tune BiRefNet for this scenario by replacing the original binary foreground/background prediction with multi-class prediction. I would greatly appreciate your guidance on the key modifications needed, such as:
- Changes to the output head (e.g., increasing the output channels from 1 to N for N classes, replacing sigmoid with softmax);
- Adjustments to the loss functions (e.g., switching from BCE/Dice loss to cross-entropy + multi-class Dice loss);
- Modifications to the model’s backbone/feature fusion modules (if any) to better handle multi-class features;
- Key hyperparameter tweaks (learning rate, batch size, data augmentation, etc.) for multi-class high-resolution training.
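
To make the head and loss items above concrete, here is a minimal sketch of the math I have in mind: a per-pixel softmax over N class scores, then cross-entropy plus a soft multi-class Dice term. It is plain Python on purpose so it stays framework-agnostic; the names `log_softmax` and `ce_plus_dice` are my own placeholders and not part of BiRefNet, and a real training loop would use vectorized tensor operations instead. Please correct me if this is not the right formulation.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over one pixel's per-class scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def ce_plus_dice(logits, labels, num_classes, eps=1e-6):
    """Return (cross_entropy, dice_loss) for a flattened image.

    logits: one list of `num_classes` raw scores per pixel
            (what an N-channel head would output before softmax).
    labels: one integer class id per pixel.
    """
    log_probs = [log_softmax(pixel) for pixel in logits]
    probs = [[math.exp(lp) for lp in pixel] for pixel in log_probs]

    # Cross-entropy: mean negative log-probability of the true class.
    ce = -sum(lp[y] for lp, y in zip(log_probs, labels)) / len(labels)

    # Soft Dice per class, then averaged over all classes.
    dice_sum = 0.0
    for c in range(num_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for y in labels if y == c)
        dice_sum += (2 * intersection + eps) / (pred_mass + true_mass + eps)
    dice_loss = 1.0 - dice_sum / num_classes
    return ce, dice_loss


# Toy example: 7 pixels, 7 classes, completely uninformative scores.
uniform_logits = [[0.0] * 7 for _ in range(7)]
labels = list(range(7))
ce, dice = ce_plus_dice(uniform_logits, labels, num_classes=7)
total_loss = ce + dice  # equal weighting is an assumption, not a recommendation
```

One thing I am unsure about in this formulation: a class that is absent from a crop still contributes to the Dice average, which may matter for high-resolution tiles that contain only a few of the 7 classes.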

If adapting BiRefNet for multi-class semantic segmentation is relatively complex (e.g., requires major structural changes), could you also recommend other state-of-the-art networks that excel at high-resolution multi-class semantic segmentation (with strong detail preservation)?

Adapting BiRefNet for Multi-Class (7+ Classes) High-Resolution Semantic Segmentation #295
