Adapting BiRefNet for Multi-Class (7+ Classes) High-Resolution Semantic Segmentation #295

@duckzhao

First and foremost, I want to express my sincere gratitude for your outstanding work on BiRefNet.

I am currently working on a multi-class semantic segmentation task (7 classes, potentially more in future experiments) with high-resolution inputs (e.g., 1920×1920 or 2048×2048). I would like to fine-tune BiRefNet for this scenario, replacing the original binary foreground/background prediction with multi-class prediction. I would greatly appreciate your guidance on the key modifications needed to adapt BiRefNet for multi-class semantic segmentation, such as:
- Changes to the output head (e.g., increasing the output channels from 1 to N for N classes, replacing sigmoid with softmax);
- Adjustments to the loss functions (e.g., switching from BCE/Dice loss to cross-entropy + multi-class Dice loss);
- Modifications to the model's backbone or feature fusion modules (if any are needed) to better handle multi-class features;
- Key hyperparameter tweaks (learning rate, batch size, data augmentation, etc.) for multi-class high-resolution training.
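
To make the first two items concrete, here is a minimal PyTorch sketch of the kind of change being asked about. It is not BiRefNet's actual code: `MultiClassHead`, `multiclass_dice_loss`, `segmentation_loss`, the `in_channels=16` feature width, and the `dice_weight` parameter are all hypothetical names and values chosen for illustration, and where such a head would attach inside BiRefNet's decoder is an open question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiClassHead(nn.Module):
    """Hypothetical head: a 1x1 conv that emits one logit map per class."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Raw logits of shape (B, num_classes, H, W); softmax is applied
        # inside the losses, not here.
        return self.proj(features)


def multiclass_dice_loss(logits: torch.Tensor, target: torch.Tensor,
                         eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss averaged over classes.

    logits: (B, C, H, W) float tensor; target: (B, H, W) int64 class indices.
    """
    num_classes = logits.shape[1]
    probs = logits.softmax(dim=1)
    # One-hot encode the labels and move the class axis to dim 1.
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).to(probs.dtype)
    dims = (0, 2, 3)  # reduce over batch and spatial axes, keep classes
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = (2 * intersection + eps) / (cardinality + eps)
    return 1 - dice.mean()


def segmentation_loss(logits: torch.Tensor, target: torch.Tensor,
                      dice_weight: float = 1.0) -> torch.Tensor:
    """Cross-entropy plus weighted multi-class Dice (the weight is an assumption)."""
    ce = F.cross_entropy(logits, target)
    return ce + dice_weight * multiclass_dice_loss(logits, target)


if __name__ == "__main__":
    torch.manual_seed(0)
    head = MultiClassHead(in_channels=16, num_classes=7)
    features = torch.randn(2, 16, 64, 64)       # stand-in for decoder features
    target = torch.randint(0, 7, (2, 64, 64))   # per-pixel class indices
    logits = head(features)
    loss = segmentation_loss(logits, target)
    prediction = logits.argmax(dim=1)           # (B, H, W) class map at inference
    print(logits.shape, prediction.shape, loss.item())
```

In this sketch the ground truth is a single-channel map of class indices rather than a binary mask, so the data loader would also need to change accordingly.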

If adapting BiRefNet for multi-class semantic segmentation is relatively complex (e.g., requires major structural changes), could you also recommend other state-of-the-art networks that excel at high-resolution multi-class semantic segmentation (with strong detail preservation)?
