
BalanceBenchmark: A Survey for Multimodal Imbalance Learning

Paper

BalanceBenchmark: A Survey for Multimodal Imbalance Learning
Shaoxuan Xu, Menglu Cui, Chengxiang Huang, Hongfa Wang and Di Hu

If you find this repository useful, please cite our paper and the corresponding toolkit:

@misc{xu2025balancebenchmarksurveymultimodalimbalancelearning,
      title={BalanceBenchmark: A Survey for Multimodal Imbalance Learning}, 
      author={Shaoxuan Xu and Menglu Cui and Chengxiang Huang and Hongfa Wang and Di Hu},
      year={2025},
      eprint={2502.10816},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2502.10816}, 
}

Overview

Multimodal learning has gained attention for its capacity to integrate information from different modalities. However, it is often hindered by the multimodal imbalance problem, where certain modalities disproportionately dominate while others remain underutilized. Although recent studies have proposed various methods to alleviate this problem, they lack comprehensive and fair comparisons. To facilitate this field, we introduce BalanceBenchmark, a systematic and unified benchmark for evaluating multimodal imbalance learning methods. BalanceBenchmark spans 17 algorithms and 7 datasets, providing a comprehensive framework for method evaluation and comparison.

To accompany BalanceBenchmark, we release BalanceMM, a standardized toolkit that implements 17 state-of-the-art approaches spanning four research directions: data-level adjustments, feed-forward modifications, objective adaptations, and optimization-based methods. The toolkit provides a single pipeline that unifies innovations in fusion paradigms, optimization objectives, and training approaches, and it simplifies the research workflow through:

  • Standardized data loading for 7 multimodal datasets
  • Unified implementation of various imbalance learning methods
  • Automated experimental pipeline from training to evaluation
  • Comprehensive metrics for assessing performance, imbalance degree, and complexity

BalanceMM is designed with modularity and extensibility in mind, enabling easy integration of new methods and datasets. It provides researchers with the necessary tools to reproduce experiments, conduct fair comparisons, and develop new approaches for addressing the multimodal imbalance problem.

Datasets currently supported

  • Audio-Visual: KineticsSounds, CREMA-D, BalancedAV, VGGSound
  • RGB-Optical Flow: UCF-101
  • Image-Text: FOOD-101
  • Audio-Visual-Text: CMU-MOSEI

To add a new dataset:

  1. Go to balancemm/datasets/
  2. Create a new Python file and a new dataset class
  3. Implement the required data loading and preprocessing methods in the corresponding _dataset.py file (a minimal sketch follows this list)
  4. Add configuration file in balancemm/configs/dataset_config.yaml
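
A minimal sketch of what steps 2 and 3 could look like is shown below. It assumes the toolkit's datasets follow the standard PyTorch Dataset interface and that features have been pre-extracted to .pt files; the class name, file layout, and annotation format are illustrative only and may differ from the actual BalanceMM conventions.

# balancemm/datasets/my_dataset.py (hypothetical file and class names)
import os
import torch
from torch.utils.data import Dataset

class MyAVDataset(Dataset):
    """Hypothetical audio-visual dataset returning one tensor per modality."""

    def __init__(self, data_root: str, split: str = "train"):
        self.data_root = data_root
        self.split = split
        # Assumed annotation format: one "<sample_id> <label>" pair per line.
        ann_path = os.path.join(data_root, f"{split}.txt")
        with open(ann_path) as f:
            self.samples = [line.strip().split() for line in f if line.strip()]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample_id, label = self.samples[idx]
        # Load pre-extracted features; a real dataset would decode audio/video here.
        audio = torch.load(os.path.join(self.data_root, "audio", f"{sample_id}.pt"))
        visual = torch.load(os.path.join(self.data_root, "visual", f"{sample_id}.pt"))
        return {"audio": audio, "visual": visual, "label": int(label)}

The matching entry in balancemm/configs/dataset_config.yaml would then declare the dataset name, data root, and modalities, following the pattern of the existing datasets.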

Algorithms currently supported

  • Data-level methods: Modality-valuation
  • Feed-forward methods: MLA, OPM, Greedy, AMCo
  • Objective methods: MMCosine, UMT, MBSD, CML, MMPareto, GBlending, LFM
  • Optimization methods: OGM, AGM, PMR, Relearning, ReconBoost

See Section 3 in our paper for detailed descriptions of each method.

To add a new method:

  1. Determine which category your method belongs to:
  • "Data": methods that adjust data processing
  • "Feed-forward": methods that modify the network architecture
  • "Objective": methods that adapt the learning objective
  • "Optimization": methods that adjust the optimization process
  2. Go to balancemm/trainer/
  3. Create a new Python file implementing your method
  4. Implement the corresponding _trainer.py file based on base_trainer.py; you will usually need to override trainer.training_step (a minimal sketch follows this list)
  5. Make any other changes required by your method's category:
  • If your method belongs to "Data", go to balancemm/datasets/__init__.py and adjust the data processing accordingly.
  • If your method belongs to "Feed-forward", go to balancemm/models/avclassify_model.py, create a new model class, and override the relevant functions.
  • If your method belongs to "Objective", you usually only need to modify the trainer.
  • If your method belongs to "Optimization", you may need to modify the trainer or any of the parts mentioned above.
  • You can also modify any combination of the parts mentioned above, depending on your method.
  6. Add a configuration file in balancemm/configs/trainer_config.yaml
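
As a rough illustration of step 4, a new trainer typically subclasses the base trainer and overrides its training step. The sketch below is hypothetical: the base class name, the training_step signature, and the loss-reweighting logic are placeholders standing in for your own method, and the actual interface in base_trainer.py may differ.

# balancemm/trainer/my_method_trainer.py (hypothetical file and class names)
import torch.nn.functional as F
from .base_trainer import BaseTrainer  # assumed base class name

class MyMethodTrainer(BaseTrainer):
    """Hypothetical 'Objective'-style trainer that reweights per-modality losses."""

    def __init__(self, *args, alpha: float = 0.5, **kwargs):
        super().__init__(*args, **kwargs)
        self.alpha = alpha  # illustrative method-specific hyper-parameter

    def training_step(self, model, batch):
        # Assumed model output: per-modality logits plus fused logits.
        out = model(batch)
        label = batch["label"]

        fused_loss = F.cross_entropy(out["fused"], label)
        audio_loss = F.cross_entropy(out["audio"], label)
        visual_loss = F.cross_entropy(out["visual"], label)

        # Placeholder objective: balance unimodal terms against the fused loss.
        return fused_loss + self.alpha * (audio_loss + visual_loss)

After registering the new trainer, its hyper-parameters go into balancemm/configs/trainer_config.yaml alongside the existing methods.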

Installation

git clone https://github.com/GeWu-Lab/BalanceBenchmark.git
cd BalanceBenchmark
conda create -n balancemm python=3.10
conda activate balancemm
pip install torch==1.12.1+cu113
pip install -r requirements.txt
pip install lightning==2.0.0
pip install lightning-cloud==0.5.68
pip install lightning-utilities==0.11.2

Experiment

To run experiments, you'll need to download the datasets from their publicly available sources. After downloading, place the datasets in your preferred directory and update the dataset path in your configuration file.

You can run any experiment using a single command line:

python -m balancemm \
    --trainer [trainer_name] \
    --dataset [dataset_name] \
    --model [model_name] \
    --hyper-params [param_file.yaml] \
    --device [0/cpu]

For example, to run OGM on the CREMA-D dataset:

python -m balancemm \
    --trainer OGM \
    --dataset CREMAD \
    --model BaseClassifier \
    --alpha 0.5 \
    --device 0
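
If you want to sweep a hyper-parameter over several runs, one option is to drive the same command from a small script. The snippet below is only a convenience wrapper around the CLI shown above; the alpha values are chosen arbitrarily for illustration.

# sweep_alpha.py -- hypothetical helper that reruns the CLI with different alpha values
import subprocess

for alpha in [0.1, 0.3, 0.5, 0.8]:
    cmd = [
        "python", "-m", "balancemm",
        "--trainer", "OGM",
        "--dataset", "CREMAD",
        "--model", "BaseClassifier",
        "--alpha", str(alpha),
        "--device", "0",
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)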

Results

We have conducted comprehensive experiments with the proposed BalanceBenchmark on 7 datasets. The results indicate that almost all of the evaluated methods outperform the Baseline in terms of accuracy and F1 score, demonstrating that the multimodal imbalance problem is prevalent across various scenarios.
