MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction
If you find our paper useful in your research, please consider citing:
- Conference Paper (APSIPA ASC 2025):
@inproceedings{ju2025efficient,
  title={Efficient Generative Adversarial Networks for Color Document Image Enhancement and Binarization Using Multi-Scale Feature Extraction},
  author={Ju, Rui-Yang and Wong, KokSheik and Chiang, Jen-Shiun},
  booktitle={2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  pages={1898--1903},
  year={2025},
  organization={IEEE}
}
- Journal Paper (Under Review):
@article{ju2025mfegan,
  title={MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction},
  author={Ju, Rui-Yang and Wong, KokSheik and Jin, Yanlin and Chiang, Jen-Shiun},
  journal={arXiv preprint arXiv:2512.14114},
  year={2025}
}
You can download the dataset used in this experiment from OneDrive.
Our training set (143) includes:
DIBCO 2009 (10), H-DIBCO 2010 (10), H-DIBCO 2012 (14), Bickley Diary (7), PHIBD (15), SMADI (87).
Our test set (102) includes:
DIBCO 2011 (16), DIBCO 2013 (16), H-DIBCO 2014 (10), H-DIBCO 2016 (10), DIBCO 2017 (20), H-DIBCO 2018 (10), DIBCO 2019 (20).
Put the training set into ./Trainset/ and the test set into ./Testset/.
- Two-Fold Cross Validation: I. training set (15), test set (20); II. training set (20), test set (15).
- Five-Fold Cross Validation: training set (4 folds), test set (1 fold).
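The five-fold protocol above trains on 4 folds and tests on the remaining one, rotating through all 5 folds. A minimal sketch of that rotation (file names and the round-robin assignment are illustrative; the repo's actual split logic may differ):

```python
# Sketch of the five-fold protocol: 4 folds for training, 1 for testing.
def five_fold_splits(files):
    folds = [files[i::5] for i in range(5)]  # round-robin into 5 folds
    for k in range(5):
        test = folds[k]
        train = [f for i, fold in enumerate(folds) if i != k for f in fold]
        yield train, test

images = [f"img_{i:02d}.png" for i in range(10)]
splits = list(five_fold_splits(images))
print(len(splits))  # 5 train/test pairs, each with 8 train and 2 test images
```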
- NVIDIA GPU + CUDA + cuDNN
- Create a new Conda environment:
conda env create -f environment.yaml
Train with the U-Net generator (EfficientNetV2 backbone):
cd unet_effnetv2
python image_to_256.py
python image_to_512.py
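image_to_256.py and image_to_512.py presumably cut each document image into fixed-size patches before training. A minimal NumPy sketch of 256×256 tiling — the function name and zero-padding strategy are assumptions, not the repo's actual implementation:

```python
import numpy as np

# Hypothetical sketch: tile an image into 256x256 patches, zero-padding the
# bottom/right borders so both dimensions become multiples of the patch size.
def to_patches(img, size=256):
    h, w = img.shape[:2]
    ph, pw = -h % size, -w % size  # padding needed on bottom and right
    padded = np.pad(img, ((0, ph), (0, pw)) + ((0, 0),) * (img.ndim - 2))
    patches = []
    for y in range(0, padded.shape[0], size):
        for x in range(0, padded.shape[1], size):
            patches.append(padded[y:y + size, x:x + size])
    return patches

demo = np.zeros((300, 500, 3), dtype=np.uint8)
print(len(to_patches(demo)))  # 4 patches: 2 rows x 2 columns after padding to 512x512
```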
python train_stage2_unet.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
python predict_for_stage3_unet.py --base_model_name tu-efficientnetv2_rw_s --lambda_loss 25
python train_stage3_unet.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
python train_stage3_unet_resize.py --epochs 150 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 16
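The --lambda_loss flag (25 above) most likely weights a pixel-level reconstruction term against the adversarial term in the generator objective. A hedged NumPy sketch under that assumption — the actual loss composition is defined in the repo's train_stage*.py scripts and may differ:

```python
import numpy as np

# Assumed generator objective: adversarial BCE + lambda_loss * L1 reconstruction.
def generator_loss(fake_scores, gen_imgs, gt_imgs, lambda_loss=25.0):
    # Adversarial term: push discriminator scores on generated patches toward 1 ("real").
    adv = -np.mean(np.log(fake_scores + 1e-8))
    # Reconstruction term: pixel-wise L1 distance to the ground-truth binarized image.
    l1 = np.mean(np.abs(gen_imgs - gt_imgs))
    return adv + lambda_loss * l1
```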
Train with the UNet++ generator (EfficientNetV2 backbone):
cd unetplusplus_effnetv2
python image_to_256.py
python image_to_512.py
python train_stage2.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
python predict_for_stage3.py --base_model_name tu-efficientnetv2_rw_s --lambda_loss 25
python train_stage3.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
python train_stage3_resize.py --epochs 150 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 16
Evaluate the trained models:
python eval_stage3_all_unet.py --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
python eval_stage3_all.py --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
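The evaluation scripts presumably report standard DIBCO metrics (F-measure, PSNR, DRD). A minimal sketch of the pixel-level F-measure on binary maps, as an illustration only — not the repo's evaluation code:

```python
import numpy as np

# Illustrative F-measure for DIBCO-style evaluation; `pred` and `gt` are
# boolean maps where True marks text (foreground) pixels.
def f_measure(pred, gt):
    tp = np.logical_and(pred, gt).sum()          # correctly detected text pixels
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    return 2 * precision * recall / max(precision + recall, 1e-8)
```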



