
MFE-GAN

MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction


Citation

If you find our paper useful in your research, please consider citing:

  • Conference Paper (APSIPA ASC 2025):
  @inproceedings{ju2025efficient,
    title={Efficient Generative Adversarial Networks for Color Document Image Enhancement and Binarization Using Multi-Scale Feature Extraction},
    author={Ju, Rui-Yang and Wong, KokSheik and Chiang, Jen-Shiun},
    booktitle={2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
    pages={1898--1903},
    year={2025},
    organization={IEEE}
  }
  • Journal Paper (Under Review):
  @article{ju2025mfegan,
    title={MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction},
    author={Ju, Rui-Yang and Wong, KokSheik and Jin, Yanlin and Chiang, Jen-Shiun},
    journal={arXiv preprint arXiv:2512.14114},
    year={2025}
  }

Method

Datasets

You can download the datasets used in our experiments from OneDrive.

Benchmark

Nabuco

  • Two-Fold Cross-Validation: I. training set (15), test set (20); II. training set (20), test set (15).

CMATERdb

  • Five-Fold Cross-Validation: training set (4), test set (1); see the split sketch below.
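
A minimal sketch of generating such a five-fold split with scikit-learn's KFold is shown below. The flat cmaterdb/ directory, the .png extension, and the fixed random seed are illustrative assumptions and not the repository's actual split script.

  # Illustrative five-fold split (not the repository's split code).
  from pathlib import Path
  from sklearn.model_selection import KFold

  image_paths = sorted(Path("cmaterdb").glob("*.png"))  # hypothetical layout
  kfold = KFold(n_splits=5, shuffle=True, random_state=0)

  for fold, (train_idx, test_idx) in enumerate(kfold.split(image_paths), start=1):
      train_set = [image_paths[i] for i in train_idx]  # roughly 4/5 of the images
      test_set = [image_paths[i] for i in test_idx]    # roughly 1/5 of the images
      print(f"Fold {fold}: {len(train_set)} train / {len(test_set)} test")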

Result (DIBCO 2019-009)

Environment

  • NVIDIA GPU + CUDA + cuDNN
  • Create a new Conda environment:
  conda env create -f environment.yaml
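
The training scripts use timm backbones (e.g. tu-efficientnetv2_rw_s), which implies a PyTorch environment. Assuming environment.yaml installs PyTorch, the optional snippet below checks that the GPU is visible before training; it is not part of the repository.

  # Optional sanity check: confirm PyTorch can see the GPU.
  import torch

  print("CUDA available:", torch.cuda.is_available())
  if torch.cuda.is_available():
      print("Device:", torch.cuda.get_device_name(0))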

Train

UNet & EfficientNetV2-S

  cd unet_effnetv2
  python image_to_256.py
  python image_to_512.py
  python train_stage2_unet.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
  python predict_for_stage3_unet.py --base_model_name tu-efficientnetv2_rw_s --lambda_loss 25
  python train_stage3_unet.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
  python train_stage3_unet_resize.py --epochs 150 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 16

UNet++ & EfficientNetV2-S

  cd unetplusplus_effnetv2
  python image_to_256.py
  python image_to_512.py
  python train_stage2.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
  python predict_for_stage3.py --base_model_name tu-efficientnetv2_rw_s --lambda_loss 25
  python train_stage3.py --epochs 10 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64
  python train_stage3_resize.py --epochs 150 --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 16
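
In both pipelines, image_to_256.py and image_to_512.py prepare fixed-size inputs before the training stages. The sketch below only illustrates the general idea of cutting a document image into uniform patches; the function name, white padding, and output naming are assumptions for illustration and do not reproduce the repository's scripts.

  # Rough illustration of patch extraction (not the repository's image_to_256.py).
  from pathlib import Path
  from PIL import Image

  def to_patches(image_path, out_dir, size=256):
      # Cut an image into non-overlapping size x size tiles, white-padding
      # the right and bottom borders so every tile has the same shape.
      img = Image.open(image_path).convert("RGB")
      w, h = img.size
      Path(out_dir).mkdir(parents=True, exist_ok=True)
      for top in range(0, h, size):
          for left in range(0, w, size):
              tile = Image.new("RGB", (size, size), (255, 255, 255))
              box = (left, top, min(left + size, w), min(top + size, h))
              tile.paste(img.crop(box), (0, 0))
              tile.save(Path(out_dir) / f"{Path(image_path).stem}_{top}_{left}.png")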

Test

UNet & EfficientNetV2-S

  python eval_stage3_all_unet.py --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64

UNet++ & EfficientNetV2-S

  python eval_stage3_all.py --lambda_loss 25 --base_model_name tu-efficientnetv2_rw_s --batch_size 64

Related Works
