
Artist Style Transfer Via Quadratic Potential (2025 Modernized)

Rahul Bhalley and Jianlin Su

arXiv paper

Abstract

In this paper we address the problem of artist style transfer, where the painting style of a given artist is applied to a real-world photograph. We train our neural networks in an adversarial setting via the recently introduced quadratic potential divergence for a stable learning process. To further improve the quality of the generated artist-stylized images, we also integrate several recently introduced deep learning techniques into our method. To the best of our knowledge, this is the first attempt at artist style transfer via quadratic potential divergence. We provide some stylized image samples in the supplementary material. The source code for experimentation was written in PyTorch and is available online in this GitHub repository.

If you find our work or this repository helpful, please consider citing:

@article{bhalley2019artist,
  title={Artist Style Transfer Via Quadratic Potential},
  author={Bhalley, Rahul and Su, Jianlin},
  journal={arXiv preprint arXiv:1902.11108},
  year={2019}
}
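For intuition, the quadratic potential (GAN-QP) divergence pairs a real and a generated sample and penalizes the critic's score gap with a quadratic term scaled by the distance between the pair. The sketch below is a minimal PyTorch illustration of that idea, not the repository's exact implementation; the function names and the choice of a per-sample L1 distance are assumptions.

```python
import torch

def critic_loss_qp(d_real, d_fake, x_real, x_fake, lam=10.0):
    """GAN-QP critic objective (returned as a quantity to minimize):
    the critic wants to maximize the score gap minus a quadratic
    penalty scaled by the sample distance."""
    diff = d_real - d_fake                               # critic score gap, shape (B,)
    dist = (x_real - x_fake).abs().mean(dim=(1, 2, 3))   # per-sample L1 distance
    # Critic maximizes diff - diff^2 / (2 * lam * dist); negate for minimization.
    return (-diff + diff.pow(2) / (2.0 * lam * dist)).mean()

def generator_loss_qp(d_real, d_fake):
    """Generator minimizes the critic's score gap."""
    return (d_real - d_fake).mean()
```

Because the quadratic penalty grows with the score gap, the critic cannot push its outputs apart without bound, which is what gives the training its stability without an explicit gradient penalty or weight clipping.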

🚀 2025 Modernization Features

The codebase has been upgraded with the following state-of-the-art GAN training techniques:

  • 🤗 Accelerate Integration: Seamlessly scale training across CPU, GPU (single/multi), and TPU with built-in support for mixed precision (FP16/BF16).
  • Differentiable Augmentation (DiffAugment): Dramatically improves data efficiency and prevents discriminator overfitting on small artist datasets.
  • Self-Attention (SAGAN): Integrated into both Generator and Critic to capture long-range spatial dependencies for more coherent global structures.
  • PyTorch 2.5+ Optimizations:
    • AMP (via Accelerate): Automatic Mixed Precision for faster training.
    • torch.compile: JIT compilation for optimized execution kernels.
    • Channels Last: Optimized memory format for NVIDIA Tensor Cores.
  • Architectural Upgrades:
    • GELU Activations: Smoother gradient flow than standard ReLU.
    • Residual Scaling: Improved stability for deep transformer blocks.
    • Instance Normalization: Standardized for high-quality style transfer.
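As an illustration of how the architectural upgrades fit together, the sketch below shows a hypothetical residual block combining InstanceNorm, GELU, and residual scaling. It is not the exact block in networks.py, and the scaling factor of 0.2 is an assumed value:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Illustrative residual block: Conv -> InstanceNorm -> GELU,
    with the residual branch scaled down before the skip addition."""
    def __init__(self, channels, res_scale=0.2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.GELU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
        )
        self.res_scale = res_scale

    def forward(self, x):
        # Residual scaling: damp the branch's contribution for stability.
        return x + self.res_scale * self.body(x)
```

Scaling the residual branch keeps the block's output close to its input early in training, which stabilizes deep stacks of such blocks.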

Prerequisites

Install dependencies via:

pip install -r requirements.txt

Usage

  1. Clone & Setup:

    git clone https://github.com/rahulbhalley/cyclegan-qp.git
    cd cyclegan-qp
  2. Download Datasets:

    bash download_dataset.sh ukiyoe2photo
  3. Configure Accelerate (first time only):

    accelerate config
  4. Run:

    To train the network:

    accelerate launch train.py

    To perform inference (stylization):

    python infer.py

Configurations (config.py)

The project uses a structured Config dataclass for all hyperparameters.

| Category | Variable | Description |
| --- | --- | --- |
| Optimization | `MIXED_PRECISION` | Mixed precision mode (`"no"`, `"fp16"`, `"bf16"`) |
| Optimization | `USE_COMPILE` | Enable `torch.compile` for speed |
| Optimization | `MATMUL_PRECISION` | Set to `"high"` or `"medium"` for a Tensor Core boost |
| GAN Tricks | `USE_INSTANCE_NORM` | Use InstanceNorm instead of BatchNorm |
| GAN Tricks | `UPSAMPLE` | Use NN-upsample + conv to avoid checkerboard artifacts |
| Data | `BATCH_SIZE` | Batch size (default: 4) |
| Data | `LOAD_DIM` / `CROP_DIM` | Image resizing and cropping dimensions |
| Losses | `LAMBDA` | Quadratic potential penalty weight |
| Losses | `CYC_WEIGHT` | Cycle-consistency weight (default: 10.0) |
| Losses | `ID_WEIGHT` | Identity loss weight (default: 0.5) |
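A minimal sketch of what such a dataclass might look like. Field names are taken from the table above; any default not stated there (e.g. `MIXED_PRECISION`, `LOAD_DIM`, `CROP_DIM`, `LAMBDA`) is an assumed value, not necessarily the repository's actual setting:

```python
from dataclasses import dataclass

@dataclass
class Config:
    # Optimization
    MIXED_PRECISION: str = "bf16"   # "no", "fp16", or "bf16" (assumed default)
    USE_COMPILE: bool = True
    MATMUL_PRECISION: str = "high"  # or "medium"
    # GAN tricks
    USE_INSTANCE_NORM: bool = True
    UPSAMPLE: bool = True
    # Data
    BATCH_SIZE: int = 4
    LOAD_DIM: int = 286             # assumed, common CycleGAN preprocessing size
    CROP_DIM: int = 256             # assumed
    # Losses
    LAMBDA: float = 10.0            # assumed
    CYC_WEIGHT: float = 10.0
    ID_WEIGHT: float = 0.5

config = Config()
```

Keeping every hyperparameter in one frozen place like this makes runs reproducible and lets scripts such as train.py and infer.py share a single source of truth.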

🛠️ Code Acknowledgments

This refactored implementation incorporates high-quality modules from the open-source community:

  • DiffAugment Implementation: The differentiable augmentation logic in diff_augment.py is based on the official implementation by MIT HAN Lab: mit-han-lab/data-efficient-gans.
  • Self-Attention Implementation: The SelfAttention module in networks.py follows the architectural standards established in the PyTorch port by heykeetae: heykeetae/Self-Attention-GAN.
  • CycleGAN-QP Core: The foundational architecture and Quadratic Potential implementation are based on the original work by Rahul Bhalley: rahulbhalley/cyclegan-qp.

📚 References

This implementation integrates techniques from the following foundational papers:

  1. CycleGAN-QP (Ours): Bhalley, R., & Su, J. (2019). Artist Style Transfer Via Quadratic Potential. arXiv.
  2. DiffAugment: Zhao, S., et al. (2020). Differentiable Augmentation for Data-Efficient GAN Training. NeurIPS.
  3. Self-Attention GAN (SAGAN): Zhang, H., et al. (2019). Self-Attention Generative Adversarial Networks. ICML.
  4. GELU: Hendrycks, D., & Gimpel, K. (2016). Gaussian Error Linear Units (GELUs). arXiv.
  5. Deconvolution Checkerboard: Odena, A., et al. (2016). Deconvolution and Checkerboard Artifacts. Distill.

Results

Real Image to Stylized Image

Stylized Image to Real Image