This project extends the Diffusion Illusions codebase with three new methods for generating visual cognitive illusions using Stable Diffusion.
- Python: 3.8-3.10 (recommended 3.10)
- GPU: NVIDIA GPU with >= 12GB VRAM
- CUDA: CUDA-compatible PyTorch
- Clone the repository:

```bash
git clone https://github.com/RyannDaGreat/Diffusion-Illusions.git
cd Diffusion-Illusions
```

- Install dependencies:

```bash
pip install -r requirements.txt
pip install rp --upgrade
```

- Configure Hugging Face (for model access):

```bash
huggingface-cli login
# Or visit https://huggingface.co/CompVis/stable-diffusion-v1-4 and accept the license
```

What it does: Combines high-frequency details from one image with low-frequency structure from another, creating images that reveal different content at different viewing distances.
How to run:

```bash
python hybrid_camouflage.py
# Select mode 1 when prompted
# Enter prompts like: 'miku pyramids' or 'pencil_giraffe_head pencil_penguin'
```

Output: Saved to `outputs/hybrid_camouflage/`
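The frequency split behind hybrid images can be sketched with Gaussian filtering. This is a minimal standalone illustration: `hybrid_blend` and its signature are hypothetical, and the repository's actual pipeline optimizes the image through Stable Diffusion rather than blending two finished pictures.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_blend(img_a, img_b, sigma=5.0):
    """Combine the high frequencies of img_a with the low frequencies of img_b.

    Both images are float arrays of shape (H, W) with values in [0, 1].
    Up close, the fine detail of img_a dominates; from a distance, only the
    coarse structure of img_b survives.
    """
    low_b = gaussian_filter(img_b, sigma)           # coarse structure of B
    high_a = img_a - gaussian_filter(img_a, sigma)  # fine detail of A
    return np.clip(low_b + high_a, 0.0, 1.0)

rng = np.random.default_rng(0)
a = rng.random((256, 256))
b = rng.random((256, 256))
out = hybrid_blend(a, b)
print(out.shape)  # (256, 256)
```

Larger `sigma` pushes the crossover to lower frequencies, so the far-away image takes over at shorter viewing distances.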
What it does: Hides an element (e.g., a lion's head) in a background (e.g., a cliff) through frequency-domain manipulation, so the element blends naturally into the background.
How to run:

```bash
python hybrid_camouflage.py
# Select mode 2 when prompted
# Enter prompts like: 'volcano darth_vader' or 'pyramids gandalf'
```

Output: Saved to `outputs/hybrid_camouflage/`
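One ingredient of blending an element into a background is soft compositing, sketched below with a Gaussian-feathered mask. This is only an illustration of the blending idea: the function name and signature are hypothetical, and the actual method shapes the frequency content through diffusion guidance rather than simple alpha compositing.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def feathered_composite(background, element, mask, sigma=8.0):
    """Paste `element` into `background` through a Gaussian-feathered mask,
    so the seam fades out gradually instead of forming a hard edge."""
    soft = gaussian_filter(mask.astype(float), sigma)  # blur the binary mask
    soft = np.clip(soft, 0.0, 1.0)
    return soft * element + (1.0 - soft) * background

rng = np.random.default_rng(0)
background = rng.random((128, 128))
element = rng.random((128, 128))
mask = np.zeros((128, 128))
mask[32:96, 32:96] = 1.0  # element occupies the central region
out = feathered_composite(background, element, mask)
print(out.shape)  # (128, 128)
```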
What it does: Creates a kaleidoscope effect where the foreground stays upright while the background rotates at different angles (0°, 45°, 90°, 135°), generating four visually continuous overlay images.
Key difference from Rotation Overlays:
- Rotation Overlays: Rotates the top layer (foreground), only 90° increments
- Kaleidoscope: Rotates the bottom layer (background) and supports arbitrary angles (e.g., 45°)
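The rotate-and-composite step can be sketched as follows. All names here are hypothetical, and the multiplicative overlay with a brightness factor is an illustrative choice suggested by the notebook's `brightness` parameter; the notebook's actual compositing may differ.

```python
import numpy as np
from scipy.ndimage import rotate

def kaleidoscope_overlays(bottom, top, angles=(0, 45, 90, 135), brightness=3.0):
    """One composite per angle: the bottom layer rotates, the top layer stays
    upright, and the two are combined multiplicatively."""
    composites = []
    for angle in angles:
        # reshape=False keeps the canvas size fixed across all angles
        rotated = rotate(bottom, angle, reshape=False, mode='reflect')
        composites.append(np.clip(brightness * top * rotated, 0.0, 1.0))
    return composites

rng = np.random.default_rng(0)
bottom = rng.random((128, 128))
top = rng.random((128, 128))
outs = kaleidoscope_overlays(bottom, top)
print(len(outs), outs[0].shape)  # 4 (128, 128)
```

Because `scipy.ndimage.rotate` interpolates, it handles arbitrary angles such as 45°, which is what distinguishes this method from the 90°-increment rotation overlays.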
How to run:

```bash
# Option 1: Using Jupyter Notebook (recommended)
jupyter notebook kaleidoscope_extended_colab.ipynb

# Option 2: Run cells in sequence:
# 1. Install dependencies (cells 2-3)
# 2. Load Stable Diffusion (cell 6)
# 3. Configure prompts (cell 5)
# 4. Create images (cells 8-9)
# 5. Train (cell 13)
```

Default prompts: 'miku froggo lipstick pyramids'

Output: Displayed in the notebook; can be saved with the `save_run()` function
```
Diffusion-Illusions/
├── source/                              # Core source code
│   ├── stable_diffusion.py              # Stable Diffusion wrapper
│   ├── learnable_textures.py            # Learnable image representations (Fourier)
│   ├── stable_diffusion_labels.py       # Text label processing
│   ├── bilateral_blur.py                # Bilateral filtering
│   ├── clip.py                          # CLIP image-text similarity
│   ├── kaleidoscope_rotations.py        # Rotation utilities for kaleidoscope
│   └── example_prompts.yaml             # Example text prompts
│
├── hybrid_camouflage.py                 # Main script: Hybrid + Camouflage (modes 1 & 2)
├── hidden_characters.py                 # Hidden characters script
│
├── kaleidoscope_extended_colab.ipynb    # Kaleidoscope method (Jupyter)
├── rotation_overlays_for_colab.ipynb    # Original rotation overlays
├── hidden_characters_for_colab.ipynb    # Hidden characters notebook
│
├── outputs/                             # Generated images (auto-created)
│   ├── hybrid_camouflage/               # Hybrid & Camouflage outputs
│   └── hidden_characters/               # Hidden characters outputs
│
├── requirements.txt                     # Python dependencies
└── README_EN.md                         # This file
```
| File | Purpose |
|---|---|
| `hybrid_camouflage.py` | Main script for Hybrid Images (mode 1) and Camouflage Images (mode 2) |
| `kaleidoscope_extended_colab.ipynb` | Kaleidoscope illusions (background rotation) |
| `hidden_characters.py` | Hidden characters generation script |
| `source/kaleidoscope_rotations.py` | Rotation utilities for arbitrary angles |
| `source/stable_diffusion.py` | Core Stable Diffusion wrapper |
| `source/learnable_textures.py` | Fourier feature networks for image representation |
- Image Size: `SIZE = 256` (default, fast) or `512` (high quality, slower)
- Iterations: `NUM_ITER = 5000`-`10000` (more = better quality, longer runtime)
- Learning Rate: `lr = 1e-4` (default), `1e-3` (faster), `1e-5` (more stable)
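To make the trade-offs between these knobs concrete, here is a toy gradient-descent loop. It is purely illustrative: the real objective is a score-distillation loss computed through Stable Diffusion, not a pixel-space target, but the same pattern holds there too (more iterations drive the loss lower; a larger learning rate converges faster but less stably).

```python
import numpy as np

# Toy stand-in for the real optimisation: fit `image` to a fixed target.
SIZE, NUM_ITER, lr = 256, 500, 1e-1

target = np.full((SIZE, SIZE), 0.5)
image = np.random.default_rng(0).random((SIZE, SIZE))

for _ in range(NUM_ITER):
    grad = 2.0 * (image - target)  # gradient of ||image - target||^2
    image -= lr * grad             # plain gradient-descent step

# Each step shrinks the error by a factor of (1 - 2*lr) = 0.8,
# so after 500 iterations the residual is vanishingly small.
print(round(float(np.abs(image - target).max()), 4))  # 0.0
```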
Edit in `kaleidoscope_extended_colab.ipynb`:
- Rotation Angles: `[0°, 45°, 90°, 135°]` → change to `[0°, 30°, 60°, 90°]`, etc.
- Brightness: `brightness = 3` (range 2-5)
- Guidance Scale: `guidance_scale = 60` (range 50-100)
- Prompt Weights: `weights = [1, 1, 1, 1]` → adjust per-angle influence
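The per-angle weights plausibly act as coefficients on the four losses, one per rotation angle. The helper below is hypothetical (the notebook's actual variable names and loss combination may differ); it only shows the effect of raising one weight.

```python
import numpy as np

def combined_loss(per_angle_losses, weights=(1, 1, 1, 1)):
    """Weighted sum of the per-angle diffusion losses. Raising one weight
    makes that angle's prompt pull harder on the shared image."""
    w = np.asarray(weights, dtype=float)
    losses = np.asarray(per_angle_losses, dtype=float)
    return float(w @ losses)

print(round(combined_loss([0.9, 1.2, 1.0, 0.7]), 6))               # 3.8
print(round(combined_loss([0.9, 1.2, 1.0, 0.7], (2, 1, 1, 1)), 6)) # 4.7
```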
```
outputs/hybrid_camouflage/
├── hybrid_result.png            # Hybrid image (mode 1)
├── hybrid_high_freq.png         # High-frequency source
├── hybrid_low_freq.png          # Low-frequency source
├── camouflage_result.png        # Camouflage image (mode 2)
├── camouflage_background.png    # Background image
├── camouflage_element.png       # Element image
└── *_progress_*.png             # Training progress
```
- Images are displayed in the notebook during training
- Final result: 4 rotated overlay images + 2 base images (bottom, top)
- Use `save_run('name')` to save a timelapse to `untracked/kaleidoscope_runs/`
GPU not detected:

```python
import torch
print(torch.cuda.is_available())  # Should be True
```

Out of memory: Reduce `SIZE` to 256 or use a smaller `hidden_dim`

Model download issues: Ensure Hugging Face is properly configured (see Installation step 3)
- Original project: Diffusion Illusions
- Hybrid Images: Oliva et al., "Hybrid images." ACM TOG 2006
- Camouflage Images: Chu et al., "Camouflage images." ACM TOG 2010
- Base paper: "Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors"
For questions or issues, refer to:
- Chinese README: `README_CN.md` (more detailed documentation)
- Original project website: diffusionillusions.com