# FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering
🌐 Project Page | 📄 Paper | 🤗 Models
FrameDiffuser is an autoregressive neural rendering framework that generates temporally consistent, photorealistic frames by conditioning on G-buffer data and the model's own previous output. The dual-conditioning architecture combines a ControlNet for structural guidance with a ControlLoRA for temporal coherence.

## Installation

```bash
# Create conda environment
conda create -n framediffuser python=3.10 -y
conda activate framediffuser

# Install PyTorch with CUDA (required)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install remaining dependencies
pip install -r requirements.txt
```
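The autoregressive rollout described above can be sketched as follows. All function names here are illustrative stand-ins, not the project's actual API:

```python
import numpy as np

def render_sequence(gbuffers, denoise, encode_latent, first_frame):
    """Autoregressive rollout: each frame is conditioned on that timestep's
    G-buffer (ControlNet path) and on the model's own previous output,
    encoded into latent space (ControlLoRA path)."""
    frames = [first_frame]
    for g in gbuffers:
        prev_latent = encode_latent(frames[-1])  # temporal conditioning
        frames.append(denoise(g, prev_latent))   # structural + temporal guidance
    return frames[1:]

# Toy stand-ins so the sketch runs; the real model uses a VAE and a diffusion UNet.
H, W = 8, 8
gbufs = [np.random.rand(10, H, W) for _ in range(3)]
encode = lambda frame: frame.mean()        # stand-in for VAE encoding
denoise = lambda g, z: g[:3] * 0.0 + z     # stand-in for the denoising model
out = render_sequence(gbufs, denoise, encode, np.zeros((3, H, W)))
print(len(out), out[0].shape)  # 3 (3, 8, 8)
```

The key property the sketch captures is that frame t+1 depends on the generated frame t, not on ground truth, so errors can compound and temporal conditioning matters.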
## Architecture

- ControlNet: processes the 10-channel G-buffer input (BaseColor, Normals, Depth, Roughness, Metallic, Irradiance)
- ControlLoRA: conditions on the previous frame, encoded into VAE latent space, for temporal coherence
- Base model: Stable Diffusion 1.5
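A minimal sketch of how the 10 conditioning channels add up. The channel ordering and the one-channel Irradiance assumption here are illustrative, not the model's documented layout:

```python
import numpy as np

H, W = 4, 4
# Per-buffer channel counts summing to the 10-channel ControlNet input:
# BaseColor (3) + Normals (3) + Depth (1) + Roughness (1) + Metallic (1) + Irradiance (1)
buffers = {
    "BaseColor":  np.random.rand(3, H, W),
    "Normals":    np.random.rand(3, H, W),
    "Depth":      np.random.rand(1, H, W),
    "Roughness":  np.random.rand(1, H, W),
    "Metallic":   np.zeros((1, H, W)),   # optional pass: black channel when missing
    "Irradiance": np.random.rand(1, H, W),
}
gbuffer = np.concatenate(list(buffers.values()), axis=0)
print(gbuffer.shape)  # (10, 4, 4)
```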
## Pretrained Models

Pretrained weights are available on HuggingFace.
| Model | Scene Type | Notes |
|---|---|---|
| DowntownWest | Outdoor | Recommended for outdoor scenes |
| Hillside | Indoor | Recommended for indoor scenes |
| CityPark | Outdoor | |
| CitySample | Outdoor | |
| ElectricDreams | Outdoor | Rainforest environment |
| DerelictCorridor | Indoor | Small environment with dark lighting |
Each model directory contains:

- `controlnet/` - ControlNet weights (G-buffer encoder)
- `controllora.safetensors` - ControlLoRA weights (temporal conditioning)

For best results:

- Outdoor scenes: use `DowntownWest`
- Indoor scenes: use `Hillside`
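A small helper (hypothetical, not part of the repo) that verifies a downloaded model directory matches this layout before loading:

```python
import tempfile
from pathlib import Path

def check_model_dir(root):
    """Check a downloaded model directory for the expected weight files.
    Returns (ControlNet directory, ControlLoRA safetensors file)."""
    root = Path(root)
    controlnet = root / "controlnet"                 # G-buffer encoder weights
    controllora = root / "controllora.safetensors"   # temporal conditioning weights
    missing = [p.name for p in (controlnet, controllora) if not p.exists()]
    if missing:
        raise FileNotFoundError(f"model directory incomplete, missing: {missing}")
    return controlnet, controllora

# Demo with a temporary directory standing in for a downloaded model:
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "controlnet").mkdir()
    (Path(d) / "controllora.safetensors").touch()
    cn, cl = check_model_dir(d)
print(cn.name, cl.name)  # controlnet controllora.safetensors
```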
## Training

1. Place your data in `data/train/` and `data/validation/`
2. Edit `train_3_stages.bat` to set your prompt and paths
3. Run `train_3_stages.bat`

The provided batch file is an example configuration; for best performance, adjust the settings for your specific environment and dataset.
## Inference

```bash
python inference.py
```

To add new models or datasets, use the GUI to select paths and save configurations.
## Exporting G-Buffers from Unreal Engine

G-buffer data can be exported from Unreal Engine using the Movie Render Queue with custom Post Process Materials.
1. Enable the Movie Render Queue plugin via `Edit > Plugins > Movie Render Queue` (restart required)
2. Create a Post Process Material for each G-buffer channel (BaseColor, Normals, Depth, Roughness, Metallic) that outputs the corresponding Scene Texture to Emissive Color
3. In the Movie Render Queue, add your Level Sequence and open Settings
4. Under Rendering > Deferred Rendering, expand Deferred Renderer Data
5. In Additional Post Process Materials, add an array element for each G-buffer material:
   - Enable the element
   - Set Name to the buffer type (e.g., "BaseColor", "Depth")
   - Assign the corresponding Post Process Material
6. Add a .png Sequence output format under Exports
For more details, see the Cinematic Render Passes documentation.
## Dataset Structure

Place your G-buffer renders in the following structure:

```
data/
├── train/
│   ├── FinalImage/
│   │   ├── FinalImage_0000.png
│   │   ├── FinalImage_0001.png
│   │   └── ...
│   ├── BaseColor/
│   │   ├── BaseColor_0000.png
│   │   └── ...
│   ├── Normals/
│   ├── Depth/
│   ├── Roughness/
│   └── Metallic/ (optional)
└── validation/
    ├── FinalImage/
    ├── BaseColor/
    ├── Normals/
    ├── Depth/
    ├── Roughness/
    └── Metallic/ (optional)
```
Requirements:
- All buffers must have matching frame numbers
- Validation needs at least 2 frames (for previous frame conditioning)
- Supported formats: PNG, JPG
- Required: FinalImage, BaseColor, Normals, Depth, Roughness
- Optional: Metallic (a black channel is substituted if missing)
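The requirements above can be checked mechanically. This is a sketch of a hypothetical validation helper, not a script shipped with the repo:

```python
import re
import tempfile
from pathlib import Path

REQUIRED = ["FinalImage", "BaseColor", "Normals", "Depth", "Roughness"]

def frame_ids(buffer_dir):
    """Collect frame numbers from names like BaseColor_0001.png / .jpg."""
    ids = set()
    for f in Path(buffer_dir).iterdir():
        m = re.search(r"_(\d+)\.(png|jpe?g)$", f.name, re.IGNORECASE)
        if m:
            ids.add(int(m.group(1)))
    return ids

def validate_split(split_dir):
    """Check required buffers exist and frame numbers match across them.
    Metallic is optional (a black channel stands in when it is absent)."""
    split_dir = Path(split_dir)
    sets = {name: frame_ids(split_dir / name) for name in REQUIRED
            if (split_dir / name).is_dir()}
    missing = [n for n in REQUIRED if n not in sets]
    if missing:
        raise FileNotFoundError(f"missing required buffers: {missing}")
    ref = sets["FinalImage"]
    if len(ref) < 2:
        raise ValueError("need at least 2 frames for previous-frame conditioning")
    mismatched = [n for n, ids in sets.items() if ids != ref]
    if mismatched:
        raise ValueError(f"frame numbers do not match FinalImage: {mismatched}")
    return sorted(ref)

# Demo on a throwaway directory shaped like data/train/:
with tempfile.TemporaryDirectory() as d:
    for name in REQUIRED:
        sub = Path(d) / name
        sub.mkdir()
        for i in range(2):
            (sub / f"{name}_{i:04d}.png").touch()
    ids = validate_split(d)
print(ids)  # [0, 1]
```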
## Citation

If you find this work useful, please cite:

```bibtex
@article{beisswenger2025framediffuser,
  title={FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering},
  author={Beisswenger, Ole and Dihlmann, Jan-Niklas and Lensch, Hendrik},
  journal={arXiv preprint arXiv:2512.16670},
  year={2025}
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation).

This project builds upon:

- control-lora-v3 by Wu Hecong (MIT License)
- Stable Diffusion by CompVis
- diffusers by Hugging Face