This document indexes the generative models in this repository and their documentation.
Each model's documentation includes:
- Overview & Motivation - Why this model matters
- Theoretical Background - Core principles and theory
- Mathematical Formulation - Loss functions, equations
- High-Level Intuition - Conceptual understanding
- Implementation Details - Architecture and config
- Code Walkthrough - Key implementation sections
- Optimization Tricks - Training improvements
- Experiments & Results - Benchmarks and comparisons
- Common Pitfalls - Issues and solutions
- References - Original papers and resources
- Main README (README.md) - Overview of all generative models
- Diffusion Models (diffusion/README.md) - Comprehensive diffusion guide
- GANs (gans.md) - All GAN variants
- VAE (vae.md) - Variational autoencoders
- Audio/Video (audio_video/README.md) - Temporal generation
- Base Diffusion (base_diffusion.md) - DDPM foundation
- Conditional Diffusion - Conditioning mechanisms
- Stable Diffusion - Latent diffusion models
- UNet - Architecture for diffusion
- DiT - Diffusion transformers
- MMDiT - Multimodal transformers (SD3/FLUX)
- Consistency Models - Single-step generation
- LCM - Latent consistency models
- Flow Matching - Continuous normalizing flows
- Rectified Flow - Straight probability paths
- PixArt-alpha - Efficient high-res generation
- CogVideoX - Text-to-video generation
- VideoPoet - LLM for video
- VALL-E - Neural codec TTS
- Voicebox - Flow-based speech
- SoundStorm - Parallel audio generation
- MusicGen - Text-to-music
- NaturalSpeech 3 - Factorized diffusion TTS
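The Flow Matching and Rectified Flow entries above refer to models trained on straight probability paths between noise and data. As a minimal illustration (a generic NumPy sketch, not the repo's API; `rectified_flow_pair` is a hypothetical name), the rectified-flow training pair is a linear interpolation plus its constant target velocity:

```python
import numpy as np

def rectified_flow_pair(x0, x1, t):
    """Straight-line path from noise x0 to data x1 at time t in [0, 1]."""
    xt = (1.0 - t) * x0 + t * x1   # interpolated sample fed to the network
    v_target = x1 - x0             # constant velocity along the straight path
    return xt, v_target
```

The velocity network is regressed onto `v_target`; because the path is straight, few (even single) integration steps suffice at sampling time.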
All implementations are in `Nexus/nexus/models/`:

```
nexus/models/
├── diffusion/              # Diffusion model implementations
│   ├── base_diffusion.py
│   ├── conditional_diffusion.py
│   ├── stable_diffusion.py
│   ├── unet.py
│   ├── dit.py
│   ├── mmdit.py
│   ├── consistency_model.py
│   ├── flow_matching.py
│   ├── rectified_flow.py
│   └── pixart_alpha.py
├── video/                  # Video generation
│   ├── cogvideox.py
│   └── videopoet.py
├── audio/                  # Audio generation
│   ├── valle.py
│   ├── voicebox.py
│   ├── soundstorm.py
│   ├── musicgen.py
│   └── naturalspeech3.py
├── gan/                    # GAN models
│   ├── base_gan.py
│   ├── conditional_gan.py
│   ├── cycle_gan.py
│   └── wgan.py
└── cv/vae/                 # VAE models
    └── vae.py
```
- Start with Base Diffusion to understand DDPM
- Learn Conditional Diffusion for control
- Study Stable Diffusion for latent space
- Explore Fast Sampling methods
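The first step of this path, the DDPM forward process, can be sketched in a few lines of NumPy (a generic illustration, not the `base_diffusion.py` API; `make_alpha_bar` and `q_sample` are hypothetical names):

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention schedule alpha_bar_t for a linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0): scaled clean signal plus Gaussian noise."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps  # the denoiser is trained to predict eps from (xt, t)
```

Training then minimizes the mean-squared error between the network's noise prediction and `eps`, which is the simplified DDPM objective.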
- Read GANs documentation
- Start with base GAN implementation
- Try WGAN-GP for stable training
- Experiment with conditional variants
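For the WGAN-GP step above, the critic objective combines a Wasserstein distance estimate with a gradient penalty. The arithmetic, given critic scores and interpolate gradient norms, looks like this (a NumPy sketch of the loss terms only, not the `wgan.py` API; in practice the gradient norms come from autograd on real/fake interpolates):

```python
import numpy as np

def wgan_gp_critic_loss(d_real, d_fake, grad_norms, lam=10.0):
    """Critic objective: Wasserstein estimate plus gradient penalty."""
    wasserstein = d_fake.mean() - d_real.mean()   # critic drives this down
    gp = lam * ((grad_norms - 1.0) ** 2).mean()   # pushes gradients to unit norm
    return wasserstein + gp
```

The penalty (weight `lam`, typically 10) replaces weight clipping and is the main reason WGAN-GP trains more stably than the original GAN objective.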
- Study VAE documentation
- Understand beta-VAE trade-offs
- Try different architectures (MLP vs Conv)
- Experiment with disentanglement
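The beta-VAE trade-off mentioned above is a single scalar in the loss: reconstruction plus a beta-weighted KL divergence to a standard normal prior. A minimal NumPy sketch of the objective for a diagonal-Gaussian encoder (a generic illustration, not the `vae.py` API; `beta_vae_loss` is a hypothetical name):

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Per-batch reconstruction term plus beta-weighted KL(q(z|x) || N(0, I))."""
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    kl = 0.5 * np.mean(np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=1))
    return recon + beta * kl
```

With `beta=1` this is the standard VAE ELBO; `beta > 1` trades reconstruction fidelity for a more factorized (disentangled) latent space.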
- Review Audio/Video README
- Start with codec-based models (VALL-E)
- Try diffusion-based (NaturalSpeech 3)
- Explore video generation (CogVideoX)
For models without individual docs, refer to:
- Category READMEs for comprehensive overviews
- Implementation files for code details
- Main README for general concepts
Each implementation file includes:
- Detailed docstrings
- Architecture descriptions
- Key innovations explained
- Usage examples in comments
To add documentation for remaining models:
- Follow the 10-section structure listed above
- Include code examples from implementations
- Add mathematical formulations where relevant
- Reference original papers
- Provide practical optimization tips
- Document common pitfalls and solutions
See individual documentation files for complete reference lists.
Key resources:
- Lil'Log Blog - Excellent overviews
- Hugging Face Diffusion Course
- Papers with Code - Implementations and benchmarks
Last updated: 2026-02-06