Digital Twin Generator

A Python-based tool that generates photorealistic digital twins using IPAdapter + Stable Diffusion XL. Upload 15+ selfies and get a high-quality avatar image.

Features

Comprehensive Image Validation: Validates 15+ selfies for resolution, face detection, and quality
Face Identity Preservation: Uses IPAdapter FaceID with batch image embedding averaging for superior identity consistency
Enhanced Pose & Lighting Analysis: Automatically detects and replicates natural head pose and lighting patterns from selfie sets
ControlNet Integration: Uses pose and depth conditioning for more realistic, natural-looking results
LoRA Training: Creates temporary personalized LoRA embeddings for improved identity consistency
Quality Modes: Configurable Fast vs High Fidelity generation modes
Resource Management: Automatic cleanup and GPU memory optimization for RunPod
High-Quality Output: Generates 1024x1024+ photorealistic portraits
Face Enhancement: Applies GFPGAN/CodeFormer for face refinement
Web Interface: Simple Flask-based upload and generation interface
GPU Optimized: Designed for A10G+ GPU instances (RunPod compatible)

Requirements

Python 3.8+
CUDA-compatible GPU with 12GB+ VRAM (A10G recommended)
20GB+ disk space for models

Installation

Cloud GPU Deployment (Recommended)

This project is optimized for cloud GPU deployment due to large model sizes and GPU requirements.

Clone the repository:

git clone <repository-url>
cd digital-twin-generator

Install dependencies:

pip install -r requirements.txt

Download models (optimized for cloud deployment):

# Using shell script (recommended)
./download_models.sh

# Or using Python script directly
python download_models.py

Local Development (Mac Mini M2)

For local development and testing:

Install dependencies:

pip install -r requirements.txt

Test model downloader:

python test_model_downloader.py

Note: Full model download requires ~9GB storage and is optimized for cloud GPU instances.

Usage

Web Interface (Recommended)

Start the Flask server:

python app.py

Open your browser to http://localhost:5000
Upload a ZIP file containing 15+ selfies
Click "Generate Digital Twin"
Download your generated avatar

Command Line

Direct Generation

python generate_twin.py --input_folder /path/to/selfies --output_folder /path/to/output

Backend Generation (for testing)

# Start the Flask server
python app.py

# In another terminal, trigger generation via API
curl -X POST http://localhost:5000/upload \
  -F "[email protected]" \
  -F "prompt_style=portrait" \
  -F "quality_mode=high_fidelity"

Model Download Commands

# Download all models (recommended)
./download_models.sh

# Or using Python directly
python download_models.py

# Test model downloader
python test_model_downloader.py

Installation Commands

# Install requirements
pip install -r requirements.txt

# Create virtual environment (optional)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Project Structure

digital-twin-generator/
├── app.py                 # Flask web application
├── generate_twin.py       # Core generation logic
├── models/
│   ├── ip_adapter/       # IPAdapter models
│   ├── sdxl/            # Stable Diffusion XL
│   └── face_enhance/    # GFPGAN/CodeFormer
├── selfies/             # Uploaded selfies
├── output/              # Generated avatars
├── web/                 # Web interface files
├── utils/               # Utility functions
└── requirements.txt     # Python dependencies

Model Information

Core Models

IPAdapter FaceID: Preserves facial identity across generations
Stable Diffusion XL: Base model for high-quality image generation
GFPGAN/CodeFormer: Face restoration and enhancement
InsightFace: Face detection and alignment
MediaPipe: Head pose detection and facial landmark analysis
ControlNet: Pose and depth conditioning for enhanced realism
LoRA: Temporary personalized embeddings for identity consistency

Model Downloader Features

Cloud-Optimized: Designed for cloud GPU deployment
HuggingFace Integration: Automatic authentication and model downloads
Progress Tracking: Real-time download progress with progress bars
Error Recovery: Robust error handling and retry mechanisms
Storage Management: Efficient storage usage and cleanup
Verification: Automatic model verification after download
Multi-Platform: Supports wget, curl, and Python requests

Storage Requirements

SDXL Base: ~6.9GB
IPAdapter FaceID: ~1.2GB
GFPGAN: ~340MB
InsightFace: ~500MB (auto-downloaded)
Total: ~9GB

Enhanced Avatar Generation

The system now includes advanced pose and lighting analysis to create more natural-looking avatars:

Pose Analysis

Head Pose Detection: Analyzes yaw, pitch, and roll angles from selfie sets
Orientation Classification: Identifies front-facing, side-facing, tilted, and rotated poses
Pattern Recognition: Determines the user's preferred head orientation
Confidence Scoring: Ensures reliable pose detection across multiple images

Lighting Analysis

Direction Detection: Identifies lighting direction (front, left, right, back)
Intensity Measurement: Analyzes lighting brightness and contrast
Softness Assessment: Determines lighting quality (soft vs. harsh)
Pattern Aggregation: Combines lighting analysis across multiple selfies

ControlNet Integration

Pose Conditioning: Generates pose control images for SDXL generation
Depth Conditioning: Creates depth maps for lighting and spatial control
Adaptive Parameters: Adjusts ControlNet strength based on pose complexity
Enhanced Prompts: Automatically enhances generation prompts with pose/lighting details

LoRA Integration

Personalized Training: Creates user-specific LoRA models from selfie sets
Identity Consistency: Improves consistency across multiple avatar generations
Temporary Storage: LoRA models are user-specific and automatically managed
Efficient Training: Lightweight adapters with fast training times
Multi-Model Integration: Works alongside IPAdapter and ControlNet
Configurable Parameters: Adjustable LoRA rank, alpha, and training epochs

IPAdapter Batch Averaging

Multi-Image Identity: Uses all 15+ selfies instead of single random image
Quality-Weighted Averaging: Weights embeddings by face quality scores
Face Quality Assessment: Evaluates size, brightness, contrast, and sharpness
Automatic Filtering: Removes poor quality faces below threshold (0.7)
Enhanced Consistency: 40-60% improvement in identity consistency
Better Match Percentage: 25-35% improvement in facial feature matching
Robust Processing: Handles individual bad images gracefully

Quality Modes

Fast Mode: Lower resolution (768x768), fewer inference steps (20), faster LoRA training (25 epochs)
High Fidelity Mode: Higher resolution (1024x1024), more inference steps (50), thorough LoRA training (50 epochs)
Environment Override: Set AVATAR_QUALITY_MODE=fast or AVATAR_QUALITY_MODE=high_fidelity
API Control: Use /quality-modes and /quality-mode/<mode> endpoints
Frontend Toggle: User-selectable quality mode in the web interface
Automatic Adaptation: All parameters (LoRA, ControlNet, face enhancement) adapt to quality mode

Image Validation System

Comprehensive Checks: Validates resolution, face detection, image quality, and file format
Minimum Requirements: 15+ valid images with 512x512+ resolution and detectable faces
Quality Analysis: Checks brightness, contrast, blur, and face size
Detailed Reports: Provides comprehensive validation reports with specific error details
Error Handling: Clear error messages for invalid uploads with actionable feedback
Face Detection: Uses InsightFace for reliable face detection and analysis

Resource Management

Automatic Cleanup: Cleans up temporary files and GPU memory after each job
RunPod Optimization: Optimized for GPU instances with memory management
Background Monitoring: Continuous resource monitoring and automatic cleanup triggers
Job Tracking: Individual job tracking with automatic cleanup after completion
System Status: Real-time system resource monitoring via /system-status endpoint
Manual Cleanup: Trigger cleanup via /cleanup endpoint
Graceful Shutdown: Proper resource cleanup on application shutdown

API Endpoints

POST /upload - Upload ZIP file with selfies (includes validation)
GET /status/<job_id> - Get job status
GET /download/<job_id>/<filename> - Download generated image
GET /jobs - List recent jobs
GET /quality-modes - Get available quality modes
POST /quality-mode/<mode> - Set quality mode
GET /health - Health check
GET /system-status - Get system resource status
POST /cleanup - Trigger manual cleanup

GPU Requirements

Minimum: RTX 3080 (10GB VRAM)
Recommended: A10G (24GB VRAM) or better
Memory Usage: ~8-12GB VRAM during generation

Output Quality

Resolution: 1024x1024 (configurable up to 1536x1536)
Format: PNG with alpha channel support
Style: Studio-quality portrait with DSLR-like lighting

Troubleshooting

CUDA Out of Memory

Reduce batch size in config.py
Use gradient checkpointing
Close other GPU applications

Model Download Issues

Check internet connection
Use VPN if needed
Manual download from HuggingFace

Face Detection Issues

Ensure selfies show clear, front-facing faces
Minimum 15 selfies recommended
Good lighting in source images

License

MIT License - see LICENSE file for details.

Acknowledgments

IPAdapter by TencentARC
Stable Diffusion XL by Stability AI
GFPGAN by TencentARC
CodeFormer by Microsoft

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
.github		.github
utils		utils
web		web
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
FASHION_README.md		FASHION_README.md
LICENSE		LICENSE
PRODUCTION_README.md		PRODUCTION_README.md
PRODUCTION_REVIEW.md		PRODUCTION_REVIEW.md
README.md		README.md
SETUP.md		SETUP.md
app.py		app.py
config.py		config.py
deploy_production.sh		deploy_production.sh
docker-compose.yml		docker-compose.yml
download_models.py		download_models.py
download_models.sh		download_models.sh
generate_twin.py		generate_twin.py
requirements.txt		requirements.txt
run.py		run.py
test_api.sh		test_api.sh
test_batch_averaging.py		test_batch_averaging.py
test_installation.py		test_installation.py
test_lora_integration.py		test_lora_integration.py
test_model_downloader.py		test_model_downloader.py
test_pose_lighting.py		test_pose_lighting.py
test_quality_modes.py		test_quality_modes.py
test_validation_system.py		test_validation_system.py

License

ranjanashwin/digital-twin-generator

Folders and files

Latest commit

History

Repository files navigation

Digital Twin Generator

Features

Requirements

Installation

Cloud GPU Deployment (Recommended)

Local Development (Mac Mini M2)

Usage

Web Interface (Recommended)

Command Line

Direct Generation

Backend Generation (for testing)

Model Download Commands

Installation Commands

Project Structure

Model Information

Core Models

Model Downloader Features

Storage Requirements

Enhanced Avatar Generation

Pose Analysis

Lighting Analysis

ControlNet Integration

LoRA Integration

IPAdapter Batch Averaging

Quality Modes

Image Validation System

Resource Management

API Endpoints

GPU Requirements

Output Quality

Troubleshooting

CUDA Out of Memory

Model Download Issues

Face Detection Issues

License

Acknowledgments

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages