An End-to-End Character & Virtual Try-On Pipeline
Transform a handful of images into a consistent AI character, generate stunning photos, and virtually try on different clothing items, all through a unified, intuitive interface.
AI-Avtaar is a complete pipeline that seamlessly integrates three powerful AI engines into a single Streamlit application. The system enables you to:
- Train a custom character model (LoRA) from your photos
- Generate new images featuring your character
- Apply virtual clothing try-ons with photorealistic results
The pipeline orchestrates three specialized backend engines:
- LoRA Training Engine: Automated model training using Kohya_ss
- Image Generation Engine: AI image creation via Automatic1111 (A1111)
- Virtual Try-On Engine: Realistic clothing application with CatVTON
Each component runs in its own isolated virtual environment, ensuring clean dependency management and stability.
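As a rough illustration of how one process can drive code that lives in a different virtual environment, the sketch below calls a script with that environment's own interpreter. This is an assumption about the general technique, not a copy of what training.py does; the helper names and the script name `train_script.py` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def build_command(venv_dir: Path, script: Path, *args: str) -> list[str]:
    """Build an argv list that runs `script` with the venv's own interpreter."""
    return [str(venv_python(venv_dir)), str(script), *args]


def run_in_venv(venv_dir: Path, script: Path, *args: str) -> int:
    """Run the script in a child process and return its exit code."""
    completed = subprocess.run(build_command(venv_dir, script, *args), check=False)
    return completed.returncode


# Hypothetical example: the script name is a placeholder, not a file from this repo.
command = build_command(
    Path("KOHYA_SS/Kohya-venv"), Path("train_script.py"), "--config", "SDXL_Preset.json"
)
print(command)
```

Because the child process uses the other environment's interpreter, its packages never need to be installed next to Streamlit, which is the point of keeping the environments separate.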
[Screenshots: LoRA Training Interface | Image Generation Interface | Virtual Try-On Interface | Results Dashboard]
[Sample try-on results: Casual Wear | Formal Attire | Outerwear Style]
| Feature | Description |
|---|---|
| Unified Interface | Single Streamlit app controls the entire pipeline |
| Zero-Config Training | No manual folder setup or parameter tuning required |
| Live Training Logs | Real-time monitoring with persistent log viewing |
| Seamless Workflow | Train → Generate → Try-On in one continuous flow |
| Multi-Environment | Isolated venvs prevent dependency conflicts |
AI-avtaar/
│
├── 📁 datasets/                      # Trained LoRA models & prepared datasets
│
├── 📁 image-gen/                     # Automatic1111 installation
│   ├── a1111-venv/                   # A1111 virtual environment
│   └── stable-diffusion-webui/       # A1111 repository
│
├── 📁 KOHYA_SS/                      # Kohya_ss training engine
│   ├── Kohya-venv/                   # Kohya virtual environment
│   └── kohya_ss/                     # Kohya repository
│       └── models/                   # ⚠️ Place base SDXL models here
│
├── 📁 LoRA-pipeline/                 # Main Streamlit application
│   ├── pipeline-venv/                # Streamlit virtual environment
│   ├── pages/                        # Application pages
│   │   ├── 1_Train_LoRA.py           # Training interface
│   │   ├── 2_Image_Generation.py     # Generation interface
│   │   └── 3_Virtual_Try-On.py       # Try-on interface
│   ├── app.py                        # Homepage entry point
│   ├── captioning.py                 # Image captioning utilities
│   ├── dataset_preparation.py        # Dataset preprocessing
│   ├── training.py                   # Training launcher
│   └── SDXL_Preset.json              # Training configuration
│
├── 📁 Regularization_images/         # Regularization source images
│
└── 📁 Vton/                          # Virtual Try-On installation
    ├── vton-venv/                    # VTON virtual environment
    └── Virtual-TryOn/
        └── vto-backend/              # FastAPI backend service
- Orchestration: Streamlit, Python 3.10+
- Training: Kohya_ss, Accelerate, Diffusers, PyTorch
- Generation: Stable Diffusion WebUI (A1111)
- Try-On: CatVTON, FastAPI
- GPU: NVIDIA GPU with 16GB+ VRAM (recommended)
- Software: CUDA Toolkit, Python 3.10+, Git
- OS: Linux (recommended) or Windows with WSL
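A quick way to sanity-check these requirements before installing anything is sketched below. It only looks at the Python version and at whether `git` and `nvidia-smi` are on the PATH; finding `nvidia-smi` suggests an NVIDIA driver is installed but does not confirm the CUDA Toolkit version or the amount of VRAM. The function name is illustrative.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("git", "nvidia-smi")):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    label = f"python>={min_python[0]}.{min_python[1]}"
    report = {label: sys.version_info >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH.
        report[tool] = shutil.which(tool) is not None
    return report


for name, ok in check_prerequisites().items():
    print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```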
📦 Note: This repository does not include sample datasets or the complete Vton folder due to size constraints. If you need these resources for testing or development, please contact me via email: itsnits333@gmail.com
git clone <your-repo-url>
cd AI-avtaar

The pipeline requires four separate virtual environments. Follow these steps carefully:
cd LoRA-pipeline
python3 -m venv pipeline-venv
source pipeline-venv/bin/activate # On Windows: pipeline-venv\Scripts\activate
pip install -r requirements.txt
deactivate
cd ..

cd KOHYA_SS
python3 -m venv Kohya-venv
source Kohya-venv/bin/activate # On Windows: Kohya-venv\Scripts\activate
cd kohya_ss
./setup.sh # Follow Kohya_ss installation prompts
deactivate
cd ../..

cd image-gen
python3 -m venv a1111-venv
source a1111-venv/bin/activate # On Windows: a1111-venv\Scripts\activate
cd stable-diffusion-webui
pip install -r requirements.txt
deactivate
cd ../..

cd Vton
python3 -m venv vton-venv
source vton-venv/bin/activate # On Windows: vton-venv\Scripts\activate
cd Virtual-TryOn
pip install -r requirements.txt
deactivate
cd ../..
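Once the four environments are created, a small check like the following can confirm that none was skipped. It relies on the fact that `python -m venv` writes a `pyvenv.cfg` file at the root of every environment it creates; the paths come from the setup commands above, and the function name is illustrative. Run it from the AI-avtaar directory.

```python
from pathlib import Path

# Venv locations relative to the repository root, taken from the setup steps.
VENVS = [
    "LoRA-pipeline/pipeline-venv",
    "KOHYA_SS/Kohya-venv",
    "image-gen/a1111-venv",
    "Vton/vton-venv",
]


def missing_venvs(repo_root: Path) -> list[str]:
    """Return the venv paths that lack the pyvenv.cfg marker file."""
    return [v for v in VENVS if not (repo_root / v / "pyvenv.cfg").is_file()]


missing = missing_venvs(Path.cwd())
if missing:
    print("Missing virtual environments:", ", ".join(missing))
else:
    print("All four virtual environments are present.")
```

This only verifies that the environments exist, not that their dependencies installed cleanly.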
⚠️ CRITICAL STEP: The pipeline will not function without proper model placement.
The SDXL base model must be placed in two locations:
- Place your base model (e.g., CyberRealisticXLPlay_V6.0.safetensors) in: AI-avtaar/KOHYA_SS/kohya_ss/models/
- IMPORTANT: Create a converted _diffusers version in the same directory. This conversion is required by training.py; follow the Kohya_ss documentation for the conversion.
- Place the same .safetensors model in: AI-avtaar/image-gen/stable-diffusion-webui/models/Stable-diffusion/
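The placement rules can be checked with a short script such as the sketch below. The two checkpoint directories are the ones named above. The name of the converted copy is an assumption: the sketch looks for a folder called `<model name>_diffusers`, so adjust it to whatever training.py actually expects.

```python
from pathlib import Path


def check_model_placement(repo_root: Path, model_file: str) -> dict[str, bool]:
    """Report whether the base model is present in each required location."""
    kohya_models = repo_root / "KOHYA_SS" / "kohya_ss" / "models"
    a1111_models = (
        repo_root / "image-gen" / "stable-diffusion-webui" / "models" / "Stable-diffusion"
    )
    # ASSUMPTION: the converted copy is a folder named "<model stem>_diffusers".
    diffusers_dir = kohya_models / f"{Path(model_file).stem}_diffusers"
    return {
        "kohya_checkpoint": (kohya_models / model_file).is_file(),
        "kohya_diffusers": diffusers_dir.is_dir(),
        "a1111_checkpoint": (a1111_models / model_file).is_file(),
    }


report = check_model_placement(Path.cwd(), "CyberRealisticXLPlay_V6.0.safetensors")
for location, present in report.items():
    print(f"{location}: {'found' if present else 'MISSING'}")
```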
Three services must be running at the same time, each in its own terminal: the A1111 API, the try-on backend, and the Streamlit application. Start them in the order shown.
cd AI-avtaar/image-gen/
source a1111-venv/bin/activate # On Windows: a1111-venv\Scripts\activate
cd stable-diffusion-webui/
python launch.py --listen --api

Status: Server should start on http://localhost:7860
Action: Keep this terminal running
cd AI-avtaar/Vton/
source vton-venv/bin/activate # On Windows: vton-venv\Scripts\activate
cd Virtual-TryOn/vto-backend/
uvicorn backend_main:app --host 0.0.0.0 --reload

Status: API should start on http://localhost:8000
Action: Keep this terminal running
cd AI-avtaar/LoRA-pipeline/
source pipeline-venv/bin/activate # On Windows: pipeline-venv\Scripts\activate
streamlit run app.py

Status: Browser will automatically open to http://localhost:8501
Action: The application is now ready to use!
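If something does not respond, a plain TCP probe of the three ports listed above shows which service is down. This is a minimal sketch that assumes the default ports from the status lines (7860, 8000, 8501) and only tests that something is listening, not that it is the expected service.

```python
import socket

# Default host/port pairs taken from the startup steps.
SERVICES = {
    "A1111 API": ("localhost", 7860),
    "Try-on backend": ("localhost", 8000),
    "Streamlit app": ("localhost", 8501),
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for name, (host, port) in SERVICES.items():
    state = "up" if is_listening(host, port) else "DOWN"
    print(f"{name:15} {host}:{port}  {state}")
```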
- Navigate to "1_Train_LoRA" page
- Enter a unique character name (e.g., my_character)
- Upload 5-20 high-quality images of the person
- Click "Start Full Training Pipeline"
- Monitor live logs as the system:
  - Generates captions for your images
  - Prepares the training dataset
  - Trains the LoRA model
Training Time: 30-60 minutes (depending on GPU)
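For readers curious about the dataset step, the sketch below shows the folder layout that Kohya_ss training scripts conventionally read: an image folder named `<repeats>_<name> <class>`, with one `.txt` caption next to each image. It is not the project's dataset_preparation.py; the function name, the repeat count of 10, the class word "person", and the placeholder caption text are all assumptions for illustration.

```python
import shutil
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def prepare_dataset(
    src: Path, dest_root: Path, name: str, repeats: int = 10, class_word: str = "person"
) -> Path:
    """Copy images into a Kohya-style concept folder and write a caption per image."""
    concept_dir = dest_root / name / "img" / f"{repeats}_{name} {class_word}"
    concept_dir.mkdir(parents=True, exist_ok=True)

    images = sorted(p for p in src.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    for index, image in enumerate(images, start=1):
        target = concept_dir / f"{name}_{index:03d}{image.suffix.lower()}"
        shutil.copy2(image, target)
        # Placeholder caption; a real pipeline would use generated captions here.
        caption = f"{name}, a photo of a {class_word}\n"
        target.with_suffix(".txt").write_text(caption, encoding="utf-8")
    return concept_dir
```

Calling `prepare_dataset(Path("uploads"), Path("datasets"), "my_character")` would produce `datasets/my_character/img/10_my_character person/` containing the renamed images and their captions.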
- Navigate to "2_Image_Generation" page
- Click "Refresh" to load your newly trained LoRA
- Select your LoRA from the dropdown
- Enter the trigger word (e.g., my_character)
- Write a creative prompt describing the desired scene
- Click "Generate Image"
Example Prompt:
my_character wearing a blue shirt, professional photo shoot,
studio lighting, high quality, detailed face
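The same prompt can also be sent straight to the A1111 API that was started with `--api`. The sketch below is one possible way to do it and is not taken from the project's generation page: it uses A1111's `/sdapi/v1/txt2img` endpoint and the `<lora:name:weight>` prompt syntax, while the function names, the LoRA weight of 0.8, and the sampler settings are illustrative assumptions.

```python
import base64
import json
import urllib.request


def build_txt2img_payload(trigger: str, scene: str, lora_name: str, weight: float = 0.8) -> dict:
    """Build a txt2img request body that activates a LoRA through the prompt."""
    prompt = f"{trigger} {scene}, <lora:{lora_name}:{weight}>"
    return {
        "prompt": prompt,
        "negative_prompt": "blurry, low quality",
        "steps": 30,
        "width": 1024,
        "height": 1024,
        "cfg_scale": 7,
    }


def generate(payload: dict, base_url: str = "http://localhost:7860") -> bytes:
    """POST the payload to A1111 and return the first image as raw bytes."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        result = json.load(response)
    # The API returns images as base64-encoded strings.
    return base64.b64decode(result["images"][0])


payload = build_txt2img_payload(
    trigger="my_character",
    scene="wearing a blue shirt, professional photo shoot, "
    "studio lighting, high quality, detailed face",
    lora_name="my_character",
)
print(payload["prompt"])
```

With the A1111 server running, `open("out.png", "wb").write(generate(payload))` would save the result to disk.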
- Navigate to "3_Virtual_Try-On" page
- Upload your generated image as the Model Image
- Upload a clothing item (t-shirt, jacket, dress) as Clothing Image
- Select the appropriate clothing category
- Click "Generate Virtual Try-On"
Result: Photorealistic clothing application on your character
| Issue | Solution |
|---|---|
| "CUDA out of memory" | Reduce batch size in SDXL_Preset.json or use a GPU with more VRAM |
| "Model not found" | Verify base model is placed in both required locations |
| "API connection refused" | Ensure all three backend services are running |
| "Import errors" | Verify you're using the correct virtual environment for each component |
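For the out-of-memory case, the batch size can be lowered by hand or with a few lines of Python. The sketch below halves the value in a preset file; it assumes the setting is stored under the key `train_batch_size`, which is the name used in Kohya_ss GUI presets, so confirm the key in your own SDXL_Preset.json before relying on it.

```python
import json
from pathlib import Path


def reduce_batch_size(preset_path: Path, key: str = "train_batch_size") -> int:
    """Halve the batch size stored in a JSON preset (minimum 1) and return the new value."""
    preset = json.loads(preset_path.read_text(encoding="utf-8"))
    # ASSUMPTION: a missing key is treated as a batch size of 1.
    current = int(preset.get(key, 1))
    preset[key] = max(1, current // 2)
    preset_path.write_text(json.dumps(preset, indent=2), encoding="utf-8")
    return preset[key]
```

For example, `reduce_batch_size(Path("LoRA-pipeline/SDXL_Preset.json"))` turns a batch size of 4 into 2. Keep a backup of the preset first, since the file is rewritten in place.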
If you encounter issues:
- Check that all three backend services are running
- Verify all virtual environments are properly activated
- Ensure base models are correctly placed
- Review terminal logs for error messages
This project is licensed under the MIT License. See the LICENSE file for full details.
This pipeline integrates several open-source projects:
- Kohya_ss - LoRA training
- Automatic1111 - Image generation
- CatVTON - Virtual try-on
- Streamlit - Web interface
Built with ❤️ for AI creators and developers
⭐ Star this repo if you find it useful!






