A professional PyQt6-based application for batch image interrogation and tagging using CLIP and Waifu Diffusion models with SQLite caching for efficiency.
- Multiple Model Support: Use CLIP or Waifu Diffusion (WD) Tagger models
- 12 WD Tagger Models: V1.4 (stable) and V3 (latest) variants including EVA-02, ViT, ConvNeXt, and Swin transformers
- 5 CLIP Models: Multiple CLIP architectures with optional caption models
- Intelligent Caching: SQLite database stores previous interrogations to avoid reprocessing
- Batch Processing: Interrogate entire directories efficiently with optional recursive subdirectory search
- Tag Management: Edit, save, and organize tags with an intuitive UI
- Advanced Image Inspection: Detailed dialog with multi-model comparison, WD sensitivity ratings, and tag filtering visualization
- Checkbox Tag Selector: Visual tag editor showing all model-generated tags with checkboxes for easy selection
- Tag Filtering System: Remove, replace, or force-include tags with customizable rules
- Smart File Organization: Organize images by tags with recursive search and directory selection to prevent re-organizing
- Confidence Scores: WD Tagger provides confidence scores for each tag
- Gallery View: Thumbnail-based image browser with visual indicators for tagged images
- GPU Acceleration: CUDA (NVIDIA) and ROCm (AMD) support for 10-50x faster processing
image_interrogator/
├── core/ # Core business logic
│ ├── database.py # SQLite database management
│ ├── hashing.py # Image hashing utilities
│ ├── file_manager.py # File I/O operations
│ └── base_interrogator.py # Abstract interrogator class
├── interrogators/ # Model implementations
│ ├── clip_interrogator.py # CLIP model
│ └── wd_interrogator.py # Waifu Diffusion Tagger
├── ui/ # PyQt6 UI components
│ ├── main_window.py # Main application window
│ ├── widgets.py # Custom widgets
│ ├── dialogs.py # Configuration dialogs
│ └── workers.py # Background worker threads
├── main.py # Application entry point
└── requirements.txt # Dependencies
- Python 3.10+
Optional:
- NVIDIA GPU: CUDA-capable GPU with drivers installed (recommended for optimal performance)
- AMD GPU (Linux only): ROCm-capable GPU with ROCm installed (see ROCm Support section)
The easiest way to set up the project is using the automated setup script:
Windows:
```
setup.bat
```

Linux/Mac:

```
chmod +x setup.sh
./setup.sh
```

The setup script will:
- Check Python installation and version
- Create a virtual environment
- Detect your NVIDIA GPU and CUDA version automatically
- Install PyTorch with the correct CUDA support for your system
- Install all other dependencies
- Verify the installation
If you have an NVIDIA GPU, the script will automatically detect your CUDA version and ask for permission to install PyTorch with GPU support. Just answer 'y' when prompted!
If you prefer manual setup or the automated script fails:
Step 1: Create Virtual Environment

```
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Step 2: Install PyTorch

Choose the installation that matches your GPU:
Option A: NVIDIA GPU (CUDA)

```
# Recommended: CUDA 12.6 (RTX 30/40 series, GTX 16 series)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126

# Or CUDA 13.0 (latest, RTX 40 series)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu130

# Or CUDA 12.8 (newer GPUs)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128

# Or CUDA 11.8 (older GPUs: GTX 10 series, RTX 20 series)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```

Option B: AMD GPU (ROCm) - Linux only

```
# ROCm 6.0 (RX 7000 series)
pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.0

# ROCm 5.7 (RX 6000 series)
pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.7
```

Option C: CPU Only (no GPU)

```
pip install torch torchvision
```

Step 3: Install ONNX Runtime

For NVIDIA GPU:

```
pip install "onnxruntime-gpu>=1.16.0"
```

For AMD GPU or CPU:

```
pip install "onnxruntime>=1.16.0"
```

Step 4: Install Other Dependencies

```
pip install -r requirements.txt
```

Important Notes:
- PyTorch bundles CUDA runtime libraries - no separate CUDA Toolkit installation needed
- ONNX Runtime GPU package is CUDA-only (AMD users get CPU version)
- Check that your NVIDIA driver supports the CUDA version you're installing
Quick Launch (Recommended):
The run scripts will perform a quick health check and alert you if GPU acceleration is not enabled:
Windows:
```
run.bat
```

Linux/Mac:

```
chmod +x run.sh
./run.sh
```

If the script detects you have an NVIDIA GPU but PyTorch is running in CPU mode, it will warn you and suggest running the setup script to fix it.
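For reference, a minimal sketch of the kind of health check the run scripts perform (illustrative only; the actual check lives in the run scripts themselves):

```python
# Hypothetical sketch of a GPU health check like the one the run scripts do.
import torch
import onnxruntime as ort

def gpu_health_check() -> None:
    if torch.cuda.is_available():
        print(f"PyTorch GPU: {torch.cuda.get_device_name(0)}")
    else:
        print("WARNING: PyTorch is in CPU mode; consider re-running the setup script.")
    providers = ort.get_available_providers()
    if "CUDAExecutionProvider" in providers:
        print("ONNX Runtime: CUDA enabled")
    else:
        print("ONNX Runtime: CPU only")

gpu_health_check()
```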
Direct Launch:
```
python main.py
```

- Select Directory: Click "Select Directory" to choose a folder containing images
- Optional: Check "Include subdirectories (recursive)" to process images in all subdirectories
- Configure Model:
- Choose model type (WD Tagger or CLIP)
- Click "Configure Model" to set parameters
- Click "Load Model" to load into memory
- Interrogate Images:
- Batch: Click "Start Batch" to process all images
- Single: Select an image and click "Interrogate Selected"
- Recursive mode will process images from all selected subdirectories
- Review Results: View tags in the results table with confidence scores (WD only)
- Edit Tags: Use the checkbox tag selector to enable/disable tags from all models
- Advanced Inspection: Double-click any image (Gallery or Queue) for detailed analysis
- Organize: Use "Organize by Tags" to move images into subdirectories
- Mode:
- `best`: Highest quality, slowest (recommended for final tagging)
- `fast`: Quick processing, good quality (recommended for testing)
- `classic`: Traditional CLIP approach
- `negative`: Generate negative prompts for Stable Diffusion
- Device:
`cuda` (GPU) or `cpu`
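The four modes map onto the underlying clip-interrogator library roughly as follows (a sketch assuming direct use of that library; the application wraps this in its own CLIPInterrogator class):

```python
# Sketch of the four CLIP modes via the clip-interrogator library
# (assumes `pip install clip-interrogator`; the app's wrapper may differ).
from PIL import Image
from clip_interrogator import Config, Interrogator

ci = Interrogator(Config(clip_model_name="ViT-L-14/openai", device="cuda"))
image = Image.open("path/to/image.jpg").convert("RGB")

prompt = ci.interrogate(image)            # best: highest quality, slowest
# prompt = ci.interrogate_fast(image)     # fast: quick, good quality
# prompt = ci.interrogate_classic(image)  # classic: traditional CLIP approach
# prompt = ci.interrogate_negative(image) # negative: Stable Diffusion negatives
print(prompt)
```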
- Model Selection: Choose from 12 available models
- V1.4 models: moat, vit, convnext, swinv2 variants (stable)
- V3 models: vit, vit-large, convnext, swinv2, eva02-large (latest, recommended)
- See WD_MODELS.md for detailed model comparison
- Confidence Threshold: 0.0 - 1.0
- Lower (0.2-0.3): More tags, less precise
- Medium (0.35-0.5): Balanced (recommended: 0.35)
- Higher (0.5-0.8): Fewer tags, more confident
- Device:
`cuda` (GPU) or `cpu`
Recommended Models:
- Best Quality: `SmilingWolf/wd-eva02-large-tagger-v3`
- Balanced: `SmilingWolf/wd-vit-tagger-v3` or `SmilingWolf/wd-v1-4-moat-tagger-v2`
- Fastest: `SmilingWolf/wd-convnext-tagger-v3`
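To illustrate how the confidence threshold described above works, here is a minimal sketch (the tag:score pairs and threshold value are made up for illustration):

```python
# Illustrative sketch: keep only tags whose confidence meets the threshold.
scores = {"1girl": 0.98, "outdoors": 0.62, "smile": 0.41, "umbrella": 0.18}
threshold = 0.35  # the recommended default

tags = [tag for tag, score in scores.items() if score >= threshold]
print(tags)  # ['1girl', 'outdoors', 'smile'] -- 'umbrella' falls below 0.35
```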
The application uses SQLite to cache interrogation results:
- Images are hashed using SHA256
- Previous interrogations are retrieved from cache when available
- Different models maintain separate cached results
- Database file: `interrogations.db` in the application directory
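Conceptually, a cache lookup hashes the file and checks for a prior result from the same model. The sketch below uses hypothetical helper names and a schema inferred from the field lists later in this document; the actual implementation lives in core/database.py:

```python
# Conceptual sketch of the SHA256-keyed cache lookup (hypothetical helpers).
import hashlib
import sqlite3

def sha256_of_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def cached_tags(db: sqlite3.Connection, path: str, model: str):
    """Return cached tags for (image, model), or None on a cache miss."""
    row = db.execute(
        "SELECT i.tags FROM interrogations i "
        "JOIN images im ON im.id = i.image_id "
        "JOIN models m ON m.id = i.model_id "
        "WHERE im.file_hash = ? AND m.model_name = ?",
        (sha256_of_file(path), model),
    ).fetchone()
    return row[0] if row else None
```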
Organize images into subdirectories based on tags with smart recursive search:
- Click "Organize by Tags"
- Review the warning: Operation will MOVE (not copy) files
- Enter tags to match (comma-separated)
- Specify target subdirectory name
- Choose match mode:
- any: Move if image has at least one matching tag
- all: Move only if image has all specified tags
- Optional - Recursive Search:
- Check "Include subdirectories (recursive)" to search all subdirectories
- Select which subdirectories to include as sources (prevents re-organizing already-organized files)
- Example: Uncheck "organized" folder to avoid moving previously organized images
- Optionally move .txt files with images
- Click "Move Images" and confirm the operation
Important Safety Features:
- ⚠️ Bold warning at the top of the dialog indicates this is a MOVE operation
- Confirmation dialog shows all operation details before proceeding
- Default answer is "No" for safety
- Directory selection prevents accidentally reorganizing already-organized files
Example: Tags "landscape, sunset" with mode "any" will move all images tagged with either "landscape" OR "sunset" from selected source directories to the specified subdirectory.
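The any/all matching boils down to set operations, as in this sketch (tag sets are illustrative):

```python
# Sketch of the any/all match modes used by "Organize by Tags".
def matches(image_tags: set[str], wanted: set[str], mode: str) -> bool:
    if mode == "any":
        return bool(image_tags & wanted)  # at least one tag in common
    if mode == "all":
        return wanted <= image_tags       # every wanted tag present
    raise ValueError(f"unknown mode: {mode}")

# "landscape, sunset" with mode "any" vs "all":
print(matches({"sunset", "beach"}, {"landscape", "sunset"}, "any"))  # True
print(matches({"sunset", "beach"}, {"landscape", "sunset"}, "all"))  # False
```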
Access detailed image analysis by:
- Double-clicking any image thumbnail in the Gallery tab
- Right-clicking an image and selecting "Advanced Inspection..."
- Double-clicking an item in the Interrogation queue
Features:
Model Results Tab:
- Switch between different interrogation results (CLIP/WD models)
- WD Sensitivity Ratings: Visual display of content ratings
- General, Sensitive, Questionable, Explicit (with confidence bars)
- Complete tag list with confidence scores
Database vs File Comparison Tab:
- Visual comparison of database tags vs .txt file tags
- Color-coded status indicators:
- 🟢 Green: Tag in both database and file
- 🟡 Yellow: Tag in database only (would be written with current filters)
- 🔴 Red: Tag filtered out by tag removal rules
- 🟠 Orange: Tag replaced by substitution rules
- 🔵 Blue: Tag in file only (manually added)
- Understand exactly what tag filters are doing
Tag Editor Tab:
- Checkbox Tag Selector: Visual interface showing ALL tags generated by ALL models
- Search/Filter: Quickly find specific tags with the search box
- Visual Selection: Checkboxes show which tags are currently in the .txt file
- ☑ Checked = Tag is saved in .txt file
- ☐ Unchecked = Tag was generated but not currently saved
- Tag Count Display: Shows "Total tags: X | Selected: Y"
- Quick Controls:
- Select All / Deselect All buttons
- Search filter for finding specific tags
- Save Tags to File: Green button applies checkbox selections to .txt file
- Saving bypasses all filters, giving complete user control over which tags to keep
Navigation:
- Browse through images with Prev/Next buttons
- Arrow keys (←/→) for quick navigation
- ESC to close dialog
- Ctrl+S to save tags
The application supports recursive subdirectory search for both interrogation and organization:
During Interrogation:
- Check "Include subdirectories (recursive)" in the Directory section
- All subdirectories are automatically included in the image queue
- Queue displays relative paths (e.g., "subfolder/image.jpg") when recursive
- Status bar shows "(recursive)" to confirm mode
- Gallery tab automatically syncs with the same recursive setting
During Organization:
- Enable "Include subdirectories (recursive)" in the Organize dialog
- Select which subdirectories to use as sources:
- All subdirectories are shown with checkboxes
- Includes "(Root directory)" option for images in the main folder
- Default: All directories selected
- Smart Organization: Deselect already-organized folders to prevent re-organizing:
```
my_images/
├── raw/        ← ☑ Include as source
├── processed/  ← ☑ Include as source
└── organized/  ← ☐ Skip (already organized)
```

- Only images from selected directories will be organized
Benefits:
- Process entire folder hierarchies in one operation
- Maintain organized structure by excluding certain directories
- Visual feedback shows which directories are being processed
- Prevents accidental re-organization of already-sorted files
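As a sketch, a recursive scan that honors excluded source directories might look like this (the extension set mirrors the supported formats listed under Troubleshooting; the helper name is illustrative):

```python
# Illustrative sketch of a recursive image scan that skips deselected
# subdirectories (e.g. an already-organized folder).
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif"}

def collect_images(root: Path, excluded: set[str]) -> list[Path]:
    images = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTS:
            continue
        # Skip files under any excluded subdirectory (relative to root).
        rel_parts = path.relative_to(root).parts[:-1]
        if any(part in excluded for part in rel_parts):
            continue
        images.append(path)
    return images

print(collect_images(Path("my_images"), excluded={"organized"}))
```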
Configure tag filters in the Interrogation tab to automatically clean up interrogation results:
Remove List:
- Blacklist unwanted tags (e.g., "letterboxed", "watermark")
- Tags are removed before writing to .txt files
- Database keeps all original tags
Replace Rules:
- Substitute tags with better alternatives
- Example: "1girl" → "solo female character"
- Applied before writing to .txt files
Keep List:
- Force-include specific tags even if below confidence threshold
- Useful for important but low-confidence tags
Important: Tag filters only affect .txt file output. The database always stores complete, unfiltered results for every model.
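Putting the three rule types together, the .txt output is produced roughly like this (a sketch; the rule values are examples, not defaults):

```python
# Sketch of the filter pipeline applied before writing .txt files.
# The database keeps the unfiltered tag list; only the file output changes.
REMOVE = {"letterboxed", "watermark"}
REPLACE = {"1girl": "solo female character"}
KEEP = ["signature"]  # force-included even if below the confidence threshold

def filter_tags(tags: list[str]) -> list[str]:
    out = [REPLACE.get(t, t) for t in tags if t not in REMOVE]
    out += [t for t in KEEP if t not in out]
    return out

print(filter_tags(["1girl", "watermark", "outdoors"]))
# ['solo female character', 'outdoors', 'signature']
```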
Main Window:
- Ctrl+O: Select directory
- Ctrl+Q: Quit application
Advanced Inspection Dialog:
- ←/→: Navigate prev/next image
- ESC: Close dialog
- Ctrl+S: Save tags
Tags are saved as comma-separated values in .txt files:
```
landscape, sunset, mountains, scenic, nature, beautiful sky
```
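Reading and writing this format is straightforward, as in this sketch (it mirrors what FileManager does; the function names here are illustrative):

```python
# Minimal sketch of reading/writing the comma-separated .txt tag format.
from pathlib import Path

def read_tags(txt_path: Path) -> list[str]:
    if not txt_path.exists():
        return []
    text = txt_path.read_text(encoding="utf-8")
    return [t.strip() for t in text.split(",") if t.strip()]

def write_tags(txt_path: Path, tags: list[str]) -> None:
    txt_path.write_text(", ".join(tags), encoding="utf-8")

write_tags(Path("image.txt"), ["landscape", "sunset", "mountains"])
print(read_tags(Path("image.txt")))  # ['landscape', 'sunset', 'mountains']
```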
images table:
- `id`: Primary key
- `file_path`: Current file path
- `file_hash`: SHA256 hash (unique identifier)
- `width`, `height`: Image dimensions
- `file_size`: File size in bytes
- `created_at`, `updated_at`: Timestamps

models table:
- `id`: Primary key
- `model_name`: Model identifier
- `model_type`: CLIP or WD
- `version`: Model version
- `config`: JSON configuration

interrogations table:
- `id`: Primary key
- `image_id`: Foreign key to images
- `model_id`: Foreign key to models
- `tags`: JSON array of tags
- `confidence_scores`: JSON object of tag:score pairs
- `raw_output`: Raw model output
- `interrogated_at`: Timestamp
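A sketch of how these tables could be created with sqlite3, with column types inferred from the field lists above (the authoritative schema is in core/database.py):

```python
# Hypothetical reconstruction of the schema from the field lists above;
# see core/database.py for the authoritative version.
import sqlite3

db = sqlite3.connect("interrogations.db")
db.executescript("""
CREATE TABLE IF NOT EXISTS images (
    id INTEGER PRIMARY KEY,
    file_path TEXT,
    file_hash TEXT UNIQUE,      -- SHA256 of file content
    width INTEGER, height INTEGER,
    file_size INTEGER,
    created_at TEXT, updated_at TEXT
);
CREATE TABLE IF NOT EXISTS models (
    id INTEGER PRIMARY KEY,
    model_name TEXT,
    model_type TEXT,            -- 'CLIP' or 'WD'
    version TEXT,
    config TEXT                 -- JSON
);
CREATE TABLE IF NOT EXISTS interrogations (
    id INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES images(id),
    model_id INTEGER REFERENCES models(id),
    tags TEXT,                  -- JSON array
    confidence_scores TEXT,     -- JSON object of tag:score pairs
    raw_output TEXT,
    interrogated_at TEXT
);
""")
```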
- GPU Usage: Always use CUDA (NVIDIA) or ROCm (AMD) if available for 10-50x speedup
- Batch Size: Process images in batches for optimal performance
- Cache Utilization: The database cache eliminates redundant processing
- Model Selection:
- WD Tagger is faster and provides confidence scores
- CLIP provides more natural language descriptions
The application provides full GPU acceleration for NVIDIA GPUs on Windows and Linux, with legacy support on macOS (Apple has deprecated NVIDIA GPU support):
- NVIDIA GPU with CUDA Compute Capability 3.5 or higher
- NVIDIA drivers installed (CUDA Toolkit not required - PyTorch bundles CUDA runtime)
- Windows: GeForce/RTX/Quadro series
- Linux: GeForce/RTX/Quadro/Tesla series
- macOS: Legacy support only (Apple deprecated NVIDIA GPU support)
1. Install NVIDIA drivers (if not already installed):
   - Windows: https://www.nvidia.com/download/index.aspx
   - Linux: Use your distribution's package manager or NVIDIA's official installer
2. Run the setup script:

   Windows:

   ```
   setup.bat
   ```

   Linux:

   ```
   chmod +x setup.sh
   ./setup.sh
   ```
The script will:
- Automatically detect your NVIDIA GPU
- Detect CUDA version from `nvidia-smi`
- Install PyTorch with matching CUDA support (12.6, 12.8, 13.0, or 11.8)
- Install ONNX Runtime GPU for WD Tagger acceleration
✅ CLIP Models: Full CUDA acceleration via PyTorch
✅ WD Tagger Models: Full CUDA acceleration via ONNX Runtime GPU
- CLIP interrogation: 10-50x faster than CPU (depending on GPU)
- WD Tagger interrogation: 10-50x faster than CPU (depending on GPU)
- Batch processing: Optimal performance with GPU acceleration
The setup script automatically detects and installs the correct version:
- CUDA 13.0: Latest (RTX 40 series recommended)
- CUDA 12.8: Recent GPUs
- CUDA 12.6: Most common (RTX 30/40 series)
- CUDA 11.8: Older GPUs (GTX 10 series, RTX 20 series)
After running the setup script, you should see:
Windows and Linux:

```
PyTorch:
  Version: 2.x.x+cu126
  CUDA Available: True
  GPU: NVIDIA GeForce RTX 4090

ONNX Runtime:
  Version: 1.x.x
  CUDA Provider: Yes

[SUCCESS] GPU acceleration is fully enabled!
  - PyTorch: CUDA enabled (for CLIP models)
  - ONNX Runtime: CUDA enabled (for WD Tagger models)
```
If you prefer to install manually or the setup script fails:
```
# Activate virtual environment
# Windows: venv\Scripts\activate
# Linux: source venv/bin/activate

# Install PyTorch with CUDA 12.6 (most common)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126

# Install ONNX Runtime GPU
pip install "onnxruntime-gpu>=1.16.0"

# Install other dependencies
pip install -r requirements.txt
```

For other CUDA versions, change `cu126` to:
- `cu130` for CUDA 13.0
- `cu128` for CUDA 12.8
- `cu118` for CUDA 11.8
The application supports AMD GPUs on Linux through ROCm:
- Linux only (ROCm is not available on Windows/Mac)
- AMD GPU with ROCm support (RX 6000/7000 series, Radeon VII, Vega, etc.)
- ROCm 5.7 or later installed
1. Install ROCm (if not already installed):
   - Ubuntu/Debian: see https://rocm.docs.amd.com/projects/install-on-linux/en/latest/
2. Run the setup script:

   ```
   chmod +x setup.sh
   ./setup.sh
   ```
The script will:
- Automatically detect your AMD GPU
- Detect your ROCm version
- Install PyTorch with ROCm support
- Install ONNX Runtime (CPU version, see note below)
✅ CLIP Models: Full ROCm acceleration via PyTorch
- CLIP interrogation: Full GPU acceleration with ROCm (10-50x faster than CPU)
- WD Tagger interrogation: Runs on CPU (no pre-built ROCm package for ONNX Runtime)
- For ONNX Runtime ROCm support, you must build from source: https://onnxruntime.ai/docs/build/eps.html#migraphx
After running the setup script, you should see:
```
PyTorch:
  Version: 2.x.x+rocmX.X
  CUDA Available: True
  GPU: AMD Radeon RX 7900 XTX

ONNX Runtime:
  Version: 1.x.x
  CUDA Provider: No

[SUCCESS] ROCm GPU acceleration is enabled!
  - PyTorch: ROCm enabled (for CLIP models)
  - ONNX Runtime: CPU mode (no ROCm pip package available)
```
The application supports ARM64 Linux systems with NVIDIA GPUs (e.g., Jetson Orin, Grace Hopper, Grace Blackwell):
✅ CLIP Models: Full CUDA acceleration via PyTorch
```
chmod +x setup.sh
./setup.sh
```

The setup script will:
- Detect ARM64 architecture automatically
- Install PyTorch with CUDA support
- Install CPU-only ONNX Runtime (GPU wheels not available on PyPI for ARM64)
To enable GPU acceleration for WD Tagger on ARM64, you can build ONNX Runtime from source:
```
chmod +x build_onnx_arm64.sh
./build_onnx_arm64.sh
```

Requirements:
- CUDA Toolkit installed (`nvcc` in PATH)
- cuDNN installed
- CMake 3.26+
- Build tools (gcc, g++, make)
- 10GB+ free disk space
- 30-60+ minutes build time
The build script will:
- Check all prerequisites
- Clone ONNX Runtime source
- Build with CUDA support for ARM64
- Install the wheel into your virtual environment
After setup (without building ONNX Runtime):

```
[SUCCESS] GPU acceleration is enabled!
  - PyTorch: CUDA enabled (for CLIP models)
  - ONNX Runtime: CPU mode (no ARM64 GPU wheels on PyPI)
```

After building ONNX Runtime:

```
[SUCCESS] GPU acceleration is fully enabled!
  - PyTorch: CUDA enabled (for CLIP models)
  - ONNX Runtime: CUDA enabled (for WD Tagger models)
```
| Feature | CUDA (NVIDIA x86) | CUDA (NVIDIA ARM64) | ROCm (AMD) | CPU Only |
|---|---|---|---|---|
| Platforms | Windows, Linux | Linux only | Linux only | All |
| CLIP Models | ✅ GPU (10-50x) | ✅ GPU (10-50x) | ✅ GPU (10-50x) | ❌ CPU (1x) |
| WD Tagger | ✅ GPU (10-50x) | ❌ CPU (1x)* | ❌ CPU (1x)** | ❌ CPU (1x) |
| Setup | Automatic | Automatic | Automatic | Automatic |
| Driver Required | NVIDIA | NVIDIA | ROCm | None |
| Best For | Maximum performance | ARM64 NVIDIA users | AMD Linux users | Testing/compatibility |
* WD Tagger on ARM64 can use GPU by running ./build_onnx_arm64.sh
** WD Tagger on ROCm requires building ONNX Runtime from source
For best performance:
- NVIDIA GPU (preferred): Full acceleration for both CLIP and WD Tagger
- AMD GPU on Linux: Full acceleration for CLIP, CPU for WD Tagger
- CPU only: Works but significantly slower (use for testing or compatibility)
Minimum recommended:
- NVIDIA: GTX 1060 6GB or better
- AMD: RX 6600 or better
- VRAM: 6GB minimum, 8GB+ recommended for larger models
If you see "CUDA not available (CPU mode will be used)" but you have an NVIDIA GPU:
```
# Windows
setup.bat

# Linux/Mac
chmod +x setup.sh
./setup.sh
```

The setup script will:
- Detect your GPU and CUDA version
- Offer to reinstall PyTorch with CUDA support
- Install ONNX Runtime GPU for WD Tagger acceleration
- Verify the installation
Check PyTorch version:

```
python -c "import torch; print(torch.__version__)"
```

- ❌ Bad: `2.x.x+cpu` (CPU-only)
- ✅ Good: `2.x.x+cu126` (CUDA 12.6)

Check CUDA availability:

```
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}'); print(f'GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
```

Check ONNX Runtime:

```
python -c "import onnxruntime as ort; print(ort.get_available_providers())"
```

- ✅ Should include: `CUDAExecutionProvider`
- ❌ Missing: Only shows `CPUExecutionProvider`
- PyTorch CPU-only installed: Reinstall with CUDA support
- NVIDIA drivers missing/outdated:
- Windows: Download from https://www.nvidia.com/download/index.aspx
- Linux: `sudo apt install nvidia-driver-XXX` or use your package manager
- Wrong CUDA version: Setup script will detect and fix
- ONNX Runtime CPU-only: Run `pip install onnxruntime-gpu --force-reinstall`

```
nvidia-smi
```

Should show your GPU and CUDA version. If this fails, install/update NVIDIA drivers.
If you have an AMD GPU on Linux but it's not being detected:
```
chmod +x setup.sh
./setup.sh
```

Check if ROCm is installed:

```
rocm-smi
# or
rocminfo
```

Check PyTorch version:

```
python -c "import torch; print(torch.__version__)"
```

- ✅ Good: `2.x.x+rocm6.0` (ROCm 6.0)
- ❌ Bad: `2.x.x+cpu` (CPU-only)

Check GPU availability:

```
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
```

Note: ROCm uses the CUDA API in PyTorch, so `torch.cuda.is_available()` returns True for AMD GPUs.
- ROCm not installed:
- Install from https://rocm.docs.amd.com/
- Ubuntu: `sudo apt install rocm-hip-runtime`
- PyTorch CPU-only installed: Reinstall with ROCm support
- Unsupported GPU: Check ROCm compatibility at https://rocm.docs.amd.com/
- Wrong ROCm version: Setup script will detect and match
```
# Check installed ROCm version
cat /opt/rocm/.info/version

# Should match PyTorch ROCm version
python -c "import torch; print(torch.__version__)"
```

- Ensure CUDA is properly configured (see above)
- Check that all dependencies are installed
- Try CPU mode if GPU fails (slower but works)
- Reduce batch size (process fewer images)
- Use CPU mode (slower but less memory)
- Close other applications
- Check GPU memory usage with `nvidia-smi`
- Ensure images are in supported formats: .jpg, .jpeg, .png, .webp, .bmp, .gif
- Check file permissions
- Verify write permissions in the image directory
- Check that image paths don't contain special characters
"Python not found":
- Install Python 3.10+ from python.org
- Make sure Python is in your PATH
"Failed to install PyTorch":
- Check your internet connection
- Try running the script again
- For slow connections, the download may timeout (PyTorch is ~2GB)
Modify the model names in the interrogator classes:
```python
# For WD Tagger
wd_interrogator = WDInterrogator("SmilingWolf/wd-v1-4-swinv2-tagger-v2")

# For CLIP
clip_interrogator = CLIPInterrogator("ViT-H-14/laion2b_s32b_b79k")
```

Programmatic usage example:

```python
from core import InterrogationDatabase, FileManager
from interrogators import WDInterrogator
# Initialize
db = InterrogationDatabase()
interrogator = WDInterrogator()
interrogator.load_model(threshold=0.35, device='cuda')
# Interrogate
results = interrogator.interrogate("path/to/image.jpg")
# Save to database
from core.hashing import hash_image_content, get_image_metadata
file_hash = hash_image_content("path/to/image.jpg")
metadata = get_image_metadata("path/to/image.jpg")
image_id = db.register_image(
"path/to/image.jpg",
file_hash,
metadata['width'],
metadata['height'],
metadata['file_size']
)
model_id = db.register_model(
interrogator.model_name,
interrogator.get_model_type()
)
db.save_interrogation(
image_id,
model_id,
results['tags'],
results['confidence_scores'],
results['raw_output']
)
# Write to file
from pathlib import Path
FileManager.write_tags_to_file(Path("path/to/image.jpg"), results['tags'])
```

- CLIP Interrogator: Uses the clip-interrogator library
- Waifu Diffusion Tagger: Uses SmilingWolf's WD Tagger models
- Camie Tagger: Uses Camais03's Camie Tagger models
- UI Framework: PyQt6
For issues or questions regarding upstream modules, refer to the documentation of the libraries listed above.