Standalone CLI version of Qwen3-VL AutoTagger: generate Adobe Stock-style title + keywords and optionally embed XMP metadata into saved images.
Need the same tagging pipeline directly inside ComfyUI? Use the node project.

Prefer a one-click Colab launch? Use the production notebook:

- `Colab_T4_CLI_Prod.ipynb`: runs tagging with XMP enabled by default and lets you download the outputs.
- Works without ComfyUI
- Processes a single image or a whole folder
- Saves tagged images with XMP metadata when `--write-xmp` is enabled
- Writes a metadata JSONL report (`output_dir/metadata.jsonl` by default)
- Supports auto-download of `Qwen/Qwen3-VL-8B-Instruct` on first run
- Optional 4-bit quantization on CUDA
- Clone the repository:

  ```bash
  git clone https://github.com/ekkonwork/qwen3-vl-autotagger-cli
  cd qwen3-vl-autotagger-cli
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Install `exiftool` (required only if `--write-xmp` is enabled):
  - Linux: `python install.py` (uses `apt-get`)
  - macOS: `brew install exiftool`
  - Windows: `choco install exiftool`
Run:

```bash
python -m qwen3_vl_autotagger_cli.cli "C:/images" --recursive --output-dir "C:/images_tagged"
```

Usage:

```
qwen3-vl-autotagger INPUT_PATH [options]
```

Main options:

- `--recursive`: recursively scan folders
- `--write-xmp / --no-write-xmp`: enable/disable XMP embedding
- `--require-exiftool / --no-require-exiftool`: fail or skip when `exiftool` is missing
- `--output-dir`, `--output-format`, `--file-prefix`
- `--metadata-jsonl`: metadata JSONL output path
- `--model-id`: HF model id (default `Qwen/Qwen3-VL-8B-Instruct`)
- `--auto-download / --no-auto-download`
- `--local-model-path`: local model folder for offline use
- `--load-in-4bit / --no-load-in-4bit`
- `--min-pixels`, `--max-pixels`, `--allow-resize / --no-allow-resize`
- `--max-keywords`, `--attempts`, `--temperature`, `--top-p`, `--repetition-penalty`
Examples:

```bash
# Folder batch, write XMP + JSONL
python -m qwen3_vl_autotagger_cli.cli "./images" --recursive --output-dir "./outputs"

# Single image, metadata only (no saved output image)
python -m qwen3_vl_autotagger_cli.cli "./images/photo.jpg" --no-write-xmp --metadata-jsonl "./report.jsonl"

# Local model only (no download)
python -m qwen3_vl_autotagger_cli.cli "./images" --no-auto-download --local-model-path "D:/models/Qwen3-VL-8B-Instruct"
```

- When `--write-xmp` is enabled, the CLI saves tagged images and embeds XMP metadata.
- Saved filenames are auto-incremented (`file_prefix_00000`, `file_prefix_00001`, ...) and existing files are not overwritten.
- A metadata JSONL record is written for each processed image (`input_path`, `output_path`, `title`, `keywords`, `json`).
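The JSONL report is easy to post-process with a few lines of Python. This is a minimal sketch, not part of the CLI itself (`load_report` and `all_keywords` are illustrative names); it assumes one JSON object per line with the fields listed above:

```python
import json

def load_report(path):
    """Parse a metadata JSONL report: one JSON object per line."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines
                records.append(json.loads(line))
    return records

def all_keywords(records):
    """Collect the distinct keywords across all records, sorted."""
    seen = set()
    for rec in records:
        seen.update(rec.get("keywords", []))
    return sorted(seen)
```

For example, `all_keywords(load_report("./outputs/metadata.jsonl"))` yields a deduplicated keyword list for a whole batch.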
By default (`--auto-download`), the CLI downloads `Qwen/Qwen3-VL-8B-Instruct` on the first run.
The weights total about 17.5 GB (roughly 16.3 GiB).
On a Colab T4, a single image typically takes about 60 seconds to auto-tag (varies with resolution and settings).
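Given the download size, it can be worth checking free disk space before the first run. A minimal standard-library sketch (the constant mirrors the ~17.5 GB figure above; the helper name is illustrative):

```python
import shutil

# Approximate size of the Qwen3-VL-8B-Instruct weights (~17.5 GB; see note above).
REQUIRED_BYTES = 17_500_000_000

def enough_space(path=".", required=REQUIRED_BYTES):
    """True if the filesystem holding `path` can fit the model weights."""
    return shutil.disk_usage(path).free >= required
```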
- `exiftool` may be missing or not available in `PATH`.
- CUDA/driver/VRAM setup can differ between machines.
- `bitsandbytes` may fail to install or initialize on some systems.
- If you see `Qwen3VLForConditionalGeneration is not available`, reinstall dependencies:

  ```bash
  pip uninstall -y transformers qwen-vl-utils
  pip install -U git+https://github.com/huggingface/transformers
  pip install -U qwen-vl-utils accelerate bitsandbytes
  ```

- Hugging Face (HF) downloads can be unstable due to network/rate limits.
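Whether `exiftool` is reachable can be checked from Python before enabling `--write-xmp`; a minimal sketch (the helper name is illustrative):

```python
import shutil
import subprocess

def exiftool_version():
    """Return exiftool's version string, or None if it is not on PATH."""
    exe = shutil.which("exiftool")
    if exe is None:
        return None
    # `exiftool -ver` prints only the version number
    out = subprocess.run([exe, "-ver"], capture_output=True, text=True)
    return out.stdout.strip() or None
```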
If this project saves you time, you can support development on Boosty:
- Boosty (donate): https://boosty.to/ekkonwork/donate
Scan this QR code in your wallet app to copy the donation address:
- TON: `UQAMPvqduXVWyax325-zqk81rTwNG1bRhCvXPyIs7eeIxEVp`
- USDT (TON): `UQAMPvqduXVWyax325-zqk81rTwNG1bRhCvXPyIs7eeIxEVp`
- Memo/Tag: check the Wallet receive screen before sending.
- English: B2 (text-first communication).
- Hiring (full-time/long-term): prefer written communication; for live calls, Russian-speaking teams are preferred.
- Project work: open to worldwide async collaboration.
- Email: ekkonwork@gmail.com
- Telegram: @Mikhail_ML_ComfyUI
- LinkedIn: https://www.linkedin.com/in/mikhail-kuznetsov-14304433b
- Boosty: https://boosty.to/ekkonwork/donate
MIT. See LICENSE.






