Feature: z-image Turbo Control Net #8679
|
Merged the Z Image PR. Can you rebase this against main now so we can go through the checks? Thank you. |
feat: Add Z-Image ControlNet support with spatial conditioning

Add comprehensive ControlNet support for Z-Image models including:

Backend:
- New ControlNet_Checkpoint_ZImage_Config for Z-Image control adapter models
- Z-Image control key detection (_has_z_image_control_keys) to identify control layers
- ZImageControlAdapter loader for standalone control models
- ZImageControlTransformer2DModel combining base transformer with control layers
- Memory-efficient model loading by building combined state dict

VRAM usage is high.
- Auto-detect control_in_dim from adapter weights (16 for V1, 33 for V2.0)
- Auto-detect n_refiner_layers from state dict
- Add zero-padding for V2.0's additional channels
- Use accelerate.init_empty_weights() for efficient model creation
- Add ControlNet_Checkpoint_ZImage_Config to frontend schema
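The auto-detection and zero-padding items above can be illustrated with a minimal sketch. It is not the PR's code: the key name `control_x_embedder.weight`, the patch size of 2, and the assumption that the embedder's input width equals `control_in_dim * patch_size**2` are all hypothetical, chosen only to show the idea of reading a dimension off a weight shape. Shapes are plain tuples so the example runs without torch.

```python
# Hypothetical key name and patch size, for illustration only.
EMBEDDER_KEY = "control_x_embedder.weight"
PATCH_SIZE = 2


def detect_control_in_dim(shapes: dict[str, tuple[int, ...]]) -> int:
    """Infer control_in_dim from the (out_features, in_features) weight shape.

    Assumes in_features == control_in_dim * PATCH_SIZE**2 (an assumption,
    not confirmed by the PR).
    """
    _, in_features = shapes[EMBEDDER_KEY]
    control_in_dim, remainder = divmod(in_features, PATCH_SIZE * PATCH_SIZE)
    if remainder:
        raise ValueError(f"in_features={in_features} is not a multiple of the patch area")
    return control_in_dim


def zero_pad_channels(channels: list[list[float]], control_in_dim: int) -> list[list[float]]:
    """Append all-zero channels until the input has control_in_dim channels."""
    if not channels:
        raise ValueError("at least one channel is required")
    if len(channels) > control_in_dim:
        raise ValueError("input has more channels than the adapter accepts")
    plane_size = len(channels[0])
    missing = control_in_dim - len(channels)
    return channels + [[0.0] * plane_size for _ in range(missing)]
```

Under these assumptions, a V1 adapter reports 16 channels and needs no padding, while a V2.0 adapter reports 33 and a 16-channel control latent gains 17 zero channels.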
4f040f1 to
8db8aa8
- Add missing ControlNet_Checkpoint_ZImage_Config import
- Remove unused imports (Any, Dict, ADALN_EMBED_DIM, is_torch_version)
- Add strict=True to zip() calls
- Replace mutable list defaults with immutable tuples
- Replace dict() calls with literal syntax
- Sort imports in z_image_denoise.py
Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet) instead of merging control weights into the base transformer. This provides:
- Lower memory usage (no weight duplication)
- Flexibility to enable/disable control per step
- Cleaner architecture with separate control adapter

Key implementation details:
- ZImageControlNetExtension: computes control hints per denoising step
- z_image_forward_with_control: custom forward pass with hint injection
- patchify_control_context: utility for control image patchification
- ZImageControlAdapter: standalone adapter with control_layers and noise_refiner

Architecture matches original VideoX-Fun implementation:
- Hints computed ONCE using INITIAL unified state (before main layers)
- Hints injected at every other main transformer layer (15 control blocks)
- Control signal added after each designated layer's forward pass

V2.0 ControlNet support (control_in_dim=33):
- Channels 0-15: control image latents
- Channels 16-31: reference image (zeros for pure control)
- Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)
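The injection schedule and the V2.0 channel layout described above can be sketched in plain Python. This is an illustration under stated assumptions, not the PR's `z_image_forward_with_control`: scalars stand in for tensors, the function names are hypothetical, injection is assumed to happen after even-indexed layers, and the default scale of 0.75 is just an arbitrary example value.

```python
from typing import Callable, Sequence


def forward_with_hints(
    x: float,
    layers: Sequence[Callable[[float], float]],
    hints: Sequence[float],
    control_scale: float = 0.75,
    stride: int = 2,
) -> float:
    """Run the main layers, adding one precomputed hint after every `stride`-th layer."""
    hint_iter = iter(hints)  # hints are computed once, before this loop
    for index, layer in enumerate(layers):
        x = layer(x)
        if index % stride == 0:  # assumed placement: layers 0, 2, 4, ...
            hint = next(hint_iter, None)
            if hint is not None:
                x = x + control_scale * hint
    return x


def build_v2_control_context(control_latents: list[list[float]]) -> list[list[float]]:
    """Assemble the 33-channel V2.0 input for pure control (no reference image)."""
    if len(control_latents) != 16:
        raise ValueError("expected 16 control latent channels")
    plane_size = len(control_latents[0])
    reference = [[0.0] * plane_size for _ in range(16)]  # channels 16-31: zeros
    mask = [[1.0] * plane_size]  # channel 32: 1.0 = don't inpaint, use control
    return control_latents + reference + mask
```

With a stride of 2, a stack of 30 main layers would consume exactly 15 hints, which is consistent with the 15 control blocks mentioned above; the layer count of 30 is inferred, not stated in the PR.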
|
So the v1 and v2 are really bad. The v2.1 works fine: https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.1
|
Ah. Brand new. I'll check it out in a bit. Both the regular and the tile version. If they are go, we can set them as the suggested starter models and merge this one up too and move on to the regional guidance part. |
|
Tested out with the newer models. Definitely better performance. The quality of the controlnet models themselves is alright. LoRA functionality is much better but not Z Image base level yet. But this PR is good to go I think. ControlNet models seem to be working. Both tile and union. I synced up with main and fixed the ruff checks. If there's nothing else to add to this one, let me know. I can merge this and we can move on to the next one. Great job overall implementing Z Image. Looking great. |

Summary
Add support for Z-Image ControlNet V2.0 alongside the existing V1 support.
Key changes:
- Auto-detect control_in_dim from adapter weights (16 for V1, 33 for V2.0)
- Auto-detect n_refiner_layers from state dict
- Use accelerate.init_empty_weights() for more efficient model creation
- Add ControlNet_Checkpoint_ZImage_Config to frontend schema

Related Issues / Discussions
Part of Z-Image feature implementation.
QA Instructions
control_context_scale: 0.65-0.80

Merge Plan
Can be merged after review. No special considerations needed.
Checklist
What's New copy (if doing a release after this PR)