Fix(model manager): Improve calculation of Z-Image VAE working memory needs #8740

lstein · 2026-01-05T03:56:56Z

Summary

This is a speculative PR that may address the OOM issues experienced on low-VRAM machines running Z-Image models during the latent decode phase. It dynamically estimates the amount of VRAM needed by the currently selected VAE and increases the working_mem_bytes passed to the vae.model_on_device() context manager.
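As a rough illustration of what dynamically estimating VAE working memory can look like, here is a minimal sketch. It is not the code in this PR: the function name, the 8x latent scale factor, and the scaling constant are all assumptions chosen for illustration, and a real constant would have to be measured for each VAE architecture.

```python
# Illustrative sketch only; names and constants are assumptions, not this PR's code.

LATENT_SCALE_FACTOR = 8   # assumed: each latent pixel decodes to an 8x8 image patch
SCALING_CONSTANT = 2200   # assumed: empirical multiplier, would need measuring per VAE


def estimate_vae_decode_working_memory(
    latent_height: int,
    latent_width: int,
    element_size: int,
    scaling_constant: int = SCALING_CONSTANT,
    latent_scale_factor: int = LATENT_SCALE_FACTOR,
) -> int:
    """Return a rough estimate, in bytes, of the scratch VRAM needed to decode latents."""
    # The decoder's activations scale with the output image area, not the latent area.
    out_h = latent_height * latent_scale_factor
    out_w = latent_width * latent_scale_factor
    return out_h * out_w * element_size * scaling_constant


# A 1024x1024 image has 128x128 latents; float16/bfloat16 elements are 2 bytes each.
working_mem_bytes = estimate_vae_decode_working_memory(128, 128, element_size=2)
print(f"{working_mem_bytes / 2**30:.2f} GiB")
```

An estimate like this would then be supplied as working_mem_bytes when entering the vae.model_on_device() context manager, so the model cache can free enough VRAM before the decode starts.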

Related Issues / Discussions

See the Discord discussion that begins at https://discord.com/channels/1020123559063990373/1149506274971631688/1456756274858430611

QA Instructions

Either use a card with a low amount of VRAM (8-12 GB) or run an external process that uses a fair bit of VRAM in order to reduce the available VRAM to 8-12 GB. Enable use_partial_loading, but disable max_cache_vram and device_working_mem_gb.

1. Without this PR, run a generation with a combination of Z-Image model and image dimensions that gives an OOM during the VAE decode phase.
2. Restart the server after pulling this PR and try the same generation.
3. Check whether generation now completes without an OOM.

You may need to play with the image size to see an improvement.

Merge Plan

Simple merge

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
❗Changes to a redux slice have a corresponding migration
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

Co-authored-by: lstein <[email protected]>

Fix Z-Image VAE encode/decode to request working memory

19a9a99

Co-authored-by: lstein <[email protected]>

lstein requested review from JPPhoto, blessedcoolant and dunkeroni as code owners January 5, 2026 03:56

github-actions bot added the python, invocations, and python-tests labels Jan 5, 2026

lstein marked this pull request as draft January 5, 2026 03:57

lstein added 4 commits January 5, 2026 15:58

fix: remove check for non-flux vae

598969b

fix: remove check for non-flux vae: latents_to_image

a7f7545

Remove conditional estimation tests

f710fba

Merge branch 'main' into bugfix/better-vram-management

23504ba

lstein marked this pull request as ready for review January 7, 2026 03:53

JPPhoto approved these changes Jan 7, 2026

Merge branch 'main' into bugfix/better-vram-management

5c71bc0

lstein enabled auto-merge (squash) January 8, 2026 17:43

lstein merged commit d34655f into invoke-ai:main Jan 8, 2026
13 checks passed
