QVAC-19998 feat(ltx): full-GPU LTX-2.3 video support (Metal) #13

Open
aegioscy wants to merge 7 commits into 2026-06-04 from 2026-06-04-ltx

@aegioscy aegioscy commented Jun 15, 2026

Summary

Splits the LTX / custom-ggml delta out of the 2026-06-04 fork base.

2026-06-04 (base) = upstream 1f9ee88 + the 5 general, upstream-ggml-compatible patches: vcpkg port infra, Flux qkv assert, ESRGAN device API, upscaler defaults, Wan I2V VAE tiling bypass. Its ggml submodule stays at stock leejet/ggml 0ce7ad3.

This PR adds the 6 commits that depend on / switch to the unified qvac ggml:

  • feat: use fused Flux RoPE when available — requires GGML_OP_ROPE_FLUX (not in stock leejet ggml); bumps submodule to qvac-ext-ggml
  • chore: remove ggml git submodule — use vcpkg SD_USE_SYSTEM_GGML
  • fix: remove ggml-impl.h include — symbols are in public ggml.h
  • ggml_graph_cut: use public ggml_graph_leaf/n_leafs/add_leaf API
  • cli: default preferred_gpu_backend to GPU — fixes the LTX pipeline running on CPU
  • cmake: add /bigobj for MSVC — build fix for the larger LTX/system-ggml objects

LTX-2.3 model support itself comes from upstream; these are the qvac-side build/packaging reconciliation + the custom-ggml dependency.

Notes

  • Pure history reorganization: the 2026-06-04-ltx tip (df47d5e) is byte-for-byte identical to the original fork tip.
  • The fused-RoPE submodule bump that previously leaked onto the base (pointing the leejet-URL submodule at qvac-ext-ggml@c40a0fc) now lives here instead.

Test plan

  • CI green on 2026-06-04-ltx
  • Build sd-cli (Release, Metal) against vcpkg system ggml
  • Smoke-test LTX-2.3 T2V generation

gianni-cor and others added 6 commits June 5, 2026 11:49
Port of 6ded9e6 onto 2026-06-04 base: use ggml_rope_flux fused op for
q/k in the Flux attention path (rope.hpp) and for v in the flash-attn
wrapper (ggml_extend.hpp) when the backend supports it, with a CPU
fallback to the existing permute/reshape path.

Bumps ggml submodule to aegioscy/qvac-ext-ggml@c40a0fc which provides
GGML_OP_ROPE_FLUX with CPU + Metal kernels.

Co-authored-by: Cursor <cursoragent@cursor.com>
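
The "fused op when supported, otherwise fall back" pattern described in this commit message can be sketched in isolation. The following is a toy illustration only: `toy_backend`, `rope_fused`, `rope_unfused`, and `apply_rope` are hypothetical names, a single rotation angle is applied to every pair, and nothing here reflects the actual `ggml_rope_flux` signature or Flux's real tensor layout.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical capability flag; a real backend would be queried instead.
struct toy_backend {
    bool supports_fused_rope;
};

// Fused path: rotate each adjacent pair (x[2i], x[2i+1]) in a single pass.
std::vector<float> rope_fused(const std::vector<float>& x, float theta) {
    std::vector<float> out(x.size(), 0.0f);
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    for (std::size_t i = 0; i + 1 < x.size(); i += 2) {
        out[i]     = x[i] * c - x[i + 1] * s;
        out[i + 1] = x[i] * s + x[i + 1] * c;
    }
    return out;
}

// Fallback path: split into even/odd halves, rotate, then interleave back.
// This mimics a multi-step "reshape, compute, reshape" sequence.
std::vector<float> rope_unfused(const std::vector<float>& x, float theta) {
    const std::size_t half = x.size() / 2;
    std::vector<float> even(half), odd(half);
    for (std::size_t i = 0; i < half; ++i) {
        even[i] = x[2 * i];
        odd[i]  = x[2 * i + 1];
    }
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    std::vector<float> out(x.size(), 0.0f);
    for (std::size_t i = 0; i < half; ++i) {
        out[2 * i]     = even[i] * c - odd[i] * s;
        out[2 * i + 1] = even[i] * s + odd[i] * c;
    }
    return out;
}

// Dispatch: prefer the fused op, keep the old path as a fallback.
std::vector<float> apply_rope(const toy_backend& b,
                              const std::vector<float>& x, float theta) {
    return b.supports_fused_rope ? rope_fused(x, theta)
                                 : rope_unfused(x, theta);
}
```

Both paths are meant to produce the same values, so callers should not need to know which one ran; the fused path only saves intermediate tensors and graph nodes.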
ggml is now provided entirely by the vcpkg ggml port
(tetherto/qvac-ext-ggml). The submodule is not needed and was
only present for non-vcpkg local builds.

Co-authored-by: Cursor <cursoragent@cursor.com>
The ../ggml/src/ggml-impl.h path was a remnant of the git submodule.
All used symbols (GGML_MAX_DIMS, GGML_MAX_SRC, GGML_ASSERT,
ggml_graph_n_nodes, ggml_graph_node) are in the public ggml.h.

Co-authored-by: Cursor <cursoragent@cursor.com>
ggml_cgraph is now opaque (ggml-impl.h no longer vendored). Replace direct
member access with the public leaf accessors exported by qvac-ext-ggml
2026-06-06, fixing 'member access into incomplete type ggml_cgraph'.

Co-authored-by: Cursor <cursoragent@cursor.com>
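
As a rough illustration of why direct member access stops compiling once a struct becomes opaque, here is a minimal self-contained sketch. All `toy_*` names are hypothetical and only loosely modeled on the accessor names mentioned above; the real qvac-ext-ggml signatures are not shown in this thread and may differ.

```cpp
#include <cstddef>
#include <vector>

// "Public header" part: consumers only see a forward declaration.
struct toy_graph;  // incomplete type from the consumer's point of view

toy_graph*  toy_graph_new();
void        toy_graph_free(toy_graph* g);
std::size_t toy_graph_n_leafs(const toy_graph* g);
int         toy_graph_leaf(const toy_graph* g, std::size_t i);
void        toy_graph_add_leaf(toy_graph* g, int leaf);

// Consumer code: compiles without knowing the struct layout.
// Writing g->leafs here would fail with
// "member access into incomplete type 'toy_graph'".
int sum_leafs(const toy_graph* g) {
    int total = 0;
    for (std::size_t i = 0; i < toy_graph_n_leafs(g); ++i) {
        total += toy_graph_leaf(g, i);
    }
    return total;
}

// "Library implementation" part: the only place that knows the layout.
struct toy_graph {
    std::vector<int> leafs;
};

toy_graph*  toy_graph_new() { return new toy_graph(); }
void        toy_graph_free(toy_graph* g) { delete g; }
std::size_t toy_graph_n_leafs(const toy_graph* g) { return g->leafs.size(); }
int         toy_graph_leaf(const toy_graph* g, std::size_t i) { return g->leafs[i]; }
void        toy_graph_add_leaf(toy_graph* g, int leaf) { g->leafs.push_back(leaf); }
```

The design trade-off is the usual one for opaque handles: consumers lose direct field access, but the library can change its internal layout without breaking them.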
SDContextParams::to_sd_ctx_params_t left preferred_gpu_backend out of the
sd_ctx_params_t aggregate initializer, so it zero-initialized to
SD_BACKEND_PREF_CPU and the whole LTX pipeline ran on CPU even on Metal
machines. Add SD_BACKEND_PREF_GPU explicitly; it is only honored when
--backend is unset, so existing --backend overrides are unaffected.

Co-authored-by: Cursor <cursoragent@cursor.com>
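
The failure mode described in this commit message follows from a C++ rule: members omitted from an aggregate initializer are value-initialized, which for an enum means 0. A minimal sketch, using hypothetical stand-in types rather than the real `sd_ctx_params_t` (whose full field list is not shown here):

```cpp
#include <cassert>

// Hypothetical stand-ins for the real enum and params struct.
enum toy_backend_pref {
    TOY_BACKEND_PREF_CPU = 0,  // the zero value, so it wins by default
    TOY_BACKEND_PREF_GPU = 1
};

struct toy_ctx_params {
    int              n_threads;
    bool             vae_tiling;
    toy_backend_pref preferred_gpu_backend;  // trailing member, easy to forget
};

// Before the fix: the last member is left out of the initializer list,
// so it is value-initialized to 0, i.e. TOY_BACKEND_PREF_CPU.
toy_ctx_params make_params_before() {
    toy_ctx_params p = {4, false};
    return p;
}

// After the fix: the preference is spelled out explicitly.
toy_ctx_params make_params_after() {
    toy_ctx_params p = {4, false, TOY_BACKEND_PREF_GPU};
    return p;
}
```

With these definitions, `make_params_before().preferred_gpu_backend` compares equal to `TOY_BACKEND_PREF_CPU` and `make_params_after().preferred_gpu_backend` to `TOY_BACKEND_PREF_GPU`. Some compilers can flag the first form via a missing-field-initializers warning, which is one way to catch this class of bug earlier.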
With the LTX-2 additions, the stable-diffusion.cpp translation unit exceeds
MSVC's COFF 2^16 section limit and fails to compile on Windows with fatal
error C1128. /bigobj raises the limit; clang/gcc are unaffected.

Co-authored-by: Cursor <cursoragent@cursor.com>
@aegioscy aegioscy changed the title from "feat(ltx): build against system/unified ggml (graph_cut public API)" to "QVAC-19998 feat(ltx): build against system/unified ggml (graph_cut public API)" Jun 15, 2026
@aegioscy aegioscy changed the title from "QVAC-19998 feat(ltx): build against system/unified ggml (graph_cut public API)" to "QVAC-19998 feat(ltx): full-GPU LTX-2.3 video support (Metal)" Jun 15, 2026
@dev-nid

dev-nid commented Jun 15, 2026

The default build path looks broken on this branch. Verified on a clean checkout of df47d5e:

cmake .. with default flags (the flow docs/build.md documents) exits with status 1:

    CMake Error at CMakeLists.txt:312 (add_subdirectory):
      add_subdirectory given source "ggml" which is not an existing directory.
    -- Configuring incomplete, errors occurred!

The PR removes the ggml submodule, but SD_USE_SYSTEM_GGML still defaults to OFF and CMakeLists.txt:312 still does add_subdirectory(ggml). On 2026-06-04 the leejet submodule populates ggml/ so the default works; after this PR it doesn't. There are also no vcpkg port files in-tree on either branch, so I assume the qvac-ext-ggml port lives in a separate overlay.

Questions:

  1. Is consuming this repo directly (without the vcpkg overlay) unsupported on this branch? If yes, can we either flip SD_USE_SYSTEM_GGML default to ON, or replace the add_subdirectory(ggml) fallback with a message(FATAL_ERROR ...) pointing users at -DSD_USE_SYSTEM_GGML=ON + the qvac-ext-ggml package?
  2. Should docs/build.md be updated here? It still tells users to git clone --recursive / git submodule update, which no longer pulls a ggml tree.
  3. add_definitions(-DGGML_MAX_NAME=128) is guarded by if (NOT SD_USE_SYSTEM_GGML), so it never applies on this branch. Is the qvac-ext-ggml package built with GGML_MAX_NAME=128? If not, long tensor names get silently truncated vs. before.

Addresses review feedback on PR #13. After removing the ggml submodule,
a plain `cmake ..` defaulted SD_USE_SYSTEM_GGML=OFF and hit
add_subdirectory(ggml) on a now-missing directory, failing with a
confusing CMake error.

- Default SD_USE_SYSTEM_GGML to ON (system/vcpkg ggml is the only
  supported path on this branch).
- Replace the add_subdirectory(ggml) fallback with explicit
  FATAL_ERROR messages pointing at the qvac-ext-ggml vcpkg port and the
  vcpkg toolchain file (both for missing system ggml and for an
  explicit -DSD_USE_SYSTEM_GGML=OFF with no submodule present).
- Update docs/build.md: drop the --recursive / git submodule update
  instructions and document the system/vcpkg ggml workflow. Note that
  the port exports GGML_MAX_NAME=128 as a PUBLIC compile definition so
  consumers inherit it automatically.

Co-authored-by: Cursor <cursoragent@cursor.com>
@aegioscy

Author

Thanks @dev-nid — all four points addressed in 6a13b9c:

  1. Default build path. SD_USE_SYSTEM_GGML now defaults to ON. After the submodule removal, system/vcpkg ggml is the only supported path on this branch, so a plain cmake .. -DCMAKE_TOOLCHAIN_FILE=<vcpkg>/scripts/buildsystems/vcpkg.cmake now resolves ggml::ggml and configures cleanly.

  2. add_subdirectory(ggml) fallback. Replaced. If someone explicitly passes -DSD_USE_SYSTEM_GGML=OFF and there is no ggml/ submodule, CMake now fails with a clear FATAL_ERROR pointing at the qvac-ext-ggml vcpkg port + toolchain, instead of the confusing "source ggml is not an existing directory" error. The system path also emits a clear error if find_package(ggml) fails.

  3. Direct consumption / docs. Yes — consuming this repo directly is supported via the qvac-ext-ggml vcpkg port (not a vendored submodule). docs/build.md updated: dropped the --recursive / git submodule update instructions and documented the system/vcpkg ggml workflow.

  4. GGML_MAX_NAME=128 — no mismatch. The qvac-ext-ggml port builds with -DGGML_MAX_NAME=128 and exports it as a PUBLIC/INTERFACE compile definition on ggml::ggml-base (its ggml-config.cmake appends INTERFACE_COMPILE_DEFINITIONS GGML_MAX_NAME=128). So every consumer linking ggml::ggml inherits 128 automatically — including sd.cpp's own translation units, even though the in-tree add_definitions(-DGGML_MAX_NAME=128) is skipped under system ggml. That add_definitions now only applies to non-system builds.
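
To make the truncation concern behind point 4 concrete, here is a small sketch of how a fixed-size name buffer behaves under two different limits. The `toy_tensor` type and helper functions are hypothetical, and 64 is used only as an example of a smaller limit; this is not ggml's actual tensor struct or naming code.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Toy tensor with a fixed-size name buffer, parameterized on the limit
// that a macro such as GGML_MAX_NAME would normally set at compile time.
template <std::size_t MaxName>
struct toy_tensor {
    char name[MaxName];
};

// Copy the name into the buffer; snprintf truncates silently and always
// null-terminates, so at most MaxName - 1 characters are kept.
template <std::size_t MaxName>
void toy_set_name(toy_tensor<MaxName>& t, const std::string& name) {
    std::snprintf(t.name, sizeof(t.name), "%s", name.c_str());
}

// Helper: how many characters of `name` survive under a given limit.
template <std::size_t MaxName>
std::size_t stored_name_length(const std::string& name) {
    toy_tensor<MaxName> t;
    toy_set_name(t, name);
    return std::strlen(t.name);
}
```

Under these assumptions, a 100-character name keeps only 63 characters with a limit of 64 but survives intact with a limit of 128, which is why the library and every consumer need to agree on the value.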
