Hey all, I'm new to all of this and can't get it running. Here is what I have done so far:
Steps I have taken:
Since I am using a 5070 Ti, I downloaded the ComfyUI nightly build from Comfy-Org/ComfyUI#6643 and extracted it
cloned the repository https://github.com/SanDiegoDude/ComfyUI-HiDream-Sampler/ into my ComfyUI\custom_nodes\ directory via
git clone https://github.com/SanDiegoDude/ComfyUI-HiDream-Sampler.git
installed requirements using
.\python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-HiDream-Sampler\requirements.txt
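(One thing I verified along the way, since the portable build ships its own interpreter: that pip was really installing into python_embeded and not into a system Python. A throwaway check of my own, not from the repo docs:)
```python
# Run with: .\python_embeded\python.exe -c "import sys; print(sys.executable)"
# The printed path should point inside ...\python_embeded\, confirming the
# pip installs above targeted the portable build's own interpreter.
import sys
print(sys.executable)
```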
tried to run a prompt and got this error:
ImportError: Loading a GPTQ quantized model requires gptqmodel (pip install gptqmodel) or auto-gptq (pip install auto-gptq) library.
installed gptqmodel using
.\python_embeded\python.exe -m pip install gptqmodel
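To confirm the embedded interpreter can actually see the package, I used a quick check like this (gptq_check.py is just my own scratch file, not part of the repo):
```python
# Run with: .\python_embeded\python.exe gptq_check.py
# Reports whether the two GPTQ backends named in the error are importable.
import importlib.util
from importlib.metadata import PackageNotFoundError, version

for module, dist in (("gptqmodel", "gptqmodel"), ("auto_gptq", "auto-gptq")):
    if importlib.util.find_spec(module) is None:
        print(f"{module}: not importable")
        continue
    try:
        print(f"{module}: importable, version {version(dist)}")
    except PackageNotFoundError:
        print(f"{module}: importable (no version metadata)")
```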
tried to run a prompt and got this error:
!!! ERROR during execution: Cannot find a working triton installation. Either the package is not installed or it is too old. More information on installing Triton can be found at: https://github.com/triton-lang/triton
installed triton using
.\python_embeded\python.exe -m pip install -U --pre triton-windows
ran test_triton.py, which completed successfully
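For reference, the kind of smoke test I mean is a tiny JIT-compiled kernel like the sketch below (my own minimal version; I haven't checked what test_triton.py actually contains):
```python
# Minimal Triton smoke test -- run with .\python_embeded\python.exe triton_smoke.py
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    # Each program instance handles one BLOCK-sized slice of the vectors.
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

n = 1024
x = torch.rand(n, device="cuda")
y = torch.rand(n, device="cuda")
out = torch.empty_like(x)
add_kernel[(triton.cdiv(n, 256),)](x, y, out, n, BLOCK=256)
assert torch.allclose(out, x + y), "Triton kernel produced wrong results"
print("Triton OK")
```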
tried to run a prompt again, but from here on it just keeps crashing
The log file shows this:
g:\Downloads_AI\ComfyUI_windows_portable_nightly_pytorch>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build
Checkpoint files will always be loaded safely.
Total VRAM 16302 MB, total RAM 16323 MB
pytorch version: 2.8.0.dev20250321+cu128
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 5070 Ti : cudaMallocAsync
Using pytorch attention
ComfyUI version: 0.3.27
ComfyUI frontend version: 1.14.5
[Prompt Server] web root: g:\Downloads_AI\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\comfyui_frontend_package\static
Flash Attention 2 is not available, will use PyTorch's native attention if possible.
PyTorch SDPA (Scaled Dot Product Attention) is available.
GPTQModel (transformers) support is available (recommended).
auto_gptq not available.
GPTQ support composite: (GPTQModel: True | auto_gptq: False) -> True
Flash Attention is not available for HiDream, will use PyTorch's native attention.
PyTorch SDPA (Scaled Dot Product Attention) is available for HiDream.
GPTQ dependencies available - all models should work
HiDream: Successfully registered with ComfyUI memory management
HiDream Sampler Node Initialized
Available Models: ['full-nf4', 'dev-nf4', 'fast-nf4', 'full', 'dev', 'fast']
Import times for custom nodes:
0.0 seconds: G:\Downloads_AI\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\websocket_image_save.py
2.8 seconds: G:\Downloads_AI\ComfyUI_windows_portable_nightly_pytorch\ComfyUI\custom_nodes\ComfyUI-HiDream-Sampler
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
Using resolution: 1360×768 from aspect ratio: 16:9 (1360×768)
HiDream: Initial VRAM usage: 0.00 MB
Loading model for fast-nf4...
--- Loading Model Type: fast-nf4 ---
Model Path: azaneko/HiDream-I1-Fast-nf4
NF4: True, Requires BNB: False, Requires GPTQ deps: True
Using alternate LLM: False
(Start VRAM: 0.00 MB)
Cache check for key: fast-nf4_standard
Cache contains: []
[1a] Preparing LLM (GPTQ): hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
Setting max memory limit: 6GiB of 15.9GiB
Using device_map='auto'.
[1b] Loading Tokenizer: hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4...
Tokenizer loaded.
✅ Fixed rope_scaling to: {'type': 'linear', 'factor': 1.0}
[GPTQ] Using transformers.GPTQConfig for quantization
g:\Downloads_AI\ComfyUI_windows_portable_nightly_pytorch\python_embeded\Lib\site-packages\transformers\quantizers\auto.py:212: UserWarning: You passed quantization_config or equivalent parameters to from_pretrained but the model you're loading already has a quantization_config attribute. The quantization_config from the model will be used.However, loading attributes (e.g. ['backend', 'use_cuda_fp16', 'use_exllama', 'max_input_length', 'exllama_config', 'disable_exllama']) will be overwritten with the one you passed to from_pretrained. The rest will be ignored.
warnings.warn(warning_msg)
PyTorch version 2.8.0.dev20250321+cu128 available.
INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
INFO Kernel: Auto-selection: adding candidate TritonV2QuantLinear
loss_type=None was set in the config but it is unrecognised.Using the default loss: ForCausalLMLoss.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 2/2 [00:15<00:00, 7.91s/it]
Some weights of the model checkpoint at hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 were not used when initializing LlamaForCausalLM: ['model.layers.0.mlp.down_proj.bias', 'model.layers.0.mlp.gate_proj.bias', 'model.layers.0.mlp.up_proj.bias', 'model.layers.0.self_attn.k_proj.bias', ..., 'model.layers.9.self_attn.v_proj.bias'] (list trimmed for readability; it names the mlp down/gate/up_proj biases and the self_attn k/o/q/v_proj biases for all 32 layers)
- This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
INFO Format: Converting checkpoint_format from gptq to internal gptq_v2.
INFO Format: Converting GPTQ v1 to v2
INFO Format: Conversion complete: 0.06823968887329102s
INFO Optimize: TritonV2QuantLinear compilation triggered.
✅ Text encoder loaded! (VRAM: 5467.26 MB)
[2] Preparing Transformer from: azaneko/HiDream-I1-Fast-nf4
Type: NF4
Loading Transformer... (May download files)
Moving Transformer to CUDA...
✅ Transformer loaded! (VRAM: 14646.96 MB)
[3] Preparing Scheduler: FlashFlowMatchEulerDiscreteScheduler
Using Scheduler: FlashFlowMatchEulerDiscreteScheduler
[4] Loading Pipeline from: azaneko/HiDream-I1-Fast-nf4
Passing pre-loaded components...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 2/2 [00:04<00:00, 2.00s/it]
Loading pipeline components...: 100%|██████████████████████████████████████████████████| 10/10 [00:04<00:00, 2.04it/s]
Pipeline structure loaded.
[5] Finalizing Pipeline...
Assigning transformer...
Moving pipeline object to CUDA (final check)...
Warning: Could not move pipeline object to CUDA: Allocation on device .
Attempting CPU offload for NF4...
g:\Downloads_AI\ComfyUI_windows_portable_nightly_pytorch>pause
I don't know how to proceed from here. From the VRAM numbers in the log (text encoder at ~5.5 GB, then ~14.6 GB after the transformer loads, on a 16 GB card), the "Allocation on device" warning at step [5] looks like an out-of-memory failure, and the process dies right after. Any ideas on how to troubleshoot further are welcome!
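In case it helps, this is a small scratch script of my own (nothing official) that I can run in a second console to watch device memory while the prompt executes:
```python
# VRAM snapshot helper -- run in a second console with
# .\python_embeded\python.exe vram_watch.py while the prompt executes.
import time
import torch

while True:
    free, total = torch.cuda.mem_get_info()  # bytes, for the whole device
    alloc = torch.cuda.memory_allocated()    # bytes held by this process only
    print(f"device free: {free/2**20:8.0f} MB / {total/2**20:.0f} MB "
          f"(this process: {alloc/2**20:.0f} MB)")
    time.sleep(2)
```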
Thanks