@matt23654

First, I'm not sure where this came from, but a lot of folks keep using -ot "^blk\.[3-9]\.ffn_.*_exps\.=CPU", which misses some other ffn layers that don't have exps in their name, because the naming convention on Qwen3 is a bit different from DeepSeek's, for example.
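To see what that pattern actually catches, here is a minimal sketch that runs the regex part of the override (everything before =CPU) against a handful of tensor names. The names in the list are illustrative assumptions in the usual GGUF blk.N.ffn_* style, not a dump from a real Qwen3 file, and Python's re.search is only standing in for whatever regex matching the -ot option does internally.

```python
import re

# Regex part of the override quoted above (the "=CPU" suffix is the target
# device, not part of the pattern).
PATTERN = re.compile(r"^blk\.[3-9]\.ffn_.*_exps\.")

# Hypothetical tensor names for illustration only; check your own model's
# tensor list for the real ones.
TENSOR_NAMES = [
    "blk.3.ffn_gate_exps.weight",
    "blk.3.ffn_up_exps.weight",
    "blk.3.ffn_down_exps.weight",
    "blk.3.ffn_gate_inp.weight",  # ffn tensor without "exps" in the name
    "blk.3.ffn_norm.weight",      # ffn tensor without "exps" in the name
    "blk.12.ffn_up_exps.weight",  # two-digit layer index, outside [3-9]
]


def split_by_pattern(names, pattern=PATTERN):
    """Return (matched, missed) lists of names for the given compiled regex."""
    matched = [n for n in names if pattern.search(n)]
    missed = [n for n in names if not pattern.search(n)]
    return matched, missed


matched, missed = split_by_pattern(TENSOR_NAMES)

if __name__ == "__main__":
    print("sent to CPU by the pattern:")
    for name in matched:
        print("  ", name)
    print("left wherever the default placement puts them:")
    for name in missed:
        print("  ", name)
```

Swapping in the tensor names from your own GGUF file is the quickest way to check whether a given -ot pattern covers what you expect before launching a long model load.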

One other tip for multi-GPU is to recompile with -DGGML_SCHED_MAX_COPIES=1.

Look here for more discussions and examples: https://huggingface.co/ubergarm/Qwen3-235B-A22B-GGUF/discussions/1#681642d4a383b2fb9aa3bd8c

Keep us posted on how you get along, as some others have reported success with multi-GPU once they get the arguments just right for their specific systems!

Answer selected by matt23654