[Bug]: SVDQuant cache conflicts with multi-GPU #1198

@dxqb

What happened?

The SVDQuant cache is saved to cache-dir/quantization during model loading.
With multi-GPU, however, the model is loaded at the same time on all GPUs, so several processes can write the same cache files concurrently. This can cause write conflicts.
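
To illustrate the failure mode, here is a minimal sketch of one common way to make a cache write safe when several processes write the same file: write to a private temporary file, then rename it over the target. This is not the project's actual cache code; the function name `atomic_write_bytes` and the file name used below are hypothetical. It only prevents readers from seeing a half-written file and does not stop every GPU process from redoing the same quantization.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path: Path, data: bytes) -> None:
    """Write data to path so that no reader ever sees a partial file.

    Hypothetical helper for illustration, not part of the project.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so that os.replace
    # stays on one filesystem and is a single atomic rename.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # If two processes race, the last rename wins, but both files are complete.
        os.replace(tmp_name, path)
    except BaseException:
        # Do not leave stray temp files behind on failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Usage would look like `atomic_write_bytes(Path("cache-dir/quantization") / "layer0.bin", payload)`, where the file name and `payload` are placeholders.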

What did you expect would happen?

either:
a) Is torch.svd_lowrank accurate enough for its current purpose? It wasn't originally, but now with the layer filter it could be enough. Then no cache is necessary, because svd_lowrank is fast.
b) Guard multi-GPU against quantizing on all GPUs at the same time.
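
Option b) could be sketched roughly as follows, assuming the processes share a filesystem. The first process to create a lock file does the expensive work and publishes the result with an atomic rename; the others wait until the cache file appears. The names `load_or_compute` and `compute` are hypothetical and stand in for whatever the loader actually does. A stale lock left by a hard-killed process is not handled here beyond the timeout.

```python
import os
import time
from pathlib import Path
from typing import Callable


def load_or_compute(
    cache_file: Path,
    compute: Callable[[], bytes],
    poll_s: float = 0.1,
    timeout_s: float = 600.0,
) -> bytes:
    """Return cached bytes, computing them in at most one process at a time.

    Hypothetical helper for illustration, not part of the project.
    """
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    lock_file = cache_file.with_name(cache_file.name + ".lock")
    deadline = time.monotonic() + timeout_s
    while True:
        if cache_file.exists():
            return cache_file.read_bytes()
        try:
            # O_EXCL makes creation fail if another process holds the lock.
            fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"timed out waiting for {lock_file}")
            time.sleep(poll_s)
            continue
        os.close(fd)
        try:
            # Re-check: the previous lock holder may have just finished.
            if cache_file.exists():
                return cache_file.read_bytes()
            data = compute()
            tmp = cache_file.with_name(f"{cache_file.name}.{os.getpid()}.tmp")
            tmp.write_bytes(data)
            os.replace(tmp, cache_file)  # publish atomically
            return data
        finally:
            os.unlink(lock_file)
```

In a torch.distributed setup, a similar effect might be achieved by letting only rank 0 quantize and having the other ranks wait at `torch.distributed.barrier()` before reading the cache; whether that fits the existing loading code is an open question.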

Relevant log output

Generate and upload debug_report.log

No response
