Labels
bug (Something isn't working)
Description
What happened?
The SVD quant cache is saved to cache-dir/quantization during model loading, but with multi-GPU all GPUs load the model at the same time, which can cause write conflicts.
What did you expect would happen?
Either:
a) Is torch.svd_lowrank accurate enough for its current purpose? It wasn't originally, but with the layer filter it might now be. In that case no cache is necessary, because svd_lowrank is fast.
b) Guard multi-GPU loading so that the quantization (and its cache write) does not run on all GPUs at the same time (see the sketch below).
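A minimal sketch of option (b), assuming the multi-GPU processes are coordinated through an already-initialized torch.distributed process group; `compute_svd_cache` and `cache_path` are hypothetical stand-ins for the existing quantization-cache logic, not names from the codebase. Only rank 0 computes and writes the cache (atomically, via a temp file and rename), and the other ranks wait at a barrier before reading it:

```python
import os
import tempfile

import torch
import torch.distributed as dist


def load_svd_cache_safely(cache_path: str, compute_svd_cache) -> dict:
    """Let only rank 0 compute and write the cache; other ranks wait, then read."""
    is_distributed = dist.is_available() and dist.is_initialized()
    rank = dist.get_rank() if is_distributed else 0

    if rank == 0 and not os.path.exists(cache_path):
        state = compute_svd_cache()  # the expensive SVD quantization step
        # Write atomically: temp file in the same directory, then rename,
        # so a partially written cache is never visible to other ranks.
        os.makedirs(os.path.dirname(cache_path), exist_ok=True)
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(cache_path))
        os.close(fd)
        torch.save(state, tmp_path)
        os.replace(tmp_path, cache_path)

    if is_distributed:
        dist.barrier()  # other ranks wait until rank 0 has finished writing

    return torch.load(cache_path)
```

If the GPU workers are independent processes without a process group, a file lock or the same write-to-temp-then-atomic-rename pattern (with the other workers retrying the read) would serve the same purpose.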
Relevant log output
No response