[bnb] Moving a pipeline with an 8-bit quantized model to CPU doesn't throw warning #11352

@sayakpaul

Description
@SunMarc `tests/quantization/bnb/test_mixed_int8.py::SlowBnb8bitTests::test_moving_to_cpu_throws_warning` is failing on diffusers main. It passes on the v0.32.0-release branch.

My diffusers-cli env:

- πŸ€— Diffusers version: 0.32.0
- Platform: Linux-5.4.0-166-generic-x86_64-with-glibc2.31
- Running on Google Colab?: No
- Python version: 3.10.12
- PyTorch version (GPU?): 2.6.0+cu124 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.30.2
- Transformers version: 4.51.3
- Accelerate version: 1.6.0
- PEFT version: 0.15.0
- Bitsandbytes version: 0.45.5
- Safetensors version: 0.4.4
- xFormers version: 0.0.28.post3
- Accelerator: NVIDIA A100-SXM4-80GB, 81920 MiB
NVIDIA A100-SXM4-80GB, 81920 MiB
NVIDIA A100-SXM4-80GB, 81920 MiB
NVIDIA DGX Display, 4096 MiB
NVIDIA A100-SXM4-80GB, 81920 MiB
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

The main problem is that `module_is_sequentially_offloaded(module)` returns `True` for the transformer component, even though no sequential CPU offloading was enabled.
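For context, here is a simplified stand-in for the kind of check `module_is_sequentially_offloaded` performs: it walks the module tree and looks for an accelerate `AlignDevicesHook` with `offload=True` attached via `_hf_hook`. This is a hypothetical sketch with plain Python classes standing in for `torch.nn.Module` and accelerate's hook, not the actual diffusers implementation, so it can be run without a GPU:

```python
# Hypothetical stand-in classes so the logic runs without torch/accelerate.
class AlignDevicesHook:  # stand-in for accelerate.hooks.AlignDevicesHook
    def __init__(self, offload: bool):
        self.offload = offload


class FakeModule:  # stand-in for torch.nn.Module
    def __init__(self, hook=None, children=()):
        self._hf_hook = hook  # accelerate attaches its hooks under this attribute
        self._children = list(children)

    def modules(self):
        # Mimics torch.nn.Module.modules(): yields self, then all descendants.
        yield self
        for child in self._children:
            yield from child.modules()


def module_is_sequentially_offloaded(module) -> bool:
    """Return True if any submodule carries an AlignDevicesHook with offload=True."""
    for m in module.modules():
        hook = getattr(m, "_hf_hook", None)
        if isinstance(hook, AlignDevicesHook) and hook.offload:
            return True
    return False


# A quantized transformer that was never offloaded should report False:
transformer = FakeModule(children=[FakeModule()])
print(module_is_sequentially_offloaded(transformer))  # False

# After sequential CPU offload, hooks with offload=True are attached:
offloaded = FakeModule(hook=AlignDevicesHook(offload=True))
print(module_is_sequentially_offloaded(offloaded))  # True
```

If the real check reports `True` for a module that was never offloaded, `pipe.to("cpu")` takes the offload code path and the expected bnb warning is never emitted, which would explain the test failure.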

Could you take a look?
