Based on the Python script below, we're unable to deploy the model: diffusers ≥ 0.28 no longer accepts the `device_map` strategy `"auto"`, so the GPU isn't detected when the endpoint starts.
```python
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

iam = boto3.client("iam")
role = iam.get_role(RoleName="my-ml-sagemaker-role")["Role"]["Arn"]

# Hub model configuration – see https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "stable-diffusion-v1-5/stable-diffusion-v1-5",
    "HF_TASK": "text-to-image",
}

# Create a Hugging Face model
huggingface_model = HuggingFaceModel(
    transformers_version="4.49.0",
    pytorch_version="2.6.0",
    py_version="py312",
    env=hub,
    role=role,
)

# Deploy the model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type="ml.g4dn.4xlarge",  # EC2 instance type
)

image_bytes = predictor.predict({"inputs": "Astronaut riding a horse"})

# Display the generated image with PIL
import io
from PIL import Image

image = Image.open(io.BytesIO(image_bytes))
```

Invoking the endpoint directly with the AWS CLI:

```shell
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name huggingface-pytorch-inference-2025-05-14-17-08-18-830 \
  --body "fileb://input_file.txt" output_file.txt
```

This returns:
```
An error occurred (ModelError) when calling the InvokeEndpoint operation:
Received client error (400) from primary with message:
{
  "code": 400,
  "type": "InternalServerException",
  "message": "auto not supported. Supported strategies are: balanced"
}
```
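The error suggests diffusers ≥ 0.28 only accepts `"balanced"` as a `device_map` strategy. A minimal sketch of the selection logic the toolkit could apply instead of hard-coding `"auto"` (the function name and exact behavior are my assumptions, not the toolkit's actual code):

```python
def select_device_map(num_gpus: int):
    """Pick a device_map strategy compatible with diffusers >= 0.28.

    Hypothetical helper: return "balanced" for multi-GPU hosts, and no
    device_map at all (None) otherwise, so a single-GPU or CPU host never
    passes the now-unsupported "auto" strategy.
    """
    if num_gpus >= 2:
        return "balanced"  # the only strategy diffusers >= 0.28 still accepts
    return None  # place the whole pipeline on one device instead

# Example: an ml.g4dn.4xlarge has a single GPU
print(select_device_map(1))
print(select_device_map(4))
```

With `None`, the caller would load the pipeline without a `device_map` and move it to `cuda:0` explicitly, which is how a single-GPU instance like `ml.g4dn.4xlarge` should be handled.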
It looks like the SageMaker Hugging Face inference toolkit needs an update? It should use `"balanced"` only if there are more than 2 GPUs.
This issue might be related, so I'm linking it: huggingface/diffusers#11555
Any guidance on how to resolve this SageMaker deployment issue would be greatly appreciated.
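In the meantime, one workaround I'm considering (untested on my side, so treat it as an assumption) is to repackage the model as a `model.tar.gz` with a pinned diffusers version, so the container installs a release that still accepts `"auto"`. The pin would go in a `code/requirements.txt` inside the archive:

```
diffusers<0.28
```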
Thank you!