Unable to deploy on SageMaker: diffusers dropped device_type="auto" #139

@liogate

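Running the Python script below, we’re unable to deploy the model because diffusers ≥ 0.28 no longer accepts "auto" as a device placement strategy (judging by the error message, this is the device_map argument rather than a device type). Consequently, the GPU isn’t detected when the endpoint starts.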

import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

iam = boto3.client("iam")
role = iam.get_role(RoleName="my-ml-sagemaker-role")["Role"]["Arn"]

# Hub model configuration – see https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "stable-diffusion-v1-5/stable-diffusion-v1-5",
    "HF_TASK": "text-to-image",
}

# Create a Hugging Face model
huggingface_model = HuggingFaceModel(
    transformers_version="4.49.0",
    pytorch_version="2.6.0",
    py_version="py312",
    env=hub,
    role=role,
)

# Deploy the model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,          # number of instances
    instance_type="ml.g4dn.4xlarge",   # EC2 instance type
)

image_bytes = predictor.predict({"inputs": "Astronaut riding a horse"})

# Display the generated image with PIL
import io
from PIL import Image
image = Image.open(io.BytesIO(image_bytes))

Invoking the endpoint from the AWS CLI:

aws sagemaker-runtime invoke-endpoint \
  --endpoint-name huggingface-pytorch-inference-2025-05-14-17-08-18-830 \
  --body "fileb://input_file.txt" output_file.txt

This returns:

An error occurred (ModelError) when calling the InvokeEndpoint operation:
Received client error (400) from primary with message:
{
  "code": 400,
  "type": "InternalServerException",
  "message": "auto not supported. Supported strategies are: balanced"
}

It looks like the SageMaker Hugging Face inference toolkit needs an update. It uses "balanced" only when there are more than 2 GPUs.

https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/diffusers_utils.py#L43
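To make the selection logic in question concrete, here is a minimal sketch of how a strategy could be chosen. It is not the toolkit's actual code: `pick_device_map` and its `supported` parameter are hypothetical names, and the default of `("balanced",)` is taken only from the error message above.

```python
from typing import Optional, Sequence


def pick_device_map(
    device: Optional[str],
    supported: Sequence[str] = ("balanced",),  # assumption: taken from the error message
) -> Optional[str]:
    """Return a device_map strategy from `supported`, or None if none applies."""
    if device != "cuda":
        # CPU-only endpoint: no device map is needed.
        return None
    # Prefer "auto" where it is still accepted, otherwise fall back to "balanced".
    for strategy in ("auto", "balanced"):
        if strategy in supported:
            return strategy
    # Nothing usable: the caller can move the pipeline to the GPU explicitly.
    return None
```

With the default `supported` value, a `"cuda"` device maps to `"balanced"` regardless of GPU count, which would avoid passing `"auto"` to newer diffusers versions.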

This issue might be related, so I'm linking it here: huggingface/diffusers#11555
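One possible workaround, untested against the endpoint in this report, is to bypass the toolkit's default loader with a custom `code/inference.py`. The sketch below assumes the toolkit's documented `model_fn` / `predict_fn` override hooks apply to this deployment; the `generated_image` response key is a hypothetical name chosen for illustration.

```python
import base64
import io


def model_fn(model_dir):
    """Load the pipeline without passing device_map at all."""
    # Imported lazily so this module can be imported without torch/diffusers installed.
    import torch
    from diffusers import DiffusionPipeline

    use_gpu = torch.cuda.is_available()
    pipe = DiffusionPipeline.from_pretrained(
        model_dir,
        # Half precision on GPU, full precision on CPU.
        torch_dtype=torch.float16 if use_gpu else torch.float32,
    )
    # Place the whole pipeline on the GPU explicitly instead of using a strategy.
    return pipe.to("cuda") if use_gpu else pipe


def predict_fn(data, pipe):
    """Run one prompt and return the first image as a base64-encoded PNG."""
    prompt = data["inputs"]
    image = pipe(prompt).images[0]
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    # "generated_image" is a hypothetical key, not a toolkit convention.
    return {"generated_image": base64.b64encode(buffer.getvalue()).decode("ascii")}
```

Because no `device_map` is passed, the "auto not supported" check is never reached; the trade-off is that the pipeline sits on a single GPU rather than being spread across several.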

Any guidance on how to resolve this SageMaker deployment issue would be greatly appreciated.
Thank you!
