Conversation

@stevhliu
Member

Splits off the Models section from Load schedulers and models and creates a dedicated Models section covering device placement, torch dtype, the AutoModel API, and saving checkpoints as shards.
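
For context, here is a minimal sketch of the APIs the new section covers, assuming the current `diffusers` interfaces; the checkpoint name and shard size below are illustrative, not taken from the PR.

```python
import torch
from diffusers import AutoModel

# AutoModel resolves the concrete model class from the checkpoint's config
unet = AutoModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example checkpoint; any diffusers model repo works
    subfolder="unet",
    torch_dtype=torch.float16,  # load the weights in half precision
    device_map="cuda",          # device placement on an accelerator
)

# save the checkpoint as shards no larger than the given size
unet.save_pretrained("sdxl-unet-sharded", max_shard_size="5GB")
```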

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@stevhliu stevhliu requested a review from sayakpaul August 28, 2025 22:21
@sayakpaul sayakpaul left a comment
Member

Thanks! Left some comments, LMK if they are unclear.

| `"cuda"` | places model or pipeline on CUDA device |
| `"balanced"` | evenly distributes model or pipeline on all GPUs |
| `"auto"` | distribute model from fastest device first to slowest |
| `"cuda"` | places pipeline on CUDA device |

"cuda" is just an example. If someone wants to do it for any other supported accelerator, I believe they pass it by their name 👀

Suggested change
- | `"cuda"` | places pipeline on CUDA device |
+ | `"cuda"` | places pipeline on CUDA (or supported accelerator) device |


@stevhliu I don't think this was addressed.

@stevhliu stevhliu requested a review from sayakpaul September 4, 2025 21:07
@stevhliu stevhliu merged commit fc337d5 into huggingface:main Sep 5, 2025
1 check passed
@stevhliu stevhliu deleted the models branch September 5, 2025 18:52