Commit 8903a8a

committed
initial PR
1 parent 20c1e8e commit 8903a8a

10 files changed

+220
-150
lines changed

docs/sagemaker/_toctree.yml

Lines changed: 27 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,28 @@
1-
- local: index
2-
title: Hugging Face on Amazon SageMaker
3-
- local: getting-started
4-
title: Get started
5-
- local: train
6-
title: Run training on Amazon SageMaker
7-
- local: inference
8-
title: Deploy models to Amazon SageMaker
9-
- local: reference
1+
- sections:
2+
- local: getting-started/index
3+
title: Hugging Face on AWS
4+
- local: getting-started/deploy
5+
title: Deploy Models on AWS
6+
- local: getting-started/train
7+
title: Train Models on AWS
8+
- local: getting-started/resources
9+
title: Other Resources
10+
title: Getting Started
11+
- sections:
12+
- local: dlcs/introduction
13+
title: Introduction
14+
- local: dlcs/features
15+
title: Features & benefits
16+
- local: dlcs/available
17+
title: Available DLCs on AWS
18+
title: Deep Learning Containers (DLCs)
19+
- sections:
20+
title: Examples
21+
- sections:
22+
title: Advanced Topics
23+
- sections:
24+
title: How-to
25+
- sections:
26+
- local: reference/inference-toolkit
27+
title: Inference Toolkit API
10 28
title: Reference

docs/sagemaker/dlcs/available.md

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
# Available DLCs on AWS
2+
3+
Below you can find a listing of all the Deep Learning Containers (DLCs) available on AWS.
4+
5+
A container is built for each supported combination of use case (training, inference), accelerator type (CPU, GPU, Neuron), and framework (PyTorch, TGI, TEI).
6+
7+
## FAQ
8+
9+
**How to choose the right container for my use case?**
10+
11+
**How to find the URI of my container?**
12+
The URI is built with an AWS account ID and an AWS region. Those two values need to be replaced depending on your use case.
13+
Let's say you want to use the training DLC for GPUs in
14+
- `dlc-aws-account-id`: The AWS account ID of the account that owns the ECR repository. You can find the account IDs [here](https://github.com/aws/sagemaker-python-sdk/blob/e0b9d38e1e3b48647a02af23c4be54980e53dc61/src/sagemaker/image_uri_config/huggingface.json#L21).
15+
- `region`: The AWS region where you want to use it.
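
As a minimal sketch of how these two values fit into the URI, the snippet below assembles a container URI from its parts. The helper name `build_dlc_uri` is hypothetical and not part of any SDK; the account ID, repository, and tag are copied from the GPU training row in the Training table on this page. The account ID may differ between regions, so check the linked configuration before changing `region`.

```python
def build_dlc_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Assemble an ECR image URI (hypothetical helper, not part of any SDK)."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Values copied from the GPU training row of the Training table on this page.
uri = build_dlc_uri(
    account_id="763104351884",
    region="us-east-1",
    repository="huggingface-pytorch-training",
    tag="2.5.1-transformers4.49.0-gpu-py311-cu124-ubuntu22.04",
)
print(uri)
```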
16+
17+
## Training
18+
19+
PyTorch Training DLC: For training, our DLCs are available for PyTorch via 🤗 Transformers. They include support for training on GPUs and AWS AI chips with libraries such as 🤗 TRL, Sentence Transformers, or 🧨 Diffusers.
20+
21+
| Container URI | Accelerator |
22+
| -------------------------------------------------------------------------------------------------------------------------------- | ----------- |
23+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-training:2.5.1-transformers4.49.0-gpu-py311-cu124-ubuntu22.04 | GPU |
24+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-training-neuronx:2.1.2-transformers4.48.1-neuronx-py310-sdk2.20.0-ubuntu20.04 | Neuron |
25+
26+
27+
## Inference
28+
29+
### PyTorch Inference DLC
30+
31+
For inference, there is a general-purpose PyTorch inference DLC for serving models trained with any of the libraries mentioned above on CPU, GPU, and AWS AI chips.
32+
33+
| Container URI | Accelerator |
34+
| -------------------------------------------------------------------------------------------------------------------------------- | ----------- |
35+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.6.0-transformers4.49.0-cpu-py312-ubuntu22.04 | CPU |
36+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.6.0-transformers4.49.0-gpu-py312-cu124-ubuntu22.04 | GPU |
37+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference-neuronx:2.1.2-transformers4.43.2-neuronx-py310-sdk2.20.0-ubuntu20.04 | Neuron |
38+
39+
### Text Generation Inference
40+
41+
There is also the Text Generation Inference (TGI) DLC for high-performance text generation of LLMs on GPU and AWS AI chips.
42+
43+
| Container URI | Accelerator |
44+
| -------------------------------------------------------------------------------------------------------------------------------- | ----------- |
45+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.6.0-tgi3.2.3-gpu-py311-cu124-ubuntu22.04 | GPU |
46+
| 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.2-optimum0.0.28-neuronx-py310-ubuntu22.04 | Neuron |
47+
48+
### Text Embeddings Inference
49+
50+
Finally, there is a Text Embeddings Inference (TEI) DLC for high-performance serving of embedding models on CPU and GPU.
51+
52+
| Container URI | Accelerator |
53+
| -------------------------------------------------------------------------------------------------------------------------------- | ----------- |
54+
| 683313688378.dkr.ecr.us-east-1.amazonaws.com/tei-cpu:2.0.1-tei1.2.3-cpu-py310-ubuntu22.04 | CPU |
55+
| 683313688378.dkr.ecr.us-east-1.amazonaws.com/tei:2.0.1-tei1.4.0-gpu-py310-cu122-ubuntu22.04 | GPU |
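
When switching between containers, it can help to split a URI back into its parts, for example to swap the region or compare tags. The following is a rough sketch using a regular expression; `parse_dlc_uri` is a hypothetical helper, and the pattern assumes the `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>` layout used in the tables above.

```python
import re

# Assumed layout: <12-digit account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
_URI_PATTERN = re.compile(
    r"^(?P<account_id>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_dlc_uri(uri: str) -> dict:
    """Split a container URI into account_id, region, repository, and tag."""
    match = _URI_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"not a recognised ECR image URI: {uri}")
    return match.groupdict()


# URI copied from the GPU row of the TEI table above.
parts = parse_dlc_uri(
    "683313688378.dkr.ecr.us-east-1.amazonaws.com/tei:2.0.1-tei1.4.0-gpu-py310-cu122-ubuntu22.04"
)
print(parts["repository"], parts["tag"])
```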

docs/sagemaker/dlcs/features.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
# Features & benefits
2+
3+
The Hugging Face DLCs provide ready-to-use, tested environments to train and deploy Hugging Face models.
4+
5+
## One command is all you need
6+
7+
With the Hugging Face DLCs, you can train and deploy cutting-edge Transformers-based NLP models in a single line of code. The Hugging Face PyTorch DLCs for training come with all the libraries installed, so a single command (for example, via the TRL CLI) is enough to fine-tune LLMs in any setting: single-GPU, single-node multi-GPU, and more.
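
To make the "single command" idea concrete, here is a small sketch that assembles a TRL CLI invocation in Python. The helper `trl_sft_command` is hypothetical, and the model ID, dataset name, and output directory are illustrative placeholders rather than values from this documentation.

```python
import shlex


def trl_sft_command(model: str, dataset: str, output_dir: str) -> list[str]:
    """Build the argument list for a supervised fine-tuning run with the TRL CLI."""
    return [
        "trl", "sft",
        "--model_name_or_path", model,
        "--dataset_name", dataset,
        "--output_dir", output_dir,
    ]


# Illustrative values; replace them with your own model, dataset, and path.
cmd = trl_sft_command("Qwen/Qwen2.5-0.5B", "trl-lib/Capybara", "/opt/ml/model")
print(shlex.join(cmd))
```

Inside a training container, the same list could be passed to `subprocess.run(cmd, check=True)` to start the run.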
8+
9+
## Accelerate machine learning from science to production
10+
11+
In addition to the Hugging Face DLCs, we created a first-class Hugging Face library for inference, huggingface-inference-toolkit, which ships with the Hugging Face PyTorch DLCs for inference and fully supports serving any PyTorch model on AWS.
12+
13+
Deploy your trained models for inference with just one more line of code, or select any of the ever-growing number of publicly available models from the Hugging Face Hub.
14+
15+
## High-performance text generation and embedding
16+
17+
Besides the PyTorch-oriented DLCs, Hugging Face also provides high-performance inference for text generation and embedding models via the Hugging Face DLCs for Text Generation Inference (TGI) and Text Embeddings Inference (TEI), respectively.
18+
19+
The Hugging Face DLC for TGI enables you to deploy any of the more than 225,000 text generation models on the Hugging Face Hub that TGI supports, or any custom model as long as its architecture is supported by TGI.
20+
21+
The Hugging Face DLC for TEI enables you to deploy any of the more than 12,000 embedding, re-ranking, or sequence classification models on the Hugging Face Hub that TEI supports, or any custom model as long as its architecture is supported by TEI.
22+
23+
Additionally, these DLCs come with full support for AWS, meaning that deploying models from Amazon Simple Storage Service (S3) is also straightforward and requires no configuration.
24+
25+
## Built-in performance
26+
27+
Hugging Face DLCs feature built-in performance optimizations for PyTorch to train models faster. The DLCs also give you the flexibility to choose a training infrastructure that best aligns with the price/performance ratio for your workload.
28+
29+
Hugging Face Inference DLCs provide you with production-ready endpoints that scale quickly with your AWS environment, built-in monitoring, and many enterprise features.
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
# Introduction
2+
3+
Hugging Face built Deep Learning Containers (DLCs) for Amazon Web Services customers to run any of their machine learning workloads in an optimized environment, with no configuration or maintenance on their part. These are Docker images pre-installed with deep learning frameworks and libraries such as 🤗 Transformers, 🤗 Datasets, and 🤗 Tokenizers. The DLCs allow you to directly serve and train any model, skipping the complicated process of building and optimizing your serving and training environments from scratch.
4+
5+
The containers are publicly maintained, updated, and released periodically by Hugging Face and the AWS team, and are available to all AWS customers within AWS's Elastic Container Registry (ECR). They can be used from any AWS service such as:
6+
* Amazon SageMaker AI: Amazon SageMaker AI is a fully managed machine learning (ML) platform for data scientists and developers to quickly and confidently build, train, and deploy ML models into a production-ready hosted environment.
7+
* Amazon Bedrock: Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI companies and Amazon available for your use through a unified API to build generative AI applications.
8+
* Amazon Elastic Kubernetes Service (EKS): Amazon EKS is the premier platform for running Kubernetes clusters in the AWS cloud.
9+
* Amazon Elastic Container Service (ECS): Amazon ECS is a fully managed container orchestration service that helps you easily deploy, manage, and scale containerized applications.
10+
* Amazon Elastic Compute Cloud (EC2): Amazon EC2 provides on-demand, scalable computing capacity in the Amazon Web Services (AWS) Cloud.
Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
# Deploy models on AWS
2+
3+
Deploying Hugging Face models on AWS is streamlined through various services, each suited for different deployment scenarios. Here's how you can deploy your models using AWS and Hugging Face offerings.
4+
5+
## With SageMaker SDK
6+
7+
Amazon SageMaker is a fully managed AWS service for building, training, and deploying machine learning models at scale. The SageMaker SDK simplifies interacting with SageMaker programmatically and provides an integration designed specifically for Hugging Face models, which simplifies deploying to managed endpoints. With this integration, you can quickly deploy pre-trained Hugging Face models or your own fine-tuned models directly to SageMaker-managed endpoints, significantly reducing setup complexity and time to production.
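
As a rough sketch of what such a deployment can look like, the snippet below uses `HuggingFaceModel` from the SageMaker Python SDK with a container URI such as the ones listed under "Available DLCs on AWS". The helper names, the model ID, the task, and the instance type are illustrative assumptions; `HF_MODEL_ID` and `HF_TASK` are the environment variables the inference toolkit reads to load a model from the Hub.

```python
def hub_model_env(model_id: str, task: str) -> dict:
    """Environment variables telling the inference toolkit which Hub model to load."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_from_hub(role_arn: str, image_uri: str, model_id: str, task: str):
    """Create a SageMaker endpoint for a Hub model (hypothetical helper)."""
    # Imported lazily so hub_model_env can be used without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(
        role=role_arn,        # IAM role with SageMaker permissions (yours)
        image_uri=image_uri,  # e.g. a PyTorch inference DLC URI
        env=hub_model_env(model_id, task),
    )
    # Creates a real, billable endpoint; the instance type is an illustrative choice.
    return model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")


# Illustrative model and task, not taken from this documentation.
env = hub_model_env(
    "distilbert-base-uncased-finetuned-sst-2-english", "text-classification"
)
print(env)
```

The returned predictor can then be queried with `predictor.predict({"inputs": "..."})` and removed with `predictor.delete_endpoint()` once it is no longer needed.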
8+
9+
To get started, check out this tutorial.
10+
11+
## With SageMaker JumpStart
12+
13+
Amazon SageMaker JumpStart is a curated model catalog from which you can deploy a model with just a few clicks. We maintain a Hugging Face section in the catalog that lets you self-host the most popular open models in your VPC with performant default configurations, powered under the hood by Hugging Face Deep Learning Containers (DLCs). (#todo link to DLC intro)
14+
15+
To get started, check out this tutorial.
16+
17+
## With Amazon Bedrock
18+
19+
Amazon Bedrock enables developers to easily build and scale generative AI applications through a single API. With Bedrock Marketplace, you can now combine the ease of use of SageMaker JumpStart with the fully managed infrastructure of Amazon Bedrock, including compatibility with high-level APIs such as Agents, Knowledge Bases, Guardrails and Model Evaluations.
20+
21+
To get started, check out this [blog post](https://huggingface.co/blog/bedrock-marketplace?).
22+
23+
## With Hugging Face Inference Endpoints
24+
25+
Hugging Face Inference Endpoints allow you to deploy models hosted directly by Hugging Face, fully managed and optimized for performance. They are ideal for quick deployment and scalable inference workloads.
26+
27+
[Get started with Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/main/en/index).
28+
29+
## With ECS, EKS, and EC2
30+
31+
Hugging Face provides Inference Deep Learning Containers (DLCs) to AWS users: optimized environments preconfigured with Hugging Face libraries for inference and natively integrated in the SageMaker SDK and JumpStart. However, the HF DLCs can also be used across other AWS services like ECS, EKS, and EC2.
32+
33+
Amazon Elastic Container Service (ECS), Amazon Elastic Kubernetes Service (EKS), and Amazon Elastic Compute Cloud (EC2) allow you to leverage DLCs directly.
34+
35+
Get started with HF DLCs on EC2.
36+
Get started with HF DLCs on ECS.
37+
Get started with HF DLCs on EKS.

docs/sagemaker/getting-started/index.md

Whitespace-only changes.
Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
# Resources
2+
3+
Take a look at our published blog posts, videos, documentation, sample notebooks and scripts for additional help and more context about Hugging Face on AWS.
4+
5+
## Blogs and videos
6+
7+
- [AWS: Embracing natural language processing with Hugging Face](https://aws.amazon.com/de/blogs/opensource/embracing-natural-language-processing-with-hugging-face/)
8+
- [Deploy Hugging Face models easily with Amazon SageMaker](https://huggingface.co/blog/deploy-hugging-face-models-easily-with-amazon-sagemaker)
9+
- [AWS and Hugging Face collaborate to simplify and accelerate adoption of natural language processing models](https://aws.amazon.com/blogs/machine-learning/aws-and-hugging-face-collaborate-to-simplify-and-accelerate-adoption-of-natural-language-processing-models/)
10+
- [Walkthrough: End-to-End Text Classification](https://youtu.be/ok3hetb42gU)
11+
- [Working with Hugging Face models on Amazon SageMaker](https://youtu.be/leyrCgLAGjMn)
12+
- [Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker](https://huggingface.co/blog/sagemaker-distributed-training-seq2seq)
13+
- [Deploy a Hugging Face Transformers Model from S3 to Amazon SageMaker](https://youtu.be/pfBGgSGnYLs)
14+
- [Deploy a Hugging Face Transformers Model from the Model Hub to Amazon SageMaker](https://youtu.be/l9QZuazbzWM)
15+
16+
## Documentation
17+
18+
- [Run training on Amazon SageMaker](/docs/sagemaker/train)
19+
- [Deploy models to Amazon SageMaker](/docs/sagemaker/inference)
20+
- [Reference](/docs/sagemaker/reference)
21+
- [Amazon SageMaker documentation for Hugging Face](https://docs.aws.amazon.com/sagemaker/latest/dg/hugging-face.html)
22+
- [Python SDK SageMaker documentation for Hugging Face](https://sagemaker.readthedocs.io/en/stable/frameworks/huggingface/index.html)
23+
- [Deep Learning Container](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-training-containers)
24+
- [SageMaker's Distributed Data Parallel Library](https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel.html)
25+
- [SageMaker's Distributed Model Parallel Library](https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel.html)
26+
27+
## Sample workshops
28+
29+
## Sample notebooks
30+
31+
- [All notebooks](https://github.com/huggingface/notebooks/tree/master/sagemaker)
32+
- [Getting Started with PyTorch](https://github.com/huggingface/notebooks/blob/main/sagemaker/01_getting_started_pytorch/sagemaker-notebook.ipynb)
33+
- [Getting Started with TensorFlow](https://github.com/huggingface/notebooks/blob/main/sagemaker/02_getting_started_tensorflow/sagemaker-notebook.ipynb)
34+
- [Distributed Training Data Parallelism](https://github.com/huggingface/notebooks/blob/main/sagemaker/03_distributed_training_data_parallelism/sagemaker-notebook.ipynb)
35+
- [Distributed Training Model Parallelism](https://github.com/huggingface/notebooks/blob/main/sagemaker/04_distributed_training_model_parallelism/sagemaker-notebook.ipynb)
36+
- [Spot Instances and continue training](https://github.com/huggingface/notebooks/blob/main/sagemaker/05_spot_instances/sagemaker-notebook.ipynb)
37+
- [SageMaker Metrics](https://github.com/huggingface/notebooks/blob/main/sagemaker/06_sagemaker_metrics/sagemaker-notebook.ipynb)
38+
- [Distributed Training Data Parallelism TensorFlow](https://github.com/huggingface/notebooks/blob/main/sagemaker/07_tensorflow_distributed_training_data_parallelism/sagemaker-notebook.ipynb)
39+
- [Distributed Training Summarization](https://github.com/huggingface/notebooks/blob/main/sagemaker/08_distributed_summarization_bart_t5/sagemaker-notebook.ipynb)
40+
- [Image Classification with Vision Transformer](https://github.com/huggingface/notebooks/blob/main/sagemaker/09_image_classification_vision_transformer/sagemaker-notebook.ipynb)
41+
- [Deploy one of the 10 000+ Hugging Face Transformers to Amazon SageMaker for Inference](https://github.com/huggingface/notebooks/blob/main/sagemaker/11_deploy_model_from_hf_hub/deploy_transformer_model_from_hf_hub.ipynb)
42+
- [Deploy a Hugging Face Transformer model from S3 to SageMaker for inference](https://github.com/huggingface/notebooks/blob/main/sagemaker/10_deploy_model_from_s3/deploy_transformer_model_from_s3.ipynb)
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Train models on AWS
2+
3+
Training Hugging Face models on AWS is streamlined through various services. Here's how you can fine-tune your models using AWS and Hugging Face offerings.
4+
5+
## With SageMaker SDK
6+
7+
Amazon SageMaker is a fully managed AWS service for building, training, and deploying machine learning models at scale. The SageMaker SDK simplifies interacting with SageMaker programmatically and provides an integration designed specifically for Hugging Face models, which simplifies managing training jobs. With this integration, you can quickly create your own fine-tuned models, significantly reducing setup complexity and time to production.
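
The sketch below shows roughly how a training job can be launched with the `HuggingFace` estimator from the SageMaker Python SDK. It assumes you already have a training script; `train.py`, `./scripts`, the hyperparameter names, the instance type, and both helper functions are hypothetical placeholders, and `image_uri` is expected to be a training DLC URI such as those listed under "Available DLCs on AWS".

```python
def training_hyperparameters(model_id: str, epochs: int, batch_size: int) -> dict:
    """Hyperparameters forwarded as CLI arguments to the (hypothetical) train.py."""
    return {
        "model_name_or_path": model_id,
        "epochs": epochs,
        "per_device_train_batch_size": batch_size,
    }


def launch_training(role_arn: str, image_uri: str, train_s3_uri: str, hyperparameters: dict):
    """Start a SageMaker training job on a Hugging Face training DLC (hypothetical helper)."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.huggingface import HuggingFace

    estimator = HuggingFace(
        entry_point="train.py",         # your training script (assumed to exist)
        source_dir="./scripts",         # directory containing the script
        role=role_arn,                  # IAM role with SageMaker permissions
        image_uri=image_uri,            # training DLC URI
        py_version="py311",             # should match the Python version in the image tag
        instance_type="ml.g5.2xlarge",  # illustrative choice
        instance_count=1,
        hyperparameters=hyperparameters,
    )
    # Starts a real, billable training job reading data from the given S3 prefix.
    estimator.fit({"train": train_s3_uri})
    return estimator


# Illustrative values, not taken from this documentation.
hp = training_hyperparameters("distilbert-base-uncased", epochs=3, batch_size=32)
print(hp)
```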
8+
9+
To get started, check out this example.
10+
11+
## With ECS, EKS, and EC2
12+
13+
Hugging Face provides Training Deep Learning Containers (DLCs) to AWS users: optimized environments preconfigured with Hugging Face libraries for training and natively integrated in the SageMaker SDK. However, the HF DLCs can also be used across other AWS services like ECS, EKS, and EC2.
14+
15+
Amazon Elastic Container Service (ECS), Amazon Elastic Kubernetes Service (EKS), and Amazon Elastic Compute Cloud (EC2) allow you to leverage DLCs directly.
16+
17+
Get started with HF DLCs on EC2
18+
Get started with HF DLCs on ECS
19+
Get started with HF DLCs on EKS

docs/sagemaker/how-to/get-started-sagemaker-sdk.md

Whitespace-only changes.
