**`articles/ai-foundry-local/concepts/foundry-local-architecture.md`** (1 addition, 1 deletion)

```diff
@@ -80,7 +80,7 @@ The model cache stores downloaded AI models locally on your device, which ensure
 Before models can be used with Foundry Local, they must be compiled and optimized in the [ONNX](https://onnx.ai) format. Microsoft provides a selection of published models in the Azure AI Foundry Model Catalog that are already optimized for Foundry Local. However, you aren't limited to those models - by using [Olive](https://microsoft.github.io/Olive/). Olive is a powerful framework for preparing AI models for efficient inference. It converts models into the ONNX format, optimizes their graph structure, and applies techniques like quantization to improve performance on local hardware.
 
 > [!TIP]
-> To learn more about compiling models for Foundry Local, read [How to compile Hugging Face models to run on Foundry Local](../how-to/how-to-compile-huggingface-models.md).
+> To learn more about compiling models for Foundry Local, read [How to compile Hugging Face models to run on Foundry Local](../how-to/how-to-compile-hugging-face-models.md).
```
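The paragraph in the hunk above describes Olive's convert/optimize/quantize flow for bringing your own model to Foundry Local. A minimal sketch of what that invocation can look like, assuming the `olive-ai[auto-opt]` package from the how-to article below is installed; the flag values here (other than `--model_name_or_path`, which the article itself uses) are assumptions to verify against `olive auto-opt --help` for your Olive version:

```shell
# Sketch only: downloads the model, converts it to ONNX, and quantizes it.
# Device/provider/precision values are assumptions -- adjust for your hardware.
olive auto-opt \
  --model_name_or_path meta-llama/Llama-3.2-1B-Instruct \
  --output_path models/llama-3.2-1B-Instruct \
  --device cpu \
  --provider CPUExecutionProvider \
  --precision int4
```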
**`articles/ai-foundry-local/how-to/how-to-compile-hugging-face-models.md`** (11 additions, 11 deletions)
```diff
@@ -1,7 +1,7 @@
 ---
-title: How to compile HuggingFace models to run on Foundry Local
+title: How to compile Hugging Face models to run on Foundry Local
 titleSuffix: Foundry Local
-description: Learn how to compile and run HuggingFace models with Foundry Local.
+description: Learn how to compile and run Hugging Face models with Foundry Local.
 manager: scottpolly
 ms.service: azure-ai-foundry
 ms.custom: build-2025
```
```diff
@@ -11,7 +11,7 @@ ms.author: samkemp
 author: samuel100
 ---
 
-# How to compile HuggingFace models to run on Foundry Local
+# How to compile Hugging Face models to run on Foundry Local
 
 Foundry Local runs ONNX models on your device with high performance. While the model catalog offers _out-of-the-box_ precompiled options, you can use any model in the ONNX format.
 
```
```diff
@@ -21,7 +21,7 @@ This guide shows you how to:
 
 > [!div class="checklist"]
 >
-> - **Convert and optimize** models from HuggingFace to run in Foundry Local. You'll use the `Llama-3.2-1B-Instruct` model as an example, but you can use any generative AI model from HuggingFace.
+> - **Convert and optimize** models from Hugging Face to run in Foundry Local. You'll use the `Llama-3.2-1B-Instruct` model as an example, but you can use any generative AI model from Hugging Face.
 > - **Run** your optimized models with Foundry Local
 
 ## Prerequisites
```
```diff
@@ -49,9 +49,9 @@ pip install olive-ai[auto-opt]
 > [!TIP]
 > For best results, install Olive in a virtual environment using [venv](https://docs.python.org/3/library/venv.html) or [conda](https://www.anaconda.com/docs/getting-started/miniconda/main).
 
-## Sign in to HuggingFace
+## Sign in to Hugging Face
 
-You optimize the `Llama-3.2-1B-Instruct` model, which requires HuggingFace authentication:
+You optimize the `Llama-3.2-1B-Instruct` model, which requires Hugging Face authentication:
 
 ### [Bash](#tab/Bash)
```
```diff
@@ -68,7 +68,7 @@ huggingface-cli login
 ---
 
 > [!NOTE]
-> You must first [create a HuggingFace token](https://huggingface.co/docs/hub/security-tokens) and [request model access](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) before proceeding.
+> You must first [create a Hugging Face token](https://huggingface.co/docs/hub/security-tokens) and [request model access](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) before proceeding.
 
 ## Compile the model
 
```
```diff
@@ -113,15 +113,15 @@ The command uses the following parameters:
-> If you have a local copy of the model, you can use a local path instead of the HuggingFace ID. For example, `--model_name_or_path models/llama-3.2-1B-Instruct`. Olive handles the conversion, optimization, and quantization automatically.
+> If you have a local copy of the model, you can use a local path instead of the Hugging Face ID. For example, `--model_name_or_path models/llama-3.2-1B-Instruct`. Olive handles the conversion, optimization, and quantization automatically.
 
 ### Step 2: Rename the output model
 
```
````diff
@@ -159,10 +159,10 @@ Foundry Local requires a chat template JSON file called `inference_model.json` i
 }
 ```
 
-To create the chat template file, you can use the `apply_chat_template` method from the HuggingFace library:
+To create the chat template file, you can use the `apply_chat_template` method from the Hugging Face library:
 
 > [!NOTE]
-> The following example uses the Python HuggingFace library to create a chat template. The HuggingFace library is a dependency for Olive, so if you're using the same Python virtual environment you don't need to install. If you're using a different environment, install the library with `pip install transformers`.
+> The following example uses the Python Hugging Face library to create a chat template. The Hugging Face library is a dependency for Olive, so if you're using the same Python virtual environment you don't need to install. If you're using a different environment, install the library with `pip install transformers`.
````
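The hunk above references a chat template file, `inference_model.json`, whose full schema isn't shown in this diff. As an illustrative sketch of writing that file by hand rather than via `apply_chat_template`, here is a stdlib-only approach; the key names and Llama-3-style special tokens below are hypothetical placeholders, so copy the real schema from the full article before using it:

```shell
# Write a chat template file by hand. The JSON keys ("Name", "PromptTemplate")
# and the token markers are hypothetical placeholders, not the verified schema.
cat > inference_model.json <<'EOF'
{
  "Name": "llama-3.2-1b-instruct",
  "PromptTemplate": {
    "assistant": "{Content}",
    "prompt": "<|start_header_id|>user<|end_header_id|>{Content}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"
  }
}
EOF
```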