---
title: How to compile Hugging Face models to run on Foundry Local
titleSuffix: Foundry Local
description: Learn how to compile and run Hugging Face models with Foundry Local.
manager: scottpolly
ms.service: azure-ai-foundry
ms.custom: build-2025
ms.author: samkemp
author: samuel100
---

# How to compile Hugging Face models to run on Foundry Local

Foundry Local runs ONNX models on your device with high performance. While the model catalog offers *out-of-the-box* precompiled options, you can use any model in the ONNX format.

To compile existing models in Safetensor or PyTorch format into the ONNX format, you can use [Olive](https://microsoft.github.io/Olive). Olive is a tool that optimizes models to the ONNX format, making them suitable for deployment in Foundry Local. It uses techniques like *quantization* and *graph optimization* to improve performance.
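As a rough, back-of-the-envelope illustration of why quantization matters on-device (the numbers below are illustrative, not from this article): the weights of a 1B-parameter model shrink from about 2 GB at 16-bit precision to about 0.5 GB at 4-bit.

```python
# Back-of-envelope weight memory for a 1B-parameter model at different
# precisions. Ignores activations, KV cache, and runtime overhead.
PARAMS = 1_000_000_000


def weight_gb(bits_per_param: int, params: int = PARAMS) -> float:
    """Model weight size in gigabytes (decimal GB)."""
    return params * bits_per_param / 8 / 1e9


for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_gb(bits):.1f} GB")  # 2.0, 1.0, 0.5 GB
```

This is only the weight footprint; actual memory use at inference time is higher, but the relative savings from lower precision hold.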

This guide shows you how to:

> [!div class="checklist"]
>
> - **Convert and optimize** models from Hugging Face to run in Foundry Local. You'll use the `Llama-3.2-1B-Instruct` model as an example, but you can use any generative AI model from Hugging Face.
> - **Run** your optimized models with Foundry Local.

## Prerequisites

Install Olive with the `auto-opt` package extra: `pip install olive-ai[auto-opt]`

> [!TIP]
> For best results, install Olive in a virtual environment using [venv](https://docs.python.org/3/library/venv.html) or [conda](https://www.anaconda.com/docs/getting-started/miniconda/main).

## Sign in to Hugging Face

You optimize the `Llama-3.2-1B-Instruct` model, which requires Hugging Face authentication:

### [Bash](#tab/Bash)

Sign in by running `huggingface-cli login`.

---

> [!NOTE]
> You must first [create a Hugging Face token](https://huggingface.co/docs/hub/security-tokens) and [request model access](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) before proceeding.

## Compile the model

### Step 1: Run the Olive auto-opt command

Use the Olive `auto-opt` command to download, convert, quantize, and optimize the model:
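The exact invocation from the original article is elided in this excerpt; a representative `auto-opt` command might look like the sketch below. Only `--model_name_or_path` is confirmed by this article (see the local-path tip that follows); the remaining flags are assumptions based on the Olive CLI and may differ from the article's own example.

```shell
# Sketch only: flag set is an assumption based on the Olive auto-opt CLI.
# Downloads the model from Hugging Face, converts it to ONNX, and quantizes it.
olive auto-opt \
    --model_name_or_path meta-llama/Llama-3.2-1B-Instruct \
    --output_path models/llama \
    --device cpu \
    --provider CPUExecutionProvider \
    --precision int4
```

This step downloads the model weights, so it requires the Hugging Face sign-in from the previous section and can take several minutes.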
> If you have a local copy of the model, you can use a local path instead of the Hugging Face ID. For example, `--model_name_or_path models/llama-3.2-1B-Instruct`. Olive handles the conversion, optimization, and quantization automatically.

### Step 2: Rename the output model
Foundry Local requires a chat template JSON file called `inference_model.json` in the model directory.

To create the chat template file, you can use the `apply_chat_template` method from the Hugging Face library:

> [!NOTE]
> The following example uses the Python Hugging Face library to create a chat template. The Hugging Face library is a dependency for Olive, so if you're using the same Python virtual environment, you don't need to install it. If you're using a different environment, install the library with `pip install transformers`.
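The article's own `transformers` example is elided from this excerpt. The sketch below shows the general shape of writing an `inference_model.json` file; the `Name` and `PromptTemplate` field names and the template strings are assumptions for illustration, not the real Llama 3.2 chat template. With `transformers` installed, you would derive the prompt string from `tokenizer.apply_chat_template(...)` instead of hardcoding it.

```python
import json

# Assumed inference_model.json shape: "Name" and "PromptTemplate" are
# illustrative field names, and the template strings are stand-ins.
# In practice, build the "prompt" value with tokenizer.apply_chat_template.
chat_template = {
    "Name": "llama-3.2-1b-instruct",
    "PromptTemplate": {
        "system": "<|system|>{Content}<|end|>",
        "user": "<|user|>{Content}<|end|>",
        "assistant": "{Content}",
        "prompt": "<|user|>{Content}<|end|><|assistant|>",
    },
}

# Write the file into the (renamed) model directory from Step 2.
with open("inference_model.json", "w", encoding="utf-8") as f:
    json.dump(chat_template, f, indent=2)
```

Foundry Local reads this file at load time to format chat messages into the prompt the model expects, so the template must match the model's own chat format.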