---
title: Foundry Local architecture
titleSuffix: Foundry Local
description: Learn about the architecture and components of Foundry Local
manager: scottpolly
ms.author: samkemp
author: samuel100
---
# Foundry Local architecture
Foundry Local enables efficient, secure, and scalable AI model inference directly on your devices. This article explains the core components of Foundry Local and how they work together to deliver AI capabilities.

Key benefits of Foundry Local include:

- **Offline Operation**: Work without an internet connection in remote or disconnected environments.
- **Seamless Integration**: Easily incorporate into existing development workflows for smooth adoption.
## Key components
The Foundry Local architecture consists of these main components:

:::image type="content" source="../media/architecture/foundry-local-arch.png" alt-text="Diagram of Foundry Local Architecture":::
### Foundry Local service
The Foundry Local Service is an OpenAI-compatible REST server that provides a standard interface for working with the inference engine and managing models. Developers use this API to send requests, run models, and get results programmatically.
With this API, you can:
- Connect Foundry Local to your custom applications
- Execute models through HTTP requests
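
For a concrete picture, here's a minimal sketch of calling the service's chat completions route over plain HTTP. The port and model alias are assumptions for this example; substitute the endpoint and model name your local installation reports.

```python
# Minimal sketch: call the OpenAI-compatible chat completions route over HTTP.
# The port (5273) and model alias ("phi-3.5-mini") are assumptions; replace
# them with the endpoint and model name your Foundry Local service reports.
import requests

response = requests.post(
    "http://localhost:5273/v1/chat/completions",
    json={
        "model": "phi-3.5-mini",
        "messages": [{"role": "user", "content": "What is ONNX?"}],
    },
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```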
### ONNX runtime
The ONNX Runtime is a core component that executes AI models. It runs optimized ONNX models efficiently on local hardware like CPUs, GPUs, or NPUs.
Key features include:

- Delivers best-in-class performance
- Supports quantized models for faster inference
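
To make the runtime's role concrete, the sketch below calls ONNX Runtime directly, which is the same engine Foundry Local drives on your behalf. The model path and input shape are placeholders for a real ONNX model and its expected input.

```python
# Illustrative sketch: load an optimized ONNX model and run one inference.
# "model.onnx" and the (1, 3, 224, 224) input shape are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: np.zeros((1, 3, 224, 224), dtype=np.float32)})
print(outputs[0].shape)
```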
### Model management
Foundry Local provides robust tools for managing AI models, ensuring that they're readily available for inference and easy to maintain. Model management is handled through the **Model Cache** and the **Command-Line Interface (CLI)**.
#### Model cache
The model cache stores downloaded AI models locally on your device, which ensures models are ready for inference without needing to download them repeatedly. You can manage the cache using either the Foundry CLI or REST API.
Cache management commands include:

- `foundry cache remove <model-name>`: Removes a specific model from the cache
- `foundry cache cd <path>`: Changes the storage location for cached models
#### Model lifecycle
1. **Download**: Get models from the Azure AI Foundry model catalog and save them to your local disk.
2. **Load**: Load models into the Foundry Local service memory for inference. Set a TTL (time-to-live) to control how long the model stays in memory (default: 10 minutes).
3. **Run**: Execute model inference for your requests.
4. **Unload**: Remove models from memory to free up resources when no longer needed.
5. **Delete**: Remove models from your local cache to reclaim disk space.
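
The same lifecycle can be scripted. The sketch below assumes the `foundry-local-sdk` Python package and manager methods that mirror the steps above; the method names and the TTL parameter are assumptions, so check the SDK reference for the exact surface before relying on them.

```python
# Hedged sketch of the lifecycle steps above. The manager methods and ttl
# parameter are assumptions; verify them against the SDK reference.
from foundry_local import FoundryLocalManager

alias = "phi-3.5-mini"              # assumed model alias
manager = FoundryLocalManager()     # attach to (or start) the local service

manager.download_model(alias)       # 1. Download into the local cache
manager.load_model(alias, ttl=600)  # 2. Load into memory (10-minute TTL)
# 3. Run: send requests to the OpenAI-compatible endpoint while loaded
manager.unload_model(alias)         # 4. Unload to free memory
```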
#### Model compilation using Olive
Before models can be used with Foundry Local, they must be compiled and optimized in the [ONNX](https://onnx.ai) format. Microsoft provides a selection of published models in the Azure AI Foundry Model Catalog that are already optimized for Foundry Local. However, you aren't limited to those models: you can compile your own by using [Olive](https://microsoft.github.io/Olive/), a powerful framework for preparing AI models for efficient inference. Olive converts models into the ONNX format, optimizes their graph structure, and applies techniques like quantization to improve performance on local hardware.
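
As a rough sketch, an Olive workflow is described by a configuration you author and can be launched from Python. The config path below is a placeholder; a real recipe names the source model, the optimization passes, and the target hardware, as described in the Olive documentation.

```python
# Schematic only: launch an Olive optimization workflow from Python.
# "olive-config.json" is a placeholder for a workflow config you author.
from olive.workflows import run as olive_run

olive_run("olive-config.json")
```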
> [!TIP]
> To learn more about compiling models for Foundry Local, read [How to compile Hugging Face models to run on Foundry Local](../how-to/how-to-compile-huggingface-models.md).
### Hardware abstraction layer
The hardware abstraction layer ensures that Foundry Local can run on various devices by abstracting the underlying hardware. To optimize performance based on the available hardware, Foundry Local supports:
- **multiple _execution providers_**, such as NVIDIA CUDA, AMD, Qualcomm, and Intel.
- **multiple _device types_**, such as CPU, GPU, and NPU.
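
Execution providers are the same mechanism ONNX Runtime exposes directly: a session takes an ordered preference list and falls back through it when a provider isn't available. A small sketch, assuming a CUDA-capable ONNX Runtime build and a local `model.onnx`:

```python
# Sketch of provider fallback: prefer CUDA, fall back to CPU when CUDA
# isn't available in the installed onnxruntime build.
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # the providers actually in effect, in order
```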
### Developer experiences
The Foundry Local architecture is designed to provide a seamless developer experience, enabling easy integration and interaction with AI models.
Developers can choose from various interfaces to interact with the system, including the command-line interface (CLI) and inferencing SDKs.
#### Command-Line Interface (CLI)

The Foundry CLI is a powerful tool for managing models, the inference engine, and the local cache.
> [!TIP]
> To learn more about the CLI commands, read [Foundry Local CLI Reference](../reference/reference-cli.md).
#### Inferencing SDK integration
Foundry Local supports integration with various SDKs, such as the OpenAI SDK, enabling developers to use familiar programming interfaces to interact with the local inference engine.

When Foundry Local is running, it exposes an OpenAI-compatible REST API endpoint that makes it easy to integrate with various inferencing SDKs and programming languages, so you can connect your applications to locally running AI models using popular SDKs.
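
Because the endpoint is OpenAI-compatible, the official OpenAI Python client can target it by overriding its base URL. In this sketch the port, placeholder API key, and model alias are all assumptions to swap for the values your local service reports.

```python
# Sketch: point the OpenAI Python SDK at the local endpoint. The base_url
# port and model alias are assumptions; a local service typically ignores
# the API key, but the client requires a non-empty string.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5273/v1",
    api_key="not-needed-locally",
)

completion = client.chat.completions.create(
    model="phi-3.5-mini",
    messages=[{"role": "user", "content": "Summarize what Foundry Local does."}],
)
print(completion.choices[0].message.content)
```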
## Next steps
- [How to compile Hugging Face models to run on Foundry Local](how-to-compile-huggingface-models.md)
- [Explore the Foundry Local CLI reference](../reference/reference-cli.md)