100 changes: 48 additions & 52 deletions plugin/skills/microsoft-foundry/SKILL.md
@@ -1,22 +1,18 @@
---
name: microsoft-foundry
description: "Deploy, evaluate, and manage Foundry agents end-to-end: Docker build, ACR push, hosted/prompt agent create, container start, batch eval, prompt optimization, agent.yaml, dataset curation from traces. USE FOR: deploy agent to Foundry, hosted agent, create agent, invoke agent, evaluate agent, run batch eval, optimize prompt, deploy model, Foundry project, RBAC, role assignment, permissions, quota, capacity, region, troubleshoot agent, deployment failure, create dataset from traces, dataset versioning, eval trending, create AI Services, Cognitive Services, create Foundry resource, provision resource, knowledge index, agent monitoring, customize deployment, onboard, availability. DO NOT USE FOR: Azure Functions, App Service, general Azure deploy (use azure-deploy), general Azure prep (use azure-prepare)."
description: "Deploy, evaluate, and manage Foundry agents end-to-end: Docker build, ACR push, hosted/prompt agent create, container start, batch eval, prompt optimization, agent.yaml, dataset curation from traces. USE FOR: deploy agent to Foundry, hosted agent, create agent, invoke agent, evaluate agent, run batch eval, optimize prompt, deploy model, Foundry project, RBAC, role assignment, permissions, quota, capacity, region, troubleshoot agent, deployment failure, create dataset from traces, dataset versioning, eval trending, create AI Services, Cognitive Services, create Foundry resource, provision resource, knowledge index, agent monitoring, customize deployment, onboard, availability, standard agent setup, capability host. DO NOT USE FOR: Azure Functions, App Service, general Azure deploy (use azure-deploy), general Azure prep (use azure-prepare)."
license: MIT
metadata:
author: Microsoft
version: "1.0.2"
version: "1.0.3"
---

# Microsoft Foundry Skill

This skill helps developers work with Microsoft Foundry resources, covering model discovery and deployment, the complete development lifecycle of AI agents, evaluation workflows, and troubleshooting.
> **MANDATORY:** Read this skill and the relevant sub-skill BEFORE calling any Foundry MCP tool.

## Sub-Skills

> **MANDATORY: Before executing ANY workflow, you MUST read the corresponding sub-skill document.** Do not call MCP tools for a workflow without reading its skill document. This applies even if you already know the MCP tool parameters — the skill document contains required workflow steps, pre-checks, and validation logic that must be followed. This rule applies on every new user message that triggers a different workflow, even if the skill is already loaded.

This skill includes specialized sub-skills for specific workflows. **Use these instead of the main skill when they match your task:**

| Sub-Skill | When to Use | Reference |
|-----------|-------------|-----------|
| **deploy** | Containerize, build, push to ACR, create/update/start/stop/clone agent deployments | [deploy](foundry-agent/deploy/deploy.md) |
@@ -32,60 +28,59 @@
| **quota** | Managing quotas and capacity for Microsoft Foundry resources. Use when checking quota usage, troubleshooting deployment failures due to insufficient quota, requesting quota increases, or planning capacity. | [quota/quota.md](quota/quota.md) |
| **rbac** | Managing RBAC permissions, role assignments, managed identities, and service principals for Microsoft Foundry resources. Use for access control, auditing permissions, and CI/CD setup. | [rbac/rbac.md](rbac/rbac.md) |

> 💡 **Tip:** For a complete onboarding flow: `project/create` → agent workflows (`deploy` → `invoke`).

> 💡 **Model Deployment:** Use `models/deploy-model` for all deployment scenarios — it intelligently routes between quick preset deployment, customized deployment with full control, and capacity discovery across regions.

## Agent Development Lifecycle

Match user intent to the correct workflow. Read each sub-skill in order before executing.
Onboarding flow: `project/create` → `deploy` → `invoke`

| User Intent | Workflow (read in order) |
|-------------|------------------------|
| Create a new agent from scratch | [create](foundry-agent/create/create.md) → [deploy](foundry-agent/deploy/deploy.md) → [invoke](foundry-agent/invoke/invoke.md) |
| Deploy an agent (code already exists) | deploy → invoke |
| Update/redeploy an agent after code changes | deploy → invoke |
| Invoke/test/chat with an agent | invoke |
| Troubleshoot an agent issue | invoke → troubleshoot |
| Fix a broken agent (troubleshoot + redeploy) | invoke → troubleshoot → apply fixes → deploy → invoke |
| Start/stop agent container | deploy |
## Agent Lifecycle

## Agent: Project Context Resolution
| Intent | Workflow |
|--------|----------|
| New agent from scratch | create → deploy → invoke |
| Deploy existing code | deploy → invoke |
| Test/chat with agent | invoke |
| Troubleshoot | invoke → troubleshoot |
| Fix + redeploy | troubleshoot → fix → deploy → invoke |

Agent skills should run this step **only when they need configuration values they don't already have**. If a value (e.g., project endpoint, agent name) is already known from the user's message or a previous skill in the same session, skip resolution for that value.
## Project Context Resolution

### Step 1: Detect azd Project
Resolve only missing values. Extract from user message first, then azd, then ask.

If any required configuration value is missing, check if `azure.yaml` exists in the project root (workspace root or user-specified project path). If found, run `azd env get-values` to load environment variables.
1. Check for `azure.yaml`; if found, run `azd env get-values`
2. Map azd variables:

### Step 2: Resolve Common Configuration
| azd Variable | Resolves To |
|-------------|-------------|
| `AZURE_AI_PROJECT_ENDPOINT` / `AZURE_AIPROJECT_ENDPOINT` | Project endpoint |
| `AZURE_CONTAINER_REGISTRY_NAME` / `AZURE_CONTAINER_REGISTRY_ENDPOINT` | ACR registry |
| `AZURE_SUBSCRIPTION_ID` | Subscription |

Match missing values against the azd environment:
3. Ask user only for unresolved values (project endpoint, agent name)
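
The resolution order above (user message first, then azd, then ask) can be sketched in Python. This is a minimal sketch, not the skill's actual implementation: the function names and the `known` dictionary are hypothetical, and the parser assumes `azd env get-values` prints one `KEY="value"` pair per line.

```python
import subprocess
from pathlib import Path

# Alias groups copied from the mapping table; the first alias that is set wins.
# The setting names on the left are hypothetical labels for this sketch.
AZD_ALIASES = {
    "project_endpoint": ("AZURE_AI_PROJECT_ENDPOINT", "AZURE_AIPROJECT_ENDPOINT"),
    "acr_registry": ("AZURE_CONTAINER_REGISTRY_NAME", "AZURE_CONTAINER_REGISTRY_ENDPOINT"),
    "subscription": ("AZURE_SUBSCRIPTION_ID",),
}


def parse_env_values(text):
    """Parse KEY="value" lines (assumed output shape of `azd env get-values`)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        values[key.strip()] = raw.strip().strip('"')
    return values


def load_azd_values(project_root="."):
    """Return azd environment values, or {} when this is not an azd project."""
    if not (Path(project_root) / "azure.yaml").exists():
        return {}
    try:
        done = subprocess.run(
            ["azd", "env", "get-values"],
            cwd=project_root, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return {}
    return parse_env_values(done.stdout)


def resolve_context(known, azd_values):
    """Fill only the missing settings; return (resolved, still_missing)."""
    resolved = {k: v for k, v in known.items() if v}
    for setting, aliases in AZD_ALIASES.items():
        if setting in resolved:
            continue  # already known from the user message or session
        for alias in aliases:
            if azd_values.get(alias):
                resolved[setting] = azd_values[alias]
                break
    missing = [s for s in AZD_ALIASES if s not in resolved]
    return resolved, missing
```

Anything left in `still_missing` is what the agent would then ask the user for, which keeps prompts limited to values that are genuinely unresolved.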

| azd Variable | Resolves To | Used By |
|-------------|-------------|---------|
| `AZURE_AI_PROJECT_ENDPOINT` or `AZURE_AIPROJECT_ENDPOINT` | Project endpoint | deploy, invoke, troubleshoot |
| `AZURE_CONTAINER_REGISTRY_NAME` or `AZURE_CONTAINER_REGISTRY_ENDPOINT` | ACR registry name / image URL prefix | deploy |
| `AZURE_SUBSCRIPTION_ID` | Azure subscription | troubleshoot |
## Validation

### Step 3: Collect Missing Values
After each workflow step, validate before proceeding:
1. Run the operation
2. Check output for errors or unexpected results
3. If failed → diagnose using troubleshoot sub-skill → fix → retry
4. Only proceed to next step when validation passes
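
The validate-then-proceed loop above could look roughly like the following Python sketch. The helper name `run_validated` and its three callables are hypothetical; in practice `diagnose_and_fix` would stand in for whatever the troubleshoot sub-skill prescribes.

```python
def run_validated(step, validate, diagnose_and_fix, max_attempts=3):
    """Run `step`, validate its result, and retry after a fix on failure.

    `validate` returns None when the output looks healthy, otherwise a short
    description of the problem. All three callables are placeholders.
    """
    last_problem = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = step()
            problem = validate(result)
        except Exception as exc:  # treat a raised error as a failed validation
            result, problem = None, str(exc)
        if problem is None:
            return result  # validation passed; safe to move to the next step
        last_problem = problem
        if attempt < max_attempts:
            diagnose_and_fix(problem)
    raise RuntimeError(f"step failed after {max_attempts} attempts: {last_problem}")
```

Capping the number of attempts is a design choice of this sketch; the source only says to retry after a fix and does not state a limit.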

Use the `ask_user` or `askQuestions` tool **only for values not resolved** from the user's message, session context, or azd environment. Common values skills may need:
- **Project endpoint** — AI Foundry project endpoint URL
- **Agent name** — Name of the target agent

> 💡 **Tip:** If the user provides a project endpoint or agent name in their initial message, extract it directly — do not ask again.

## Agent: Agent Types

All agent skills support two agent types:
## Agent Types

| Type | Kind | Description |
|------|------|-------------|
| **Prompt** | `"prompt"` | LLM-based agents backed by a model deployment |
| **Hosted** | `"hosted"` | Container-based agents running custom code |
| **Prompt** | `"prompt"` | LLM-based, backed by model deployment |
| **Hosted** | `"hosted"` | Container-based, running custom code |

## Agent: Setup Types

| Setup | Capability Host | Description |
|-------|----------------|-------------|
| **Basic** | None | Default. All resources Microsoft-managed. |
| **Standard** | Azure AI Services | Bring-your-own storage and search (public network). See [standard-agent-setup](references/standard-agent-setup.md). |
| **Standard + Private Network** | Azure AI Services | Standard setup with VNet isolation and private endpoints. See [private-network-standard-agent-setup](references/private-network-standard-agent-setup.md). |

Use the `agent_get` MCP tool to determine an agent's type when needed.
> **MANDATORY:** For standard setup, read the appropriate reference before proceeding:
> - **Public network:** [references/standard-agent-setup.md](references/standard-agent-setup.md)
> - **Private network (VNet isolation):** [references/private-network-standard-agent-setup.md](references/private-network-standard-agent-setup.md)

## Tool Usage Conventions

@@ -94,12 +89,13 @@
- Prefer Azure MCP tools over direct CLI commands when available
- Reference official Microsoft documentation URLs instead of embedding CLI command syntax

## Additional Resources
## References

- [Foundry Hosted Agents](https://learn.microsoft.com/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry)
- [Foundry Agent Runtime Components](https://learn.microsoft.com/azure/ai-foundry/agents/concepts/runtime-components?view=foundry)
- [Hosted Agents](https://learn.microsoft.com/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry)
- [Runtime Components](https://learn.microsoft.com/azure/ai-foundry/agents/concepts/runtime-components?view=foundry)
- [Foundry Samples](https://github.com/azure-ai-foundry/foundry-samples)
- [Python SDK](references/sdk/foundry-sdk-py.md)

## SDK Quick Reference
## Dependencies

- [Python](references/sdk/foundry-sdk-py.md)
Scripts in sub-skills require Azure CLI (`az`) ≥2.0 and `jq` (for shell scripts). For Python SDK usage, run `pip install azure-ai-projects azure-identity`.
@@ -2,6 +2,8 @@

Ready-to-use KQL templates for querying GenAI OpenTelemetry traces in Application Insights.

**Table of Contents:** [App Insights Table Mapping](#app-insights-table-mapping) · [Key GenAI OTel Attributes](#key-genai-otel-attributes) · [Span Correlation](#span-correlation) · [Hosted Agent Attributes](#hosted-agent-attributes) · [Response ID Formats](#response-id-formats) · [Common Query Templates](#common-query-templates) · [OTel Reference Links](#otel-reference-links)

## App Insights Table Mapping

| App Insights Table | GenAI Data |
@@ -4,7 +4,7 @@ description: "Interactive guided deployment flow for Azure OpenAI models with fu
license: MIT
metadata:
author: Microsoft
version: "1.0.0"
version: "1.0.1"
---

# Customize Model Deployment
@@ -2,6 +2,8 @@

> Reference for: `models/deploy-model/customize/SKILL.md`

**Table of Contents:** [Selection Guides](#selection-guides) · [Advanced Topics](#advanced-topics)

## Selection Guides

### How to Choose SKU
@@ -4,7 +4,7 @@ description: "Intelligently deploys Azure OpenAI models to optimal regions by an
license: MIT
metadata:
author: Microsoft
version: "1.0.0"
version: "1.0.1"
---

# Deploy Model to Optimal Region
@@ -2,6 +2,8 @@

Condensed implementation reference for preset (optimal region) model deployment. See [SKILL.md](../SKILL.md) for overview.

**Table of Contents:** [Phase 1: Verify Authentication](#phase-1-verify-authentication) · [Phase 2: Get Current Project](#phase-2-get-current-project) · [Phase 3: Get Model Name](#phase-3-get-model-name) · [Phase 4: Check Current Region Capacity](#phase-4-check-current-region-capacity) · [Phase 5: Query Multi-Region Capacity](#phase-5-query-multi-region-capacity) · [Phase 6: Select Region and Project](#phase-6-select-region-and-project) · [Phase 7: Deploy Model](#phase-7-deploy-model)

---

## Phase 1: Verify Authentication
@@ -11,6 +11,8 @@ allowed-tools: Read, Write, Bash, AskUserQuestion

Create a new Azure AI Foundry project using azd. Provisions: Foundry account, project, Application Insights, managed identity, and RBAC permissions. Optionally enables hosted agents (capability host + Container Registry).

**Table of Contents:** [Prerequisites](#prerequisites) · [Workflow](#workflow) · [Best Practices](#best-practices) · [Troubleshooting](#troubleshooting) · [Related Skills](#related-skills) · [Resources](#resources)

## Prerequisites

Run checks in order. STOP on any failure and resolve before proceeding.
@@ -2,6 +2,8 @@

Comprehensive guide for planning Azure AI Foundry capacity, including cost analysis, model selection, and workload calculations.

**Table of Contents:** [Cost Comparison: TPM vs PTU](#cost-comparison-tpm-vs-ptu) · [Production Workload Examples](#production-workload-examples) · [Model Selection and Deployment Type Guidance](#model-selection-and-deployment-type-guidance)

## Cost Comparison: TPM vs PTU

> **Official Pricing Sources:**
@@ -1,5 +1,7 @@
# Error Resolution Workflows

**Table of Contents:** [Workflow 7: Quota Exhausted Recovery](#workflow-7-quota-exhausted-recovery) · [Workflow 8: Resolve 429 Rate Limit Errors](#workflow-8-resolve-429-rate-limit-errors) · [Workflow 9: Resolve DeploymentLimitReached](#workflow-9-resolve-deploymentlimitreached) · [Workflow 10: Resolve InsufficientQuota](#workflow-10-resolve-insufficientquota) · [Workflow 11: Resolve QuotaExceeded](#workflow-11-resolve-quotaexceeded)

## Workflow 7: Quota Exhausted Recovery

**A. Deploy to Different Region**
@@ -2,6 +2,8 @@

Comprehensive strategies for optimizing Azure AI Foundry quota allocation and reducing costs.

**Table of Contents:** [1. Identify and Delete Unused Deployments](#1-identify-and-delete-unused-deployments) · [2. Right-Size Over-Provisioned Deployments](#2-right-size-over-provisioned-deployments) · [3. Consolidate Multiple Small Deployments](#3-consolidate-multiple-small-deployments) · [4. Cost Optimization Strategies](#4-cost-optimization-strategies) · [5. Regional Quota Rebalancing](#5-regional-quota-rebalancing)

## 1. Identify and Delete Unused Deployments

**Step 1: Discovery with Quota Context**
2 changes: 2 additions & 0 deletions plugin/skills/microsoft-foundry/quota/references/ptu-guide.md
@@ -1,5 +1,7 @@
# Provisioned Throughput Units (PTU) Guide

**Table of Contents:** [Understanding PTU vs Standard TPM](#understanding-ptu-vs-standard-tpm) · [When to Use PTU](#when-to-use-ptu) · [PTU Capacity Planning](#ptu-capacity-planning) · [Deploy Model with PTU](#deploy-model-with-ptu) · [Request PTU Quota Increase](#request-ptu-quota-increase) · [Understanding Region and Deployment Quotas](#understanding-region-and-deployment-quotas) · [External Resources](#external-resources)

## Understanding PTU vs Standard TPM

Microsoft Foundry offers two quota types:
@@ -1,5 +1,7 @@
# Troubleshooting Quota Errors

**Table of Contents:** [Common Quota Errors](#common-quota-errors) · [Detailed Error Resolution](#detailed-error-resolution) · [Request Quota Increase Process](#request-quota-increase-process) · [Diagnostic Commands](#diagnostic-commands) · [External Resources](#external-resources)

## Common Quota Errors

| Error | Cause | Quick Fix |
2 changes: 2 additions & 0 deletions plugin/skills/microsoft-foundry/quota/references/workflows.md
@@ -1,5 +1,7 @@
# Detailed Workflows: Quota Management

**Table of Contents:** [Workflow 1: View Current Quota Usage](#workflow-1-view-current-quota-usage---detailed-steps) · [Workflow 2: Find Best Region for Model Deployment](#workflow-2-find-best-region-for-model-deployment---detailed-steps) · [Workflow 3: Check Quota Before Deployment](#workflow-3-check-quota-before-deployment---detailed-steps) · [Workflow 4: Monitor Quota Across Deployments](#workflow-4-monitor-quota-across-deployments---detailed-steps) · [Quick Command Reference](#quick-command-reference) · [MCP Tools Reference](#mcp-tools-reference-optional-wrappers)

## Workflow 1: View Current Quota Usage - Detailed Steps

### Step 1: Show Regional Quota Summary (REQUIRED APPROACH)
@@ -2,6 +2,8 @@

> Source: [Microsoft — Passwordless connections for Azure services](https://learn.microsoft.com/azure/developer/intro/passwordless-overview) and [Azure Identity client libraries](https://learn.microsoft.com/dotnet/azure/sdk/authentication/).

**Table of Contents:** [Golden Rule](#golden-rule) · [Authentication by Environment](#authentication-by-environment) · [Why Not DefaultAzureCredential in Production?](#why-not-defaultazurecredential-in-production) · [Production Patterns](#production-patterns) · [Local Development Setup](#local-development-setup) · [Environment-Aware Pattern](#environment-aware-pattern) · [Security Checklist](#security-checklist) · [Further Reading](#further-reading)

## Golden Rule

Use **managed identities** and **Azure RBAC** in production. Reserve `DefaultAzureCredential` for **local development only**.
@@ -0,0 +1,40 @@
# Private Network Standard Agent Setup

> **MANDATORY:** Read [Standard Agent Setup with Network Isolation docs](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/configure-private-link?tabs=azure-portal&pivots=fdp-project) before proceeding. It covers RBAC requirements, resource provider registration, and role assignments.

## Overview

Extends [standard agent setup](standard-agent-setup.md) with full VNet isolation using private endpoints and subnet delegation. All resources communicate over private network only.

## Networking Constraints

Two subnets required:

| Subnet | CIDR | Purpose | Delegation |
|--------|------|---------|------------|
| Agent Subnet | /24 (e.g., 192.168.0.0/24) | Agent workloads | `Microsoft.App/environments` (exclusive) |
| Private Endpoint Subnet | /24 (e.g., 192.168.1.0/24) | Private endpoints | None |

- All Foundry resources **must be in the same region as the VNet**.
- Agent subnet must be exclusive to one Foundry account.
- VNet address space must not overlap with existing networks or reserved ranges.
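
Before deploying, the subnet constraints above can be sanity-checked offline with Python's standard `ipaddress` module. This is a minimal sketch under stated assumptions: `check_subnets` is a hypothetical helper, it handles IPv4 plans only, and it does not enforce a particular prefix length because the table lists `/24` without saying whether other sizes are accepted.

```python
import ipaddress


def check_subnets(vnet_cidr, agent_cidr, pe_cidr, existing_cidrs=()):
    """Return a list of constraint violations; an empty list means no issue found."""
    vnet = ipaddress.ip_network(vnet_cidr)
    agent = ipaddress.ip_network(agent_cidr)
    pe = ipaddress.ip_network(pe_cidr)
    problems = []
    # Both subnets must sit inside the VNet address space.
    for name, subnet in (("agent", agent), ("private endpoint", pe)):
        if not subnet.subnet_of(vnet):
            problems.append(f"{name} subnet {subnet} is outside VNet {vnet}")
    # The agent subnet is delegated exclusively, so it cannot share addresses.
    if agent.overlaps(pe):
        problems.append("agent and private endpoint subnets overlap")
    # The VNet must not collide with networks that already exist.
    for other in map(ipaddress.ip_network, existing_cidrs):
        if vnet.overlaps(other):
            problems.append(f"VNet {vnet} overlaps existing network {other}")
    return problems


# Example plan using the CIDRs from the table above.
print(check_subnets("192.168.0.0/16", "192.168.0.0/24", "192.168.1.0/24"))
```

A check like this only covers address arithmetic; region placement and subnet delegation still have to be verified against the deployed resources.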

> ⚠️ **Warning:** If providing an existing VNet, ensure both subnets exist before deployment. Otherwise the template creates a new VNet with default address spaces.

## Deployment

**Always use the official Bicep template:**
[Private Network Standard Agent Setup Bicep](https://github.com/microsoft-foundry/foundry-samples/tree/main/infrastructure/infrastructure-setup-bicep/15-private-network-standard-agent-setup)

> ⚠️ **Warning:** Capability host provisioning is **asynchronous** (10–20 minutes). Poll deployment status until success before proceeding.
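
Because provisioning is asynchronous, a caller needs a polling loop with a timeout comfortably above the 10–20 minute window. The Python sketch below is illustrative only: `get_state` is a hypothetical callable that returns the current provisioning state, and the state names `Succeeded`, `Failed`, and `Canceled` are assumed rather than taken from the template.

```python
import time


def wait_for_provisioning(get_state, timeout_s=1800, interval_s=30,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll `get_state` until it reports success; raise on failure or timeout.

    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "Succeeded":
            return state
        if state in ("Failed", "Canceled"):
            raise RuntimeError(f"provisioning ended in state {state}")
        if clock() >= deadline:
            raise TimeoutError(f"still {state} after {timeout_s}s")
        sleep(interval_s)
```

The default 30-minute timeout and 30-second interval are arbitrary choices for this sketch; tune them to the deployment being watched.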

## Post-Deployment

1. **Deploy a model** to the new AI Services account (e.g., `gpt-4o`). Fall back to `Standard` SKU if `GlobalStandard` quota is exhausted.
2. **Create the agent** using MCP tools (`agent_update`) or the Python SDK.

## References

- [Azure AI Foundry Networking](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/configure-private-link?tabs=azure-portal&pivots=fdp-project)
- [Azure AI Foundry RBAC](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/rbac-azure-ai-foundry?pivots=fdp-project)
- [Standard Agent Setup (public network)](standard-agent-setup.md)
@@ -2,6 +2,8 @@

Python-specific implementations for working with Microsoft Foundry.

**Table of Contents:** [Prerequisites](#prerequisites) · [Model Discovery and Deployment](#model-discovery-and-deployment-mcp) · [RAG Agent with Azure AI Search](#rag-agent-with-azure-ai-search) · [Creating Agents](#creating-agents) · [Agent Evaluation](#agent-evaluation) · [Knowledge Index Operations](#knowledge-index-operations-mcp) · [Best Practices](#best-practices) · [Error Handling](#error-handling)

## Prerequisites

```bash