Skip to content

Commit 3f49956

Browse files
authored
feat: network module redesign for future growth (#130)
1 parent d80bc6c commit 3f49956

File tree

43 files changed

+1476
-878
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

43 files changed

+1476
-878
lines changed

.github/copilot-instructions.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -63,9 +63,9 @@ Use when creating, modifying, or debugging integration tests.
6363

6464
### [Network](./skills/network/SKILL.md)
6565
Use when adding or modifying subnets, CIDR allocation, NSG rules, or delegation in the network module.
66-
- Subnet CIDR offset calculations in `infra-ai-hub/modules/network/locals.tf`
66+
- Explicit `subnet_allocation` model (`map(map(string))`) in `infra-ai-hub/modules/network/locals.tf`
6767
- NSG resources and azapi subnet definitions in `infra-ai-hub/modules/network/main.tf`
68-
- Address space allocation model (1 /24, 2 /24s, 4+ /24s)
68+
- PE subnet pool derivation and downstream PE selection (tenant 3-tier, APIM pinned)
6969
- Subnet delegation requirements per Azure service
7070

7171
### [External Docs Research](./skills/external-docs/SKILL.md)

.github/skills/api-management/SKILLS.md

Lines changed: 0 additions & 317 deletions
This file was deleted.

.github/skills/iac-coder/SKILL.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,7 @@ If a gate cannot be run locally, state exactly what was not run and why.
8787
- Use Private Endpoints for all PaaS services
8888
- Set subnets as Private Subnets (Zero Trust)
8989
- Use existing VNet provided by platform team
90+
- After purging/recreating the AI Foundry resource, allow ~5 min for PE DNS propagation before running integration tests or the next apply — the destroy script confirms API-level deletion but not DNS propagation. See the [Failure Playbook](references/REFERENCE.md#️⚠️-critical-ai-foundry-private-endpoint-broken-after-purgeapply-deploymentnotfound-404) for full diagnosis steps.
9091

9192
## Detailed References
9293

.github/skills/iac-coder/references/REFERENCE.md

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -199,6 +199,33 @@ To add a new tenant:
199199

200200
## Failure Playbook
201201

202+
### ⚠️ CRITICAL: AI Foundry Private Endpoint broken after purge+apply (`DeploymentNotFound` 404)
203+
204+
**Symptom:** All OpenAI inference integration tests fail with `DeploymentNotFound` (HTTP 404) even though:
205+
- All model deployments exist and show `provisioningState: Succeeded`
206+
- APIM backend URL is correct
207+
- APIM managed identity has `Cognitive Services OpenAI User` on the hub
208+
- `tenant-info.bats` and `document-intelligence.bats` pass (unrelated backends)
209+
210+
**Root cause:** The AI Foundry private endpoint (PE) is in a broken/inconsistent state. APIM can resolve the hub hostname but the PE is not correctly routing traffic. Azure returns `DeploymentNotFound` instead of a connectivity error, making it easy to misdiagnose as a deployment naming or RBAC issue.
211+
212+
**Most common trigger:** Manually purging the AI Foundry resource (or a full `terraform destroy` of the shared stack) then immediately re-applying. Even though Azure confirms PE deletion and PE recreation via the API, the PE's NIC/DNS binding can be stale for several minutes after Terraform reports success.
213+
214+
**Diagnosis:**
215+
1. Rule out APIM/policy by checking if `document-intelligence.bats` passes (per-tenant DocInt resources, different backend) — if DocInt passes and OpenAI fails, the issue is hub PE, not APIM.
216+
2. Check APIM MSI role: `az role assignment list --scope <hub_id> --assignee <apim_msi_principal_id> --query "[].roleDefinitionName"`
217+
3. Check deployment state: `az cognitiveservices account deployment show ... --query "properties.provisioningState"`
218+
4. If both are fine, **delete the Foundry private endpoint from the Azure portal** (or via `az network private-endpoint delete`) and re-apply to force a clean PE recreation.
219+
220+
**Fix:** Delete the private endpoint resource and re-apply the `shared` stack. Terraform will recreate it cleanly.
221+
222+
**Teardown script behaviour vs DNS propagation gap:**
223+
- `deploy-scaled.sh destroy` **does** block until full completion: each phase uses `wait "${pids[$i]}"` and Terraform confirms every resource deleted via the Azure API before exiting.
224+
- Destroy order: key-rotation → (foundry + apim + tenant-user-mgmt in parallel) → tenants → shared.
225+
- **Gap:** There is no post-destroy sleep for Azure's private DNS propagation after PE deletion. A rapid `destroy` + `apply` of the `shared` stack can recreate the Foundry PE with a stale NIC/DNS binding. Add a manual wait of ~5 minutes between destroy and apply when working with Foundry PE recreation, or delete only the PE (not the whole hub) when possible.
226+
227+
---
228+
202229
### Terraform drift or noisy plans
203230
- Re-check lifecycle blocks and `ignore_changes` intent before adding new ignores.
204231
- Verify module input defaults and conditional counts are stable.

.github/skills/network/SKILL.md

Lines changed: 131 additions & 42 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,19 @@
11
---
22
name: network
3-
description: Guidance for the network module's subnet allocation, CIDR calculations, NSG rules, and delegation requirements in ai-hub-tracking. Use when adding subnets, modifying address space allocation, changing NSG rules, or debugging subnet delegation issues.
3+
description: Guidance for the network module's subnet allocation, CIDR mapping, NSG rules, PE pool outputs, and delegation requirements in ai-hub-tracking. Use when adding subnets, modifying address allocation, changing NSG rules, updating PE pool logic, or debugging subnet delegation issues.
44
---
55

66
# Network Module Skills
77

8-
Use this skill profile when creating or modifying subnet allocation, CIDR calculations, NSG rules, or delegation configuration in the network module.
8+
Use this skill profile when creating or modifying subnet allocation, CIDR mapping, NSG rules, PE pool outputs, or delegation configuration in the network module.
99

1010
## Use When
1111
- Adding a new subnet type to the network module
12-
- Modifying CIDR allocation logic in `locals.tf`
12+
- Modifying subnet allocation in `params/{env}/shared.tfvars`
1313
- Changing or debugging NSG security rules for any subnet
1414
- Debugging subnet delegation issues (VNet integration failures)
1515
- Wiring a new subnet through the shared stack to a downstream stack
16+
- Modifying PE pool outputs or downstream PE subnet selection logic
1617

1718
## Do Not Use When
1819
- Modifying APIM policies/routing (use [API Management](../api-management/SKILL.md))
@@ -21,16 +22,16 @@ Use this skill profile when creating or modifying subnet allocation, CIDR calcul
2122

2223
## Input Contract
2324
Required context before changes:
24-
- Current subnet allocation map (see Allocation Map below)
25+
- Current `subnet_allocation` map in `params/{env}/shared.tfvars` (see Allocation Model below)
2526
- Target service's Azure-mandated delegation and minimum subnet size
26-
- VNet address space count for the target environment (1, 2, or 4+ /24s)
27-
- Which environments enable the subnet (`params/{env}/shared.tfvars`)
27+
- Which environments need the subnet
2828

2929
## Output Contract
3030
Every network module change should deliver:
31-
- Variable, CIDR calculation, NSG, subnet resource, and outputs in `modules/network/`
31+
- CIDR entry in `params/{env}/shared.tfvars` `subnet_allocation` map
32+
- NSG resource + subnet resource in `modules/network/main.tf` (if new subnet type)
33+
- Outputs in `modules/network/outputs.tf`
3234
- Shared stack wiring in `stacks/shared/main.tf` + `stacks/shared/outputs.tf`
33-
- Feature flag in all three `params/{dev,test,prod}/shared.tfvars`
3435
- `terraform fmt -recursive` on `modules/network/` and `stacks/shared/`
3536

3637
## External Documentation
@@ -40,63 +41,151 @@ Every network module change should deliver:
4041

4142
| Component | Location | Purpose |
4243
|---|---|---|
43-
| CIDR calculations | `infra-ai-hub/modules/network/locals.tf` | Subnet offset/base/CIDR derivations |
44-
| Subnet variables | `infra-ai-hub/modules/network/variables.tf` | Subnet toggle objects (`enabled`, `name`, `prefix_length`) |
44+
| CIDR mapping | `infra-ai-hub/modules/network/locals.tf` | Direct CIDR reads from `subnet_allocation`, PE pool derivation |
45+
| Subnet variable | `infra-ai-hub/modules/network/variables.tf` | `subnet_allocation` map(map(string)) with validations |
4546
| NSGs + subnets | `infra-ai-hub/modules/network/main.tf` | NSG resources, `azapi_resource` subnet definitions |
46-
| Outputs | `infra-ai-hub/modules/network/outputs.tf` | Subnet IDs, CIDRs, NSG IDs |
47-
| Shared stack wiring | `infra-ai-hub/stacks/shared/main.tf``module "network"` | Maps `shared_config` to module variables |
48-
| Shared stack outputs | `infra-ai-hub/stacks/shared/outputs.tf` | Exposes subnet IDs for cross-stack remote state |
49-
| Per-env config | `infra-ai-hub/params/{env}/shared.tfvars``shared_config` | Feature flags per subnet type |
47+
| Outputs | `infra-ai-hub/modules/network/outputs.tf` | Subnet IDs, CIDRs, NSG IDs, PE pool outputs |
48+
| Shared stack wiring | `infra-ai-hub/stacks/shared/main.tf``module "network"` | Passes `subnet_allocation` to module |
49+
| Shared stack outputs | `infra-ai-hub/stacks/shared/outputs.tf` | PE pool pass-through + backward-compat outputs |
50+
| Per-env config | `infra-ai-hub/params/{env}/shared.tfvars``subnet_allocation` | Full CIDRs per subnet per address space |
51+
| Tenant PE selection | `infra-ai-hub/stacks/tenant/locals.tf` | PE subnet resolution with 3-tier precedence |
52+
| APIM PE selection | `infra-ai-hub/stacks/apim/locals.tf` | Pinned PE subnet resolution with fallback |
5053

5154
## Architecture
5255

5356
VNets are pre-provisioned by the BC Gov Landing Zone — the module only creates **subnets within existing VNets**. Subnets are created in the **shared stack** and consumed by downstream stacks via `data.terraform_remote_state.shared`.
5457

5558
All subnets use `azapi_resource` (not `azurerm_subnet`) because Landing Zone policy requires NSG at creation time — `azapi_resource` does this atomically.
5659

57-
## Current Subnet Allocation Map
60+
## Subnet Allocation Model (`subnet_allocation`)
5861

59-
| Subnet | Delegation | Allocation Order |
62+
The network module uses a single `subnet_allocation` variable of type `map(map(string))`:
63+
- **Outer key** = address space CIDR (e.g., `"10.x.x.0/24"`)
64+
- **Inner key** = subnet name (e.g., `"privateendpoints-subnet"`)
65+
- **Inner value** = full subnet CIDR (e.g., `"10.x.x.0/27"`)
66+
67+
There is **no offset computation** — all CIDRs are explicit in tfvars. The module reads them directly via `merge()`.
68+
69+
### Known Subnet Names
70+
71+
| Subnet Name | Delegation | Purpose |
6072
|---|---|---|
61-
| PE | None (`privateEndpointNetworkPolicies = "Disabled"`) | Always first |
62-
| APIM | `Microsoft.Web/serverFarms` | After PE |
63-
| AppGW | None (dedicated, no delegation) | After APIM |
64-
| ACA | `Microsoft.App/environments` | After AppGW |
65-
| Func | `Microsoft.Web/serverFarms` | After ACA |
73+
| `privateendpoints-subnet` | None (`privateEndpointNetworkPolicies = "Disabled"`) | Primary PE subnet |
74+
| `privateendpoints-subnet-<n>` | None | Additional PE pool subnets (`<n>` starts at 1: `-1`, `-2`, ...) |
75+
| `apim-subnet` | `Microsoft.Web/serverFarms` | APIM VNet injection |
76+
| `appgw-subnet` | None (dedicated, no delegation) | Application Gateway |
77+
| `aca-subnet` | `Microsoft.App/environments` | Container Apps Environment |
78+
79+
### External VNet Peered Projects (`external_peered_projects`)
80+
81+
Optional map of external project names to their peered VNet config. When populated, the network module creates dynamic inbound NSG rules on the APIM subnet allowing direct HTTPS (443) traffic from these peered VNets — bypassing App Gateway. NSGs are stateful, so no outbound mirror rule is needed.
82+
83+
```hcl
84+
external_peered_projects = {
85+
"forest-client" = { cidrs = ["10.x.x.0/20"], priority = 400 }
86+
"nr-data-hub" = { cidrs = ["10.x.x.0/22", "10.x.x.0/22"], priority = 410 }
87+
}
88+
```
89+
90+
Priorities are caller-assigned (400–499) so adding/removing a project never shifts existing rules. Use gaps (400, 410, 420) for future growth.
6691

6792
**Critical rules:**
6893
- A subnet can only have **one** delegation — never share delegated subnets between services
69-
- APIM and Functions both use `Microsoft.Web/serverFarms` but **must use separate subnets**
7094
- AppGW subnet must have NO delegation
95+
- Each subnet CIDR must fall within its parent address space
7196

72-
## Address Space Scenarios (Quick Reference)
97+
### Current Environment Allocations
7398

74-
| Scenario | PE | Infrastructure |
99+
**Dev** — 1 address space:
100+
| Subnet | CIDR | Size |
75101
|---|---|---|
76-
| 1 × /24 | /27 at offset 0 | Sequential /27s in same /24 after PE |
77-
| 2 × /24s | Full first /24 as PE pool | Second /24 for infra |
78-
| 4+ × /24s | First 2 /24s for PE pool | Third + fourth /24s for infra |
102+
| `privateendpoints-subnet` | `10.x.x.0/27` | 32 IPs |
103+
| `apim-subnet` | `10.x.x.32/27` | 32 IPs |
104+
| `aca-subnet` | `10.x.x.64/27` | 32 IPs |
105+
| `appgw-subnet` | Not deployed | App Gateway disabled |
106+
107+
**Test** — 2 address spaces:
108+
| Space | Subnet | CIDR | Size |
109+
|---|---|---|---|
110+
| PE space /24 | `privateendpoints-subnet` | `10.x.x.0/24` | 256 IPs (dedicated PE space) |
111+
| Workload /24 | `apim-subnet` | `10.x.x.0/27` | 32 IPs |
112+
| Workload /24 | `appgw-subnet` | `10.x.x.32/27` | 32 IPs |
113+
| Workload /24 | `aca-subnet` | `10.x.x.64/27` | 32 IPs |
114+
115+
**Prod** — 4 address spaces (placeholder CIDRs, not yet deployed):
116+
| Space | Subnet | CIDR | Notes |
117+
|---|---|---|---|
118+
| Space 1 | `privateendpoints-subnet` | TBD /24 | PE pool space 1 |
119+
| Space 2 | `privateendpoints-subnet-1` | TBD /24 | PE pool space 2 |
120+
| Space 3 | `privateendpoints-subnet-2` | TBD /24 | PE pool space 3 |
121+
| Space 4 | `apim-subnet`, `appgw-subnet`, `aca-subnet` | TBD /27s | Workload space |
122+
123+
## PE Subnet Pool
124+
125+
The network module automatically derives a PE pool from all subnets whose name starts with `privateendpoints-subnet`:
126+
- Pool keys use the actual subnet names: `privateendpoints-subnet`, `privateendpoints-subnet-1`, `privateendpoints-subnet-2`, ...
127+
- Primary PE subnet key is always `privateendpoints-subnet` (the original, suffix-free name)
128+
- Pool outputs: `private_endpoint_subnet_ids_by_key`, `private_endpoint_subnet_cidrs_by_key`, `private_endpoint_subnet_keys_ordered`
129+
130+
### Downstream PE Consumption
131+
132+
**Tenant stack**`pe_subnet_key` is **mandatory** for every enabled tenant:
133+
- **Explicit `pe_subnet_key` in tenant config** (`var.tenants[key].pe_subnet_key`) — **ALWAYS set**, validated at plan time
134+
- Resolution is strict: invalid/missing key in the shared PE pool fails at plan time (no silent fallback)
135+
136+
Each tenant creates up to 5 PEs (Key Vault, AI Search, Cosmos DB, Document Intelligence, Speech Services). All PEs for a tenant land on the **same** subnet ("tenant affinity"). Storage Account has no PE (public access in Landing Zone).
137+
138+
Shared stack PEs (AI Foundry Hub, Language Service, Hub Key Vault) always use the primary `privateendpoints-subnet` (~4-5 PEs).
139+
140+
### PE Subnet Assignment Strategy
141+
142+
**Principle: assign-on-first-deploy, sticky forever.** Changing `pe_subnet_key` after deployment destroys and recreates **all 5 tenant PEs** (service disruption + DNS re-propagation).
143+
144+
**Capacity math:**
145+
- Each `/24` PE subnet holds ~251 usable IPs (Azure reserves 5)
146+
- Each tenant consumes up to 5 PE IPs → ~50 tenants per `/24` subnet
147+
- Shared stack consumes ~5 PEs on primary subnet (reducing tenant capacity to ~49 on primary)
148+
- Prod has 3 PE subnets → theoretical max ~148 tenants
149+
150+
**Assignment rules for new tenants:**
151+
1. Check current PE count per subnet (Azure Portal → subnet → Connected devices, or `az network vnet subnet show`)
152+
2. Assign the subnet with the most remaining capacity
153+
3. Record the key in the tenant's `pe_subnet_key` field — it is immutable after first apply
154+
4. Dev/test environments have only 1 PE subnet → always `"privateendpoints-subnet"`
155+
156+
**Tenant onboarding prerequisite:**
157+
Every new tenant tfvars **must** include `pe_subnet_key` inside the `tenant = { ... }` block. Terraform plan will fail validation if it is missing. Example:
158+
```hcl
159+
pe_subnet_key = "privateendpoints-subnet" # or "privateendpoints-subnet-1", etc.
160+
```
161+
162+
**APIM stack** — Pinned PE subnet:
163+
1. Explicit `var.apim_pe_subnet_key` (if set, looks up from shared PE pool)
164+
2. Fallback to primary `private_endpoint_subnet_id`
79165

80-
Each infra subnet's offset shifts by 32 for each enabled preceding subnet. See [references/REFERENCE.md](references/REFERENCE.md) for full CIDR calculation algorithm and visual diagrams.
166+
**Key-rotation / Foundry** — Out of PE pool scope (no PE subnet references).
81167

82168
## How to Add a New Subnet (Checklist)
83169

84-
1. **Variable** (`variables.tf`): Object with `enabled`, `name`, `prefix_length` — default `enabled = false`
85-
2. **CIDR** (`locals.tf`): Add `xxx_offset`, `xxx_base`, `xxx_subnet_cidr` after last subnet in chain
86-
3. **NSG + Subnet** (`main.tf`): NSG with `count`, `azapi_resource` with delegation + `depends_on` all preceding subnets
87-
4. **Outputs** (`outputs.tf`): `xxx_subnet_id`, `xxx_subnet_cidr`, `xxx_nsg_id`
88-
5. **Shared stack** (`stacks/shared/main.tf` + `outputs.tf`): Wire variable + expose output
89-
6. **Env config**: Add `xxx_subnet_enabled = false` in all `params/{dev,test,prod}/shared.tfvars`
90-
7. **Downstream stack**: Reference via `try(data.terraform_remote_state.shared.outputs.xxx_subnet_id, null)`
170+
1. **tfvars** (`params/{env}/shared.tfvars`): Add CIDR entry under `subnet_allocation` in the appropriate address space
171+
2. **Enabled flag** (`modules/network/locals.tf`): Add `xxx_enabled = contains(keys(local.subnet_cidrs), "xxx-subnet")`
172+
3. **CIDR local** (`modules/network/locals.tf`): Add `xxx_subnet_cidr = local.xxx_enabled ? local.subnet_cidrs["xxx-subnet"] : null`
173+
4. **Validation** (`modules/network/variables.tf`): Add subnet name to allowed names list in validation block
174+
5. **NSG + Subnet** (`modules/network/main.tf`): NSG with `count`, `azapi_resource` with delegation + `depends_on` all preceding subnets
175+
6. **Outputs** (`modules/network/outputs.tf`): `xxx_subnet_id`, `xxx_subnet_cidr`, `xxx_nsg_id`
176+
7. **Shared stack** (`stacks/shared/main.tf` + `outputs.tf`): Wire variable + expose output
177+
8. **Downstream stack**: Reference via `try(data.terraform_remote_state.shared.outputs.xxx_subnet_id, null)`
91178

92179
## Validation Gates (Required)
93-
1. **CIDR overlap:** Walk offset formula for all 3 address space scenarios — no two subnets same CIDR
94-
2. **Delegation:** Matches Azure requirement for the target service
95-
3. **depends_on chain:** New subnet depends on ALL preceding subnets
96-
4. **NSG at creation:** Included in `azapi_resource` body, not a separate association
97-
5. **Shared stack output:** Subnet ID exposed for cross-stack consumption
98-
6. **Feature flag:** Defaults to `enabled = false` in all environments
99-
7. **Format:** `terraform fmt -recursive` on `modules/network/` and `stacks/shared/`
180+
1. **CIDR validity:** All values in `subnet_allocation` must be valid CIDRs (`can(cidrhost(cidr, 0))`)
181+
2. **Subnet names:** Must match known names or `privateendpoints-subnet-<n>` pattern where `<n>` starts at 1
182+
3. **PE requirement:** At least one `privateendpoints-subnet` must exist
183+
4. **No duplicates:** Each subnet name appears in exactly one address space
184+
5. **Delegation:** Matches Azure requirement for the target service
185+
6. **depends_on chain:** New subnet depends on ALL preceding subnets
186+
7. **NSG at creation:** Included in `azapi_resource` body, not a separate association
187+
8. **Shared stack output:** Subnet ID exposed for cross-stack consumption
188+
9. **Format:** `terraform fmt -recursive` on `modules/network/` and `stacks/shared/`
100189

101190
## Detailed References
102191

0 commit comments

Comments
 (0)