Commit 923f0af

Add portal SSH tagged LB rule and rename portal-probe-tagged
- Rename portal-probe-tagged to portal-probe-https-tagged for naming consistency (portal has both https and ssh probe types)
- Add portal-lbrule-ssh-tagged: maps portal-frontend-tagged:22 to backend:2223 for SSH over the tagged IP path
- Add portal-probe-ssh-tagged: TCP health probe on port 2223
- Open firewall port 2223/tcp on VMSS
- Add -p 2223:2223 port mapping to the portal container
- Update docs with motivation, container routing, cross-repo dependency, and port convention sections
1 parent f03cd0f commit 923f0af

6 files changed: +142 -23 lines changed

docs/load-balancer-tagged-ip.md

Lines changed: 81 additions & 17 deletions
@@ -1,5 +1,26 @@
 # Load Balancer Tagged IP Migration
 
+## Release Status
+
+### INT Environment — Successful
+
+| | ARO-RP | RP-Config |
+|---|--------|-----------|
+| **Repo** | `Azure/ARO-RP` | `Azure/RP-Config` |
+| **Branch** | `preetisht/ARO-20087-from-master` | `b-ptripathi/vmss-ip-tags` |
+| **Commit** | `f03cd0fc3c76db8acec034bdb12267a2220f428f` | `163e61e5fa09cbb388df700dc4180ce367b74959` |
+| **Tag** | `v20260216.01-lb-tagged-ip-dual-frontend` | `v20260216.01-lb-tagged-ip-dual-frontend` |
+
+Released to INT on **2026-02-16**. Deployment completed successfully.
+
+---
+
+## Motivation
+
+Azure public IP addresses can carry **IP tags** — metadata key-value pairs such as `RoutingPreference` and `FirstPartyUsage`. These tags tell Azure's networking fabric how to handle traffic on those IPs (e.g., preferred routing paths, first-party billing attribution). The ARO RP and Portal services were originally deployed with untagged public IPs. To comply with Azure networking requirements for first-party services, we need to migrate to tagged IPs.
+
+Replacing the IPs in-place would be risky: a misconfigured tag or a platform issue could take down the RP or Portal. Instead, we use a **dual-frontend approach** — add new tagged IPs alongside the existing untagged ones, point DNS to the tagged IPs, and keep the untagged IPs as an instant rollback path.
+
 ## Overview
 
 This change introduces **tagged public IP addresses** to the ARO RP load balancer using a **dual-frontend approach**. Tagged IPs carry metadata (IP tags) that allow Azure to route traffic with specific properties (e.g., `RoutingPreference` or `FirstPartyUsage`). The existing untagged IPs remain in place, ensuring a safe, rollback-friendly migration.
@@ -22,9 +43,11 @@ Two new tagged public IPs and corresponding frontends are added to the **same**
 ```
 rp-pip (untagged) ──► rp-frontend ──► port 443 ──► backend (8443)
 portal-pip (untagged) ──► portal-frontend ──► port 443 ──► backend (444)
+portal-pip (untagged) ──► portal-frontend ──► port 22 ──► backend (2222)
 
 rp-pip-tagged (tagged) ──► rp-frontend-tagged ──► port 443 ──► backend (8443)
 portal-pip-tagged (tagged) ──► portal-frontend-tagged──► port 443 ──► backend (8444)
+portal-pip-tagged (tagged) ──► portal-frontend-tagged──► port 22 ──► backend (2223)
 ```
 
 DNS records are updated to point at the **tagged** IPs, making them the primary entry point. The untagged IPs serve as a fallback for rollback.
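
Reviewer sketch (not part of the commit): the six paths in the diagram amount to a lookup from (frontend, frontend port) to a backend port. The Go snippet below is illustrative only and is not code from ARO-RP; the `lbRule` type and the `backendPort` helper are hypothetical names. Port values are copied from the load-balancing rule table in this same document. The diagram shows 8443 for the untagged RP path while the rule table lists backend port 443; the sketch follows the table, on the assumption that the diagram is showing the container port behind the existing `-p 443:8443` mapping.

```go
package main

import "fmt"

// lbRule mirrors one row of the load-balancing rule table in the doc.
// The type itself is hypothetical and exists only for this illustration.
type lbRule struct {
	Name         string
	Frontend     string
	FrontendPort int
	BackendPort  int
}

// rules lists the six paths; values are copied from the rule table in the doc.
var rules = []lbRule{
	{"rp-lbrule", "rp-frontend", 443, 443},
	{"rp-lbrule-8443", "rp-frontend-tagged", 443, 8443},
	{"portal-lbrule", "portal-frontend", 443, 444},
	{"portal-lbrule-8444", "portal-frontend-tagged", 443, 8444},
	{"portal-lbrule-ssh", "portal-frontend", 22, 2222},
	{"portal-lbrule-ssh-tagged", "portal-frontend-tagged", 22, 2223},
}

// backendPort returns the backend port for a frontend and frontend port,
// and false when no rule matches.
func backendPort(frontend string, port int) (int, bool) {
	for _, r := range rules {
		if r.Frontend == frontend && r.FrontendPort == port {
			return r.BackendPort, true
		}
	}
	return 0, false
}

func main() {
	for _, r := range rules {
		fmt.Printf("%-26s %s:%d -> backend:%d\n", r.Name, r.Frontend, r.FrontendPort, r.BackendPort)
	}
}
```

Keeping the rules as data like this also makes it straightforward to check that no two rules share the same frontend and frontend port.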
@@ -65,28 +88,46 @@ These are passed as ARM template parameters with empty-array defaults.
 
 #### New Load Balancing Rules
 
-| Rule Name | Frontend | Frontend Port | Backend Port | Purpose |
-|----------------------|-----------------------|---------------|--------------|-----------------------------|
-| `rp-lbrule-8443` | `rp-frontend` | 8443 | 8443 | RP traffic on untagged IP |
-| `portal-lbrule-8444` | `rp-frontend` | 8444 | 8444 | Portal on untagged IP |
-| (tagged RP rule) | `rp-frontend-tagged` | 443 | 8443 | RP traffic on tagged IP |
-| (tagged Portal rule) | `portal-frontend-tagged` | 443 | 8444 | Portal on tagged IP |
+| Rule Name | Frontend | Frontend Port | Backend Port | Purpose |
+|----------------------------|----------------------------|---------------|--------------|----------------------------------|
+| `rp-lbrule` | `rp-frontend` | 443 | 443 | RP traffic on untagged IP |
+| `rp-lbrule-8443` | `rp-frontend-tagged` | 443 | 8443 | RP traffic on tagged IP |
+| `portal-lbrule` | `portal-frontend` | 443 | 444 | Portal HTTPS on untagged IP |
+| `portal-lbrule-8444` | `portal-frontend-tagged` | 443 | 8444 | Portal HTTPS on tagged IP |
+| `portal-lbrule-ssh` | `portal-frontend` | 22 | 2222 | Portal SSH on untagged IP |
+| `portal-lbrule-ssh-tagged`| `portal-frontend-tagged` | 22 | 2223 | Portal SSH on tagged IP |
 
 #### New Health Probes
 
-| Probe Name | Port | Protocol | Path |
-|-----------------------|------|----------|-----------------|
-| `rp-probe-tagged` | 8443 | HTTPS | `/healthz/ready`|
-| `portal-probe-tagged` | 8444 | HTTPS | `/healthz/ready`|
+| Probe Name | Port | Protocol | Path |
+|-----------------------------|------|----------|-----------------|
+| `rp-probe-tagged` | 8443 | HTTPS | `/healthz/ready`|
+| `portal-probe-https-tagged` | 8444 | HTTPS | `/healthz/ready`|
+| `portal-probe-ssh-tagged` | 2223 | TCP | |
 
 Separate probes allow independent health monitoring of tagged vs. untagged paths.
 
 ### 4. VMSS and Firewall Changes
 
-- **Firewall ports opened:** `8443/tcp` and `8444/tcp` added to the RP VMSS firewall allow list.
-- **Container port mapping:** RP container gets `-p 8443:8443` and `-p 8444:8444` in addition to the existing `-p 443:8443`.
+- **Firewall ports opened:** `8443/tcp`, `8444/tcp`, and `2223/tcp` added to the RP VMSS firewall allow list.
+- **Container port mapping:** RP container gets `-p 8443:8443` and `-p 8444:8444` in addition to the existing `-p 443:8443`. Portal container gets `-p 2223:2223` in addition to the existing `-p 2222:2222`.
+
+### 5. Container Routing
+
+The VMSS runs multiple containers. Each container handles different traffic types. Understanding which container listens on which port is critical for debugging:
+
+| Container | Service File | Ports | Traffic Type |
+|------------------|----------------------------------|--------------------------------------------|---------------------------|
+| **aro-rp** | `aro-rp.service` | `443:8443`, `8443:8443`, `8444:8444` | RP API (ARM, Geneva) |
+| **aro-portal** | `aro-portal.service` | `444:8444`, `2222:2222`, `2223:2223` | Portal HTTPS and SSH |
 
-### 5. NSG (Network Security Group) Rules
+The format is `host_port:container_port`. Key points:
+
+- **RP traffic** (both untagged on 443 and tagged on 8443) routes to the **aro-rp** container, which internally listens on 8443.
+- **Portal HTTPS** (untagged on 444, tagged on 8444) routes to the **aro-portal** container, which internally listens on 8444.
+- **Portal SSH** (untagged on 2222, tagged on 2223) routes to the **aro-portal** container. The portal process handles SSH tunnelling to cluster nodes.
+
+### 6. NSG (Network Security Group) Rules
 
 Two new inbound rules are added to the RP NSG in both development and production predeploy templates:
 
@@ -97,7 +138,7 @@ Two new inbound rules are added to the RP NSG in both development and production
 
 Additionally, NSG deployment is now **forced on every predeploy** (not only on initial creation) to ensure new rules are always applied.
 
-### 6. DNS Update
+### 7. DNS Update
 
 `deploy_rp.go` `configureDNS()` now resolves the tagged IPs instead of the untagged ones:
 
@@ -111,9 +152,32 @@ This makes the tagged IPs the primary DNS entry point.
 Because both tagged and untagged frontends exist on the same load balancer:
 
 1. **DNS rollback** — Point DNS back to the untagged IPs (`rp-pip`, `portal-pip`). Traffic immediately flows through the original, untagged path on port 443.
-2. **Health probes** — Separate probes (`rp-probe-tagged`, `portal-probe-tagged`) allow monitoring tagged paths independently. If tagged probes fail, the untagged paths remain unaffected.
+2. **Health probes** — Separate probes (`rp-probe-tagged`, `portal-probe-https-tagged`, `portal-probe-ssh-tagged`) allow monitoring tagged paths independently. If tagged probes fail, the untagged paths remain unaffected.
 3. **Region disable list** — Add a region to `lbIpTagsDisabledRegions` to create the tagged IPs without tags in that region, effectively making them behave like untagged IPs.
 
+## Cross-Repository Dependency
+
+This feature spans two repositories that must be deployed together:
+
+| Repository | What it provides |
+|------------|-----------------|
+| **Azure/ARO-RP** | ARM templates with the dual-frontend LB, tagged IP resources, probes, rules, NSG rules, firewall/container config, and DNS logic. |
+| **Azure/RP-Config** | Per-environment configuration values: the actual IP tag objects (`rpLbIpTags`, `portalLbIpTags`) and the disabled regions list (`lbIpTagsDisabledRegions`). Without these values, the tagged IPs are created but have no tags applied. |
+
+**Deployment order:** RP-Config should be deployed first (or simultaneously) so that the IP tag values are available when the ARO-RP ARM template is evaluated. If ARO-RP is deployed before RP-Config provides the tag values, the tagged IPs will be created without tags (safe, but defeats the purpose).
+
+## Port Convention
+
+Tagged backend ports follow a predictable offset from their untagged counterparts:
+
+| Service | Untagged Backend Port | Tagged Backend Port | Offset |
+|---------------|-----------------------|---------------------|------------|
+| RP API | 443 | 8443 | +8000 |
+| Portal HTTPS | 444 | 8444 | +8000 |
+| Portal SSH | 2222 | 2223 | +1 |
+
+This convention allows the same VMSS to serve both tagged and untagged traffic on distinct ports, with the application process distinguishing traffic origin by the port it arrives on.
+
 ## Files Changed
 
 | File | What |
@@ -123,8 +187,8 @@ Because both tagged and untagged frontends exist on the same load balancer:
 | `pkg/deploy/generator/templates.go` | Regex-based IP tag injection in template fixup |
 | `pkg/deploy/generator/templates_rp.go` | New parameters and resources in RP template |
 | `pkg/deploy/generator/resources_rp.go` | LB frontends, rules, probes, and NSG rules |
-| `pkg/deploy/generator/scripts/rpVMSS.sh` | Firewall ports 8443, 8444 |
-| `pkg/deploy/generator/scripts/util-services.sh` | Container port mappings |
+| `pkg/deploy/generator/scripts/rpVMSS.sh` | Firewall ports 8443, 8444, 2223 |
+| `pkg/deploy/generator/scripts/util-services.sh` | Container port mappings (RP: 8443, 8444; Portal: 2223) |
 | `pkg/deploy/deploy_rp.go` | DNS points to tagged IPs |
 | `pkg/deploy/predeploy.go` | Force NSG deployment on every predeploy |
 | `pkg/deploy/assets/rp-production.json` | Generated ARM template |
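
Reviewer sketch (not part of the commit): the rollback and cross-repository notes in the doc describe two cases in which a tagged public IP ends up carrying no tags: the region appears in `lbIpTagsDisabledRegions`, or RP-Config supplied no tag values. A minimal Go sketch of that decision follows. It is an assumption about the intended behaviour, not the ARM template logic itself; the `ipTag` type, the `effectiveIPTags` function, the tag value, and the region names are all hypothetical.

```go
package main

import "fmt"

// ipTag is a hypothetical stand-in for one IP tag object,
// for example a FirstPartyUsage or RoutingPreference entry.
type ipTag struct {
	IPTagType string
	Tag       string
}

// effectiveIPTags returns the tags a "tagged" public IP would carry in region.
// It returns nil when the region is on the disabled list. When no tag values
// were supplied, the (empty) input is returned unchanged, so the IP is still
// created but carries no tags.
func effectiveIPTags(region string, tags []ipTag, disabledRegions []string) []ipTag {
	for _, r := range disabledRegions {
		if r == region {
			return nil // region disabled: IP exists, but untagged
		}
	}
	return tags
}

func main() {
	// Placeholder tag value and region names, chosen only for the demo.
	tags := []ipTag{{IPTagType: "FirstPartyUsage", Tag: "example-value"}}
	disabled := []string{"westus"}

	fmt.Println(len(effectiveIPTags("eastus", tags, disabled))) // 1
	fmt.Println(len(effectiveIPTags("westus", tags, disabled))) // 0
	fmt.Println(len(effectiveIPTags("eastus", nil, disabled)))  // 0
}
```

The third call corresponds to the deployment-order caveat: ARO-RP deployed before RP-Config provides values still succeeds, but the IPs carry no tags.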

pkg/deploy/assets/gateway-production.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

pkg/deploy/assets/rp-production.json

Lines changed: 29 additions & 3 deletions
Large diffs are not rendered by default.

pkg/deploy/generator/resources_rp.go

Lines changed: 28 additions & 2 deletions
@@ -297,7 +297,7 @@ func (g *generator) rpLB() *arm.Resource {
 	ID: pointerutils.ToPtr("[resourceId('Microsoft.Network/loadBalancers/backendAddressPools', 'rp-lb', 'rp-backend')]"),
 },
 Probe: &armnetwork.SubResource{
-	ID: pointerutils.ToPtr("[resourceId('Microsoft.Network/loadBalancers/probes', 'rp-lb', 'portal-probe-tagged')]"),
+	ID: pointerutils.ToPtr("[resourceId('Microsoft.Network/loadBalancers/probes', 'rp-lb', 'portal-probe-https-tagged')]"),
 },
 Protocol: pointerutils.ToPtr(armnetwork.TransportProtocolTCP),
 LoadDistribution: pointerutils.ToPtr(armnetwork.LoadDistributionDefault),
@@ -342,6 +342,24 @@ func (g *generator) rpLB() *arm.Resource {
 		},
 		Name: pointerutils.ToPtr("portal-lbrule-ssh"),
 	},
+	{
+		Properties: &armnetwork.LoadBalancingRulePropertiesFormat{
+			FrontendIPConfiguration: &armnetwork.SubResource{
+				ID: pointerutils.ToPtr("[resourceId('Microsoft.Network/loadBalancers/frontendIPConfigurations', 'rp-lb', 'portal-frontend-tagged')]"),
+			},
+			BackendAddressPool: &armnetwork.SubResource{
+				ID: pointerutils.ToPtr("[resourceId('Microsoft.Network/loadBalancers/backendAddressPools', 'rp-lb', 'rp-backend')]"),
+			},
+			Probe: &armnetwork.SubResource{
+				ID: pointerutils.ToPtr("[resourceId('Microsoft.Network/loadBalancers/probes', 'rp-lb', 'portal-probe-ssh-tagged')]"),
+			},
+			Protocol: pointerutils.ToPtr(armnetwork.TransportProtocolTCP),
+			LoadDistribution: pointerutils.ToPtr(armnetwork.LoadDistributionDefault),
+			FrontendPort: pointerutils.ToPtr(int32(22)),
+			BackendPort: pointerutils.ToPtr(int32(2223)),
+		},
+		Name: pointerutils.ToPtr("portal-lbrule-ssh-tagged"),
+	},
 },
 Probes: []*armnetwork.Probe{
 	{
@@ -378,7 +396,7 @@ func (g *generator) rpLB() *arm.Resource {
 		NumberOfProbes: pointerutils.ToPtr(int32(2)),
 		RequestPath: pointerutils.ToPtr("/healthz/ready"),
 	},
-	Name: pointerutils.ToPtr("portal-probe-tagged"),
+	Name: pointerutils.ToPtr("portal-probe-https-tagged"),
 },
 {
 	Properties: &armnetwork.ProbePropertiesFormat{
@@ -388,6 +406,14 @@ func (g *generator) rpLB() *arm.Resource {
 			},
 			Name: pointerutils.ToPtr("portal-probe-ssh"),
 		},
+		{
+			Properties: &armnetwork.ProbePropertiesFormat{
+				Protocol: pointerutils.ToPtr(armnetwork.ProbeProtocolTCP),
+				Port: pointerutils.ToPtr(int32(2223)),
+				NumberOfProbes: pointerutils.ToPtr(int32(2)),
+			},
+			Name: pointerutils.ToPtr("portal-probe-ssh-tagged"),
+		},
 	},
 },
 Name: pointerutils.ToPtr("rp-lb"),
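
Reviewer sketch (not part of the commit): the new `portal-probe-ssh-tagged` probe uses `ProbeProtocolTCP` and sets no `RequestPath`. Azure Load Balancer TCP probes are documented as succeeding once the TCP handshake on the probed port completes, so a rough local equivalent for checking port 2223 from a test box might look like the Go snippet below. The `tcpProbe` and `startLocalListener` helpers and the two-second timeout are assumptions for illustration, and the demo probes a throwaway loopback listener rather than a real portal container.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// probeTimeout is an arbitrary value chosen for this sketch.
const probeTimeout = 2 * time.Second

// tcpProbe reports whether a TCP connection to addr can be opened
// within probeTimeout. It sends no data; only the handshake matters.
func tcpProbe(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, probeTimeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// startLocalListener opens a listener on a free loopback port so the example
// runs without a real service. It returns the address and a stop function.
func startLocalListener() (string, func(), error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	return ln.Addr().String(), func() { ln.Close() }, nil
}

func main() {
	addr, stop, err := startLocalListener()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("probe while listening:", tcpProbe(addr))
	stop()
	fmt.Println("probe after close:", tcpProbe(addr))
}
```

Against a real instance, the address would be the host and port 2223, the backend port that this commit opens in the firewall and publishes on the portal container.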

pkg/deploy/generator/scripts/rpVMSS.sh

Lines changed: 2 additions & 0 deletions
@@ -85,6 +85,8 @@ main() {
 "445/tcp"
 # Portal ssh
 "2222/tcp"
+# Portal ssh tagged
+"2223/tcp"
 # JIT ssh
 "22/tcp"
 )

pkg/deploy/generator/scripts/util-services.sh

Lines changed: 1 addition & 0 deletions
@@ -336,6 +336,7 @@ ExecStart=/usr/bin/podman run \
 -m 2g \
 -p 444:8444 \
 -p 2222:2222 \
+-p 2223:2223 \
 -v /run/systemd/journal:/run/systemd/journal \
 -v /var/etw:/var/etw:z \
 -v /var/run/mdsd/asa:/var/run/mdsd/asa:z \
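
Reviewer sketch (not part of the commit): the publish flags in this unit use the two-field `host_port:container_port` form described in the Container Routing section of the doc. A small Go helper that parses that form can be useful when auditing which host ports reach which container. `portMapping` and `parsePortMapping` are hypothetical names, and the longer podman forms (with a bind address or a protocol suffix) are deliberately not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// portMapping is one `-p host_port:container_port` publish flag.
type portMapping struct {
	Host      int
	Container int
}

// parsePortMapping parses the two-field "host:container" form only.
// Anything with extra fields or non-numeric ports is rejected.
func parsePortMapping(s string) (portMapping, error) {
	host, container, ok := strings.Cut(s, ":")
	if !ok {
		return portMapping{}, fmt.Errorf("want host:container, got %q", s)
	}
	h, err := strconv.Atoi(host)
	if err != nil {
		return portMapping{}, fmt.Errorf("host port: %w", err)
	}
	c, err := strconv.Atoi(container)
	if err != nil {
		return portMapping{}, fmt.Errorf("container port: %w", err)
	}
	return portMapping{Host: h, Container: c}, nil
}

func main() {
	// The three publish flags on the portal container after this commit.
	for _, s := range []string{"444:8444", "2222:2222", "2223:2223"} {
		m, err := parsePortMapping(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("host %d -> container %d\n", m.Host, m.Container)
	}
}
```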
