feat(helm): add Kubernetes NetworkPolicies for Radius components #11558
officialasishkumar wants to merge 2 commits into radius-project:main
Add opt-in NetworkPolicy resources for all Radius components to restrict network traffic according to the principle of least privilege. This addresses threats THR03 (unauthenticated inter-component communication), THR08 (API server aggregation bypass), and THR10 (metrics endpoint information disclosure) from the UCP threat assessment.

Policies are disabled by default (global.networkPolicy.enabled: false) for backward compatibility and for clusters without a CNI that supports NetworkPolicy.

When enabled, each component gets a dedicated NetworkPolicy:
- UCP: accepts traffic from all Radius components, egress to RPs and Kubernetes API
- Controller: accepts webhook traffic, egress to UCP and Kubernetes API
- Applications RP: accepts traffic from UCP only on ports 5443/5444
- Dynamic RP: accepts traffic from UCP only on port 8082
- Bicep DE: accepts traffic from UCP only, egress only to UCP
- Dashboard: accepts user traffic, egress to UCP
- Database: accepts traffic from UCP and RPs only (when database is enabled)

All policies include DNS egress and conditional Prometheus metrics ingress rules.

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Pull request overview
Adds an opt-in global.networkPolicy.enabled Helm value and corresponding Kubernetes NetworkPolicy resources to restrict ingress/egress between Radius components (and optionally metrics scraping), plus Helm unit tests to validate rendering and key fields.
Changes:
- Introduce `global.networkPolicy.enabled` (default: `false`) in Helm values.
- Add `NetworkPolicy` templates for UCP, controller, applications-rp, dynamic-rp, bicep-de, dashboard, and database.
- Add Helm unit tests covering enablement/disablement and basic spec assertions.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| deploy/Chart/values.yaml | Adds the global NetworkPolicy enablement toggle. |
| deploy/Chart/templates/ucp/networkpolicy.yaml | Defines UCP ingress/egress rules under the toggle. |
| deploy/Chart/templates/controller/networkpolicy.yaml | Defines controller ingress/egress rules under the toggle. |
| deploy/Chart/templates/rp/networkpolicy.yaml | Defines applications-rp ingress/egress rules under the toggle. |
| deploy/Chart/templates/dynamic-rp/networkpolicy.yaml | Defines dynamic-rp ingress/egress rules under the toggle. |
| deploy/Chart/templates/de/networkpolicy.yaml | Defines bicep-de ingress/egress rules under the toggle. |
| deploy/Chart/templates/dashboard/networkpolicy.yaml | Defines dashboard ingress/egress rules under the toggle. |
| deploy/Chart/templates/database/networkpolicy.yaml | Defines database ingress/egress rules under the toggle and database.enabled. |
| deploy/Chart/tests/networkpolicy_test.yaml | Adds helm-unittest coverage for template rendering and key fields. |
| # Allow ingress from all Radius components (controller, DE, RPs) | ||
| - from: | ||
| - podSelector: | ||
| matchLabels: | ||
| app.kubernetes.io/part-of: radius |
UCP is registered as an aggregated APIService (templates/ucp/apiservice.yaml), so the Kubernetes API server must be able to reach the UCP Service/Pods. This ingress rule only allows pods labeled app.kubernetes.io/part-of=radius, which won’t match API server traffic, so enabling NetworkPolicy can break the aggregated API. Add an explicit allow for the API server (ideally via configurable CIDRs/namespace+pod selectors depending on cluster type).
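One possible shape for such an allow rule is sketched below. It assumes a hypothetical `global.networkPolicy.kubernetesApiServerCidrs` list in values.yaml and assumes UCP serves on port 9443; neither is confirmed by this PR. Whether an `ipBlock` actually matches API server traffic depends on how the cluster routes it, so some cluster types may need namespace and pod selectors instead.

```yaml
# Hypothetical extra entry for the ingress list in
# deploy/Chart/templates/ucp/networkpolicy.yaml.
# Assumptions: kubernetesApiServerCidrs is a list of CIDR strings in values.yaml,
# and UCP listens on 9443 (adjust to the real Service target port).
  {{- with .Values.global.networkPolicy.kubernetesApiServerCidrs }}
  # Allow the Kubernetes API server to reach UCP (aggregated APIService)
  - from:
      {{- range . }}
      - ipBlock:
          cidr: {{ . | quote }}
      {{- end }}
    ports:
      - protocol: TCP
        port: 9443
  {{- end }}
```

With an empty or unset list the `with` block renders nothing, so the rule only appears once CIDRs are supplied.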
| port: 9443 | ||
| {{- if .Values.global.prometheus.enabled }} | ||
| # Allow Prometheus metrics scraping | ||
| - ports: |
The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the stated goal of limiting metrics/health info disclosure (THR10). Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).
| - ports: | |
| - from: | |
| - namespaceSelector: {} | |
| podSelector: | |
| matchLabels: | |
| app.kubernetes.io/name: prometheus | |
| ports: |
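If the selector should be configurable, as the comment suggests, a sketch along these lines might work. The `global.networkPolicy.prometheus.namespaceSelector` and `global.networkPolicy.prometheus.podSelector` values are hypothetical names introduced for illustration, and `global.prometheus.port` is assumed to hold the metrics port.

```yaml
# Hypothetical configurable metrics rule; the value keys under
# global.networkPolicy.prometheus are not part of the chart as submitted.
  {{- if eq .Values.global.prometheus.enabled true }}
  # Allow Prometheus metrics scraping only from the configured pods
  - from:
      - namespaceSelector:
          {{- toYaml .Values.global.networkPolicy.prometheus.namespaceSelector | nindent 10 }}
        podSelector:
          {{- toYaml .Values.global.networkPolicy.prometheus.podSelector | nindent 10 }}
    ports:
      - protocol: TCP
        port: {{ .Values.global.prometheus.port }}   # assumed value key
  {{- end }}
```

Because `namespaceSelector` and `podSelector` sit in the same list item, a source must match both; splitting them into separate items would allow either.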
| - ipBlock: | ||
| cidr: 0.0.0.0/0 |
This egress rule uses ipBlock 0.0.0.0/0 on ports 443/6443, which effectively allows egress to any destination on those ports (not just the Kubernetes API server). If the intent is least-privilege API server access, make allowed API server CIDRs configurable (or otherwise scope this rule).
| - ipBlock: | |
| cidr: 0.0.0.0/0 | |
| {{- range .Values.global.networkPolicy.kubernetesApiServerCidrs }} | |
| - ipBlock: | |
| cidr: {{ . | quote }} | |
| {{- end }} |
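A values override for the suggested template could look like the sketch below. The `kubernetesApiServerCidrs` key comes from the suggestion above and is not part of the chart as submitted; the address is a placeholder from the documentation range, not a real endpoint. The actual API server endpoint addresses can usually be listed with `kubectl get endpoints kubernetes -n default`.

```yaml
# Hypothetical values.yaml override for the suggested egress rule.
global:
  networkPolicy:
    enabled: true
    # One entry per API server endpoint; placeholder address shown.
    kubernetesApiServerCidrs:
      - "192.0.2.10/32"
```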
| {{- if .Values.global.networkPolicy.enabled }} | ||
| apiVersion: networking.k8s.io/v1 |
NetworkPolicy and Prometheus conditionals use bare truthiness checks (e.g., if .Values.global.prometheus.enabled). Existing Deployment templates use if eq .Values.global.prometheus.enabled true, which avoids treating string values like "false" as enabled. Align the conditionals to prevent surprising differences between rendered Deployments vs NetworkPolicies.
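A minimal sketch of the aligned conditionals follows, so that a non-boolean value such as the string "false" is not silently treated as enabled. The resource name and the use of `global.prometheus.port` for the metrics port are assumptions made for illustration.

```yaml
{{- if eq .Values.global.networkPolicy.enabled true }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ucp                      # assumed name
  namespace: {{ .Release.Namespace }}
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: ucp
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/part-of: radius
    {{- if eq .Values.global.prometheus.enabled true }}
    # Same strict check the Deployment templates use
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus
      ports:
        - protocol: TCP
          port: {{ .Values.global.prometheus.port }}   # assumed value key
    {{- end }}
{{- end }}
```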
| # Allow ingress from UCP only (deployment engine receives proxied requests) | ||
| - from: | ||
| - podSelector: | ||
| matchLabels: | ||
| app.kubernetes.io/name: ucp |
With Prometheus enabled, the DE exposes metrics on port 6443 (see templates/de/deployment.yaml prometheus.io/port). This NetworkPolicy only allows ingress from the UCP podSelector, so Prometheus scraping will be blocked when NetworkPolicy is enabled. Add an ingress allowance for Prometheus when metrics are enabled, or document that DE metrics aren’t supported under NetworkPolicy.
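The ingress allowance described here might be sketched as an additional rule in `de/networkpolicy.yaml`. Port 6443 is taken from the comment above; the `app.kubernetes.io/name: prometheus` label is an assumption about how the Prometheus pods are labeled.

```yaml
# Hypothetical extra entry for the ingress list in
# deploy/Chart/templates/de/networkpolicy.yaml.
  {{- if eq .Values.global.prometheus.enabled true }}
  # Allow Prometheus to scrape DE metrics
  - from:
      - podSelector:
          matchLabels:
            app.kubernetes.io/name: prometheus   # assumed label
    ports:
      - protocol: TCP
        port: 6443
  {{- end }}
```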
| set: | ||
| global.networkPolicy.enabled: true | ||
| global.prometheus.enabled: false | ||
| template: de/networkpolicy.yaml |
This test doesn’t validate Prometheus-conditional rendering: de/networkpolicy.yaml has no Prometheus-specific ingress block, so spec.ingress length will be 1 regardless of global.prometheus.enabled. Point this test at a template that conditionally renders the Prometheus rule (e.g., ucp/controller/rp/dynamic-rp/dashboard) and assert presence/absence of the metrics ingress entry.
| template: de/networkpolicy.yaml | |
| template: dashboard/networkpolicy.yaml |
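A pair of helm-unittest cases that would exercise the conditional could look like the sketch below. The expected counts are assumptions (one base ingress rule for the dashboard plus one metrics rule) and would need to match the actual template.

```yaml
suite: networkpolicy prometheus rule
templates:
  - dashboard/networkpolicy.yaml
tests:
  - it: omits the metrics ingress rule when Prometheus is disabled
    set:
      global.networkPolicy.enabled: true
      global.prometheus.enabled: false
    asserts:
      - lengthEqual:
          path: spec.ingress
          count: 1        # assumed: only the user-traffic rule
  - it: adds the metrics ingress rule when Prometheus is enabled
    set:
      global.networkPolicy.enabled: true
      global.prometheus.enabled: true
    asserts:
      - lengthEqual:
          path: spec.ingress
          count: 2        # assumed: user-traffic rule plus metrics rule
```

Asserting both states in the same suite is what makes the test sensitive to the `global.prometheus.enabled` toggle.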
| port: 5444 | ||
| {{- if .Values.global.prometheus.enabled }} | ||
| # Allow Prometheus metrics scraping | ||
| - ports: |
The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).
| - ports: | |
| - from: | |
| - namespaceSelector: | |
| matchLabels: | |
| kubernetes.io/metadata.name: "{{ .Release.Namespace }}" | |
| podSelector: | |
| matchLabels: | |
| app.kubernetes.io/name: prometheus | |
| ports: |
| port: 8082 | ||
| {{- if .Values.global.prometheus.enabled }} | ||
| # Allow Prometheus metrics scraping | ||
| - ports: |
The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).
| - ports: | |
| - from: | |
| - podSelector: | |
| matchLabels: | |
| app.kubernetes.io/name: prometheus | |
| ports: |
| port: 9443 | ||
| {{- if .Values.global.prometheus.enabled }} | ||
| # Allow Prometheus metrics scraping | ||
| - ports: |
The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).
| - ports: | |
| - from: | |
| - namespaceSelector: {} | |
| podSelector: | |
| matchLabels: | |
| app.kubernetes.io/name: prometheus | |
| ports: |
| port: {{ .Values.dashboard.containerPort }} | ||
| {{- if .Values.global.prometheus.enabled }} | ||
| # Allow Prometheus metrics scraping | ||
| - ports: |
The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).
| - ports: | |
| - from: | |
| - namespaceSelector: | |
| matchLabels: | |
| kubernetes.io/metadata.name: "{{ .Release.Namespace }}" | |
| podSelector: | |
| matchLabels: | |
| app.kubernetes.io/name: prometheus | |
| ports: |
Radius functional test overview
Test status: ⌛ Building Radius and pushing container images for functional tests...
Description
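Add opt-in Kubernetes NetworkPolicy resources for all Radius components to restrict network traffic according to the principle of least privilege. This addresses multiple threats identified in the UCP threat assessment: THR03 (unauthenticated inter-component communication), THR08 (API server aggregation bypass), and THR10 (metrics endpoint information disclosure).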
Network Traffic Model
When enabled, each component gets a dedicated NetworkPolicy that restricts both ingress and egress.
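As an illustration of one such policy, the Dynamic RP rule (traffic from UCP only, on port 8082, plus DNS egress) might render roughly as follows. The resource name, the namespace, and the `dynamic-rp` label are assumptions; only DNS egress is shown because the other egress targets are not spelled out here.

```yaml
# Illustrative rendered output, not the chart's actual template.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dynamic-rp               # assumed name
  namespace: radius-system       # assumed release namespace
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: dynamic-rp   # assumed label
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only UCP may reach the dynamic RP, and only on 8082
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ucp
      ports:
        - protocol: TCP
          port: 8082
  egress:
    # DNS resolution
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Listing both `Ingress` and `Egress` under `policyTypes` is what makes the policy deny all traffic in either direction that no rule allows.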
Configuration
NetworkPolicies are disabled by default (`global.networkPolicy.enabled: false`) for backward compatibility and for clusters without a CNI plugin that supports NetworkPolicy.
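Opting in would then be a matter of overriding that value, for example in a values file passed with `-f`, or with `--set global.networkPolicy.enabled=true`:

```yaml
# Values override enabling the NetworkPolicy resources added by this PR.
global:
  networkPolicy:
    enabled: true   # chart default is false
```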
Prometheus metrics scraping rules are automatically included when `global.prometheus.enabled: true` (the default).
Changes
- `deploy/Chart/values.yaml`: Add `global.networkPolicy.enabled` toggle (default: false)
- `deploy/Chart/templates/*/networkpolicy.yaml`: Add NetworkPolicy manifests for UCP, controller, applications-rp, dynamic-rp, bicep-de, dashboard, and database
- `deploy/Chart/tests/networkpolicy_test.yaml`: Add Helm unit tests (10 test cases) covering enablement, conditional rendering, port correctness, pod selectors, and labels
Type of change
Fixes: #11271
Contributor checklist
Please verify that the PR meets the following requirements, where applicable: