
feat(helm): add Kubernetes NetworkPolicies for Radius components #11558

Open
officialasishkumar wants to merge 2 commits into radius-project:main from officialasishkumar:feat/add-kubernetes-networkpolicies

Conversation

@officialasishkumar

Description

Add opt-in Kubernetes NetworkPolicy resources for all Radius components to restrict network traffic according to the principle of least privilege. This addresses multiple threats identified in the UCP threat assessment:

  • THR03: Unauthenticated communication between UCP and resource providers
  • THR08: Kubernetes API server aggregation bypass (UCP reachable via ClusterIP/pod IP)
  • THR10: Metrics and health endpoint information disclosure

Network Traffic Model

When enabled, each component gets a dedicated NetworkPolicy that restricts both ingress and egress:

| Component | Ingress From | Egress To |
|---|---|---|
| UCP | All Radius components (port 9443), Prometheus (port 9090) | Radius components, DNS, K8s API |
| Controller | K8s API server webhooks (port 9443), Prometheus | UCP, DNS, K8s API |
| Applications RP | UCP only (ports 5443, 5444), Prometheus | Radius components, DNS, K8s API, external cloud APIs |
| Dynamic RP | UCP only (port 8082), Prometheus | Radius components, DNS, K8s API, external (Terraform downloads) |
| Bicep DE | UCP only (port 6443) | UCP only (callbacks), DNS |
| Dashboard | Any (user-facing, port 7007), Prometheus | UCP (API calls), DNS |
| Database | UCP, applications-rp, dynamic-rp (port 5432) | DNS only |
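
As a rough illustration of one row of this model, the manifest below sketches what a rendered policy for the Applications RP might look like. The policy name, the `radius-system` namespace, and the `app.kubernetes.io/name: applications-rp` label are assumptions made for the example; only the UCP label and the ports come from this PR. Egress is reduced to DNS here, and the other egress targets in the table are omitted.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: applications-rp            # hypothetical name
  namespace: radius-system         # assumed install namespace
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: applications-rp   # assumed pod label
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only UCP may reach the resource provider ports.
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ucp
      ports:
        - protocol: TCP
          port: 5443
        - protocol: TCP
          port: 5444
  egress:
    # DNS lookups; the remaining egress rules from the table are left out of this sketch.
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Because `policyTypes` lists both directions, any traffic not matched by a rule is dropped for the selected pods once a CNI plugin that enforces NetworkPolicy is in place.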

Configuration

NetworkPolicies are disabled by default for backward compatibility and for clusters without a CNI plugin that supports NetworkPolicy:

global:
  networkPolicy:
    enabled: false  # Set to true to enable

Prometheus metrics scraping rules are automatically included when global.prometheus.enabled: true (the default).
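
A values override that turns the policies on could look like the sketch below. The file name `networkpolicy-values.yaml` is hypothetical; the two keys are the ones described above.

```yaml
# networkpolicy-values.yaml (hypothetical file name)
global:
  networkPolicy:
    enabled: true      # opt in to the NetworkPolicy resources
  prometheus:
    enabled: true      # the default; keeps the metrics ingress rules rendered
```

Such a file would be passed to Helm with `-f` at install or upgrade time.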

Changes

  • deploy/Chart/values.yaml: Add global.networkPolicy.enabled toggle (default: false)
  • deploy/Chart/templates/*/networkpolicy.yaml: Add NetworkPolicy manifests for UCP, controller, applications-rp, dynamic-rp, bicep-de, dashboard, and database
  • deploy/Chart/tests/networkpolicy_test.yaml: Add Helm unit tests (10 test cases) covering enablement, conditional rendering, port correctness, pod selectors, and labels
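
For readers unfamiliar with helm-unittest, a minimal suite in the same spirit might look like the following. This is a sketch rather than a copy of `networkpolicy_test.yaml`: the suite name and test titles are made up, and only the template path and the `global.networkPolicy.enabled` key come from this PR.

```yaml
suite: NetworkPolicy rendering (illustrative)
templates:
  - ucp/networkpolicy.yaml
tests:
  - it: renders no documents when NetworkPolicies are disabled
    set:
      global.networkPolicy.enabled: false
    asserts:
      - hasDocuments:
          count: 0
  - it: renders a NetworkPolicy when enabled
    set:
      global.networkPolicy.enabled: true
    asserts:
      - isKind:
          of: NetworkPolicy
      - isAPIVersion:
          of: networking.k8s.io/v1
```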

Type of change

  • This pull request adds or changes features of Radius and has an approved issue (issue link required).

Fixes: #11271

Contributor checklist

Please verify that the PR meets the following requirements, where applicable:

  • An overview of proposed schema changes is included in a linked GitHub issue.
    • Yes
    • Not applicable
  • A design document PR is created in the design-notes repository, if new APIs are being introduced.
    • Yes
    • Not applicable
  • The design document has been reviewed and approved by Radius maintainers/approvers.
    • Yes
    • Not applicable
  • A PR for the samples repository is created, if existing samples are affected by the changes in this PR.
    • Yes
    • Not applicable
  • A PR for the documentation repository is created, if the changes in this PR affect the documentation or any user facing updates are made.
    • Yes
    • Not applicable
  • A PR for the recipes repository is created, if existing recipes are affected by the changes in this PR.
    • Yes
    • Not applicable

Add opt-in NetworkPolicy resources for all Radius components to restrict
network traffic according to the principle of least privilege. This
addresses threats THR03 (unauthenticated inter-component communication),
THR08 (API server aggregation bypass), and THR10 (metrics endpoint
information disclosure) from the UCP threat assessment.

Policies are disabled by default (global.networkPolicy.enabled: false)
for backward compatibility and for clusters without a CNI that supports
NetworkPolicy.

When enabled, each component gets a dedicated NetworkPolicy:
- UCP: accepts traffic from all Radius components, egress to RPs and
  Kubernetes API
- Controller: accepts webhook traffic, egress to UCP and Kubernetes API
- Applications RP: accepts traffic from UCP only on ports 5443/5444
- Dynamic RP: accepts traffic from UCP only on port 8082
- Bicep DE: accepts traffic from UCP only, egress only to UCP
- Dashboard: accepts user traffic, egress to UCP
- Database: accepts traffic from UCP and RPs only (when database is
  enabled)

All policies include DNS egress and conditional Prometheus metrics
ingress rules.

Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Copilot AI review requested due to automatic review settings April 5, 2026 19:00
@officialasishkumar officialasishkumar requested review from a team as code owners April 5, 2026 19:00
@officialasishkumar officialasishkumar requested a deployment to external-contributor-approval April 5, 2026 19:01 — with GitHub Actions Waiting

Copilot AI left a comment

Pull request overview

Adds an opt-in global.networkPolicy.enabled Helm value and corresponding Kubernetes NetworkPolicy resources to restrict ingress/egress between Radius components (and optionally metrics scraping), plus Helm unit tests to validate rendering and key fields.

Changes:

  • Introduce global.networkPolicy.enabled (default: false) in Helm values.
  • Add NetworkPolicy templates for UCP, controller, applications-rp, dynamic-rp, bicep-de, dashboard, and database.
  • Add Helm unit tests covering enablement/disablement and basic spec assertions.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.

| File | Description |
|---|---|
| deploy/Chart/values.yaml | Adds the global NetworkPolicy enablement toggle. |
| deploy/Chart/templates/ucp/networkpolicy.yaml | Defines UCP ingress/egress rules under the toggle. |
| deploy/Chart/templates/controller/networkpolicy.yaml | Defines controller ingress/egress rules under the toggle. |
| deploy/Chart/templates/rp/networkpolicy.yaml | Defines applications-rp ingress/egress rules under the toggle. |
| deploy/Chart/templates/dynamic-rp/networkpolicy.yaml | Defines dynamic-rp ingress/egress rules under the toggle. |
| deploy/Chart/templates/de/networkpolicy.yaml | Defines bicep-de ingress/egress rules under the toggle. |
| deploy/Chart/templates/dashboard/networkpolicy.yaml | Defines dashboard ingress/egress rules under the toggle. |
| deploy/Chart/templates/database/networkpolicy.yaml | Defines database ingress/egress rules under the toggle and database.enabled. |
| deploy/Chart/tests/networkpolicy_test.yaml | Adds helm-unittest coverage for template rendering and key fields. |

Comment on lines +18 to +22
    # Allow ingress from all Radius components (controller, DE, RPs)
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/part-of: radius

Copilot AI Apr 5, 2026

UCP is registered as an aggregated APIService (templates/ucp/apiservice.yaml), so the Kubernetes API server must be able to reach the UCP Service/Pods. This ingress rule only allows pods labeled app.kubernetes.io/part-of=radius, which won’t match API server traffic, so enabling NetworkPolicy can break the aggregated API. Add an explicit allow for the API server (ideally via configurable CIDRs/namespace+pod selectors depending on cluster type).
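
One possible shape for such an allowance is sketched below as a Helm template fragment for the UCP policy's `ingress` list. It assumes a list value `global.networkPolicy.kubernetesApiServerCidrs` (the same name used in a suggestion elsewhere in this review, not an existing chart value) and reuses port 9443 from the PR description.

```yaml
    {{- with .Values.global.networkPolicy.kubernetesApiServerCidrs }}
    # Allow the Kubernetes API server to reach UCP for the aggregated APIService.
    # The CIDR list is an assumed, operator-supplied value.
    - from:
        {{- range . }}
        - ipBlock:
            cidr: {{ . | quote }}
        {{- end }}
      ports:
        - protocol: TCP
          port: 9443
    {{- end }}
```

When the list is empty or unset, the `with` block renders nothing, so the rule only appears on clusters where the operator has supplied the API server addresses.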

          port: 9443
    {{- if .Values.global.prometheus.enabled }}
    # Allow Prometheus metrics scraping
    - ports:

Copilot AI Apr 5, 2026

The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the stated goal of limiting metrics/health info disclosure (THR10). Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).

Suggested change
-    - ports:
+    - from:
+        - namespaceSelector: {}
+          podSelector:
+            matchLabels:
+              app.kubernetes.io/name: prometheus
+      ports:
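
If the selector were made configurable, the rule might be templated roughly as follows. The value names `prometheusNamespaceSelector` and `prometheusPodSelector` are hypothetical and do not exist in the chart; port 9090 is the UCP metrics port listed in the PR description.

```yaml
    {{- if eq .Values.global.prometheus.enabled true }}
    # Allow metrics scraping only from pods matching the configured selectors.
    - from:
        - namespaceSelector:
            {{- toYaml .Values.global.networkPolicy.prometheusNamespaceSelector | nindent 12 }}
          podSelector:
            {{- toYaml .Values.global.networkPolicy.prometheusPodSelector | nindent 12 }}
      ports:
        - protocol: TCP
          port: 9090
    {{- end }}
```

Placing `namespaceSelector` and `podSelector` in the same list item means both must match, which is what restricts the rule to Prometheus pods in the selected namespaces.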

Comment on lines +48 to +49
        - ipBlock:
            cidr: 0.0.0.0/0

Copilot AI Apr 5, 2026

This egress rule uses ipBlock 0.0.0.0/0 on ports 443/6443, which effectively allows egress to any destination on those ports (not just the Kubernetes API server). If the intent is least-privilege API server access, make allowed API server CIDRs configurable (or otherwise scope this rule).

Suggested change
-        - ipBlock:
-            cidr: 0.0.0.0/0
+        {{- range .Values.global.networkPolicy.kubernetesApiServerCidrs }}
+        - ipBlock:
+            cidr: {{ . | quote }}
+        {{- end }}
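
With that suggestion applied, an operator would supply the API server addresses through values. A sketch of such an override follows; the CIDR is a placeholder, not a recommended value, and `kubernetesApiServerCidrs` is the name proposed in the suggestion rather than an existing chart key.

```yaml
global:
  networkPolicy:
    enabled: true
    # Placeholder address; replace with the API server endpoint(s) of the target cluster.
    kubernetesApiServerCidrs:
      - 10.96.0.1/32
```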

Comment on lines +1 to +2
{{- if .Values.global.networkPolicy.enabled }}
apiVersion: networking.k8s.io/v1

Copilot AI Apr 5, 2026

NetworkPolicy and Prometheus conditionals use bare truthiness checks (e.g., if .Values.global.prometheus.enabled). Existing Deployment templates use if eq .Values.global.prometheus.enabled true, which avoids treating string values like "false" as enabled. Align the conditionals to prevent surprising differences between rendered Deployments vs NetworkPolicies.
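
A sketch of the aligned guards is shown below. The metadata name and the elided body are illustrative only; the point is the `eq ... true` comparison on both toggles.

```yaml
{{- if eq .Values.global.networkPolicy.enabled true }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ucp            # illustrative name
spec:
  podSelector: {}      # illustrative; the real template selects the UCP pods
  ingress:
    {{- if eq .Values.global.prometheus.enabled true }}
    # Prometheus-only rules would go here.
    {{- end }}
{{- end }}
```

With a bare `if`, any non-empty string, including `"false"`, counts as true. Comparing against the boolean `true` means a string value is never silently treated as enabled.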

Comment on lines +18 to +22
    # Allow ingress from UCP only (deployment engine receives proxied requests)
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ucp

Copilot AI Apr 5, 2026

With Prometheus enabled, the DE exposes metrics on port 6443 (see templates/de/deployment.yaml prometheus.io/port). This NetworkPolicy only allows ingress from the UCP podSelector, so Prometheus scraping will be blocked when NetworkPolicy is enabled. Add an ingress allowance for Prometheus when metrics are enabled, or document that DE metrics aren’t supported under NetworkPolicy.
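
One way to add that allowance is sketched below: a second `from` entry on the existing port 6443 rule, rendered only when Prometheus is enabled. The `app.kubernetes.io/name: prometheus` label is an assumption about how the Prometheus pods are labeled, borrowed from the suggestions elsewhere in this review.

```yaml
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ucp
        {{- if eq .Values.global.prometheus.enabled true }}
        # Assumed Prometheus pod label; matches pods in any namespace.
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus
        {{- end }}
      ports:
        - protocol: TCP
          port: 6443
```

Entries in the `from` list are ORed, so UCP keeps its access and Prometheus gains access to the same port.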

    set:
      global.networkPolicy.enabled: true
      global.prometheus.enabled: false
    template: de/networkpolicy.yaml

Copilot AI Apr 5, 2026

This test doesn’t validate Prometheus-conditional rendering: de/networkpolicy.yaml has no Prometheus-specific ingress block, so spec.ingress length will be 1 regardless of global.prometheus.enabled. Point this test at a template that conditionally renders the Prometheus rule (e.g., ucp/controller/rp/dynamic-rp/dashboard) and assert presence/absence of the metrics ingress entry.

Suggested change
-    template: de/networkpolicy.yaml
+    template: dashboard/networkpolicy.yaml
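
A pair of tests along these lines could assert both states. This is a sketch: the suite name is made up, and the expected rule counts (1 and 2) assume the dashboard policy has exactly one user-facing ingress rule plus the optional metrics rule.

```yaml
suite: Prometheus-conditional ingress (illustrative)
templates:
  - dashboard/networkpolicy.yaml
tests:
  - it: omits the metrics ingress rule when Prometheus is disabled
    set:
      global.networkPolicy.enabled: true
      global.prometheus.enabled: false
    asserts:
      - lengthEqual:
          path: spec.ingress
          count: 1   # assumed: only the user-facing rule remains
  - it: adds the metrics ingress rule when Prometheus is enabled
    set:
      global.networkPolicy.enabled: true
      global.prometheus.enabled: true
    asserts:
      - lengthEqual:
          path: spec.ingress
          count: 2   # assumed: user-facing rule plus metrics rule
```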

          port: 5444
    {{- if .Values.global.prometheus.enabled }}
    # Allow Prometheus metrics scraping
    - ports:

Copilot AI Apr 5, 2026

The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).

Suggested change
-    - ports:
+    - from:
+        - namespaceSelector:
+            matchLabels:
+              kubernetes.io/metadata.name: "{{ .Release.Namespace }}"
+          podSelector:
+            matchLabels:
+              app.kubernetes.io/name: prometheus
+      ports:

          port: 8082
    {{- if .Values.global.prometheus.enabled }}
    # Allow Prometheus metrics scraping
    - ports:

Copilot AI Apr 5, 2026

The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).

Suggested change
-    - ports:
+    - from:
+        - podSelector:
+            matchLabels:
+              app.kubernetes.io/name: prometheus
+      ports:

          port: 9443
    {{- if .Values.global.prometheus.enabled }}
    # Allow Prometheus metrics scraping
    - ports:

Copilot AI Apr 5, 2026

The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).

Suggested change
-    - ports:
+    - from:
+        - namespaceSelector: {}
+          podSelector:
+            matchLabels:
+              app.kubernetes.io/name: prometheus
+      ports:

          port: {{ .Values.dashboard.containerPort }}
    {{- if .Values.global.prometheus.enabled }}
    # Allow Prometheus metrics scraping
    - ports:

Copilot AI Apr 5, 2026

The Prometheus ingress rule omits a from selector, which allows metrics access from any source in the cluster. That undermines the intent of restricting metrics/health endpoint exposure. Restrict this rule to Prometheus pods/namespaces (and consider making the selector configurable).

Suggested change
-    - ports:
+    - from:
+        - namespaceSelector:
+            matchLabels:
+              kubernetes.io/metadata.name: "{{ .Release.Namespace }}"
+          podSelector:
+            matchLabels:
+              app.kubernetes.io/name: prometheus
+      ports:

@officialasishkumar officialasishkumar temporarily deployed to external-contributor-approval April 5, 2026 19:16 — with GitHub Actions Inactive

radius-functional-tests bot commented Apr 6, 2026

Radius functional test overview

| Name | Value |
|---|---|
| Repository | officialasishkumar/radius |
| Commit ref | 4506aee |
| Unique ID | func6358c5ac79 |
| Image tag | pr-func6358c5ac79 |
  • gotestsum 1.13.0
  • KinD: v0.29.0
  • Dapr: 1.14.4
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.3.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func6358c5ac79
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func6358c5ac79
  • dynamic-rp test image location: ghcr.io/radius-project/dev/dynamic-rp:pr-func6358c5ac79
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-func6358c5ac79
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func6358c5ac79
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting corerp-cloud functional tests...
⌛ Starting ucp-cloud functional tests...
✅ ucp-cloud functional tests succeeded
✅ corerp-cloud functional tests succeeded

Development

Successfully merging this pull request may close these issues.

Security: Add Kubernetes NetworkPolicies for UCP and Radius components

2 participants