feat (in-cluster): [webapprouting] replace traefik with nginx addon #446

Open
ferantivero wants to merge 27 commits into mspnp:main from ferantivero:feature/494723_application-routing-addon-nginx

ferantivero (Contributor) commented Oct 22, 2025

WHY

We wanted to experiment with the NGINX add-on (web app routing) as a replacement for the existing ingress controller (Traefik) in the AKS Baseline Reference Implementation. This removes one of the "manual" dependencies we took, streamlines the deployment process, and relies on built-in AKS features.

WHAT Changed?

  • enabled the built-in web application routing profile, with the default ingress controller configured as the internal type
  • integrated with Azure Key Vault (akv)
  • integrated with the Private DNS Zone (pdz)
  • created a second internal ingress controller instance to configure load balancer preferences, such as the desired subnet
  • configured the NGINX ingress object
  • streamlined deployment steps
  • minor improvements
  • minor bug fixes, and removed fields unsupported by the latest managed cluster resource provider API version
  • removed traefik
  • merged the default and second ingress controllers into a single one
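
As a rough illustration of the internal ingress controller instance mentioned in the list above, the following sketch shows what an `NginxIngressController` custom resource for the application routing add-on might look like. The resource name, the `nginx-internal` class name, and the subnet placeholder are assumptions made for illustration; the manifest actually shipped in this PR may differ.

```yaml
# Hypothetical sketch, not the manifest from this PR.
apiVersion: approuting.kubernetes.azure.com/v1alpha1
kind: NginxIngressController
metadata:
  name: nginx-internal                 # assumed resource name
spec:
  ingressClassName: nginx-internal     # class that Ingress objects reference
  controllerNamePrefix: nginx-internal # prefix for the managed controller resources
  loadBalancerAnnotations:
    # Request an internal load balancer (ILB) instead of a public one.
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    # Place the ILB frontend in a specific subnet (placeholder value).
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "<ingress-subnet-name>"
```

The load balancer preferences are expressed as annotations that the add-on copies onto the controller's Service, which is why the subnet choice lives here rather than on the Ingress object.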

TEST

Tested end to end (e2e). We plan to test once more after addressing feedback and upon approval.

highlights:

- configure the application routing add-on to automatically create records
on the Ingress private DNS zones
…ancer

highlights:
- configure nginx with an ILB
- update the managed cluster api version
Copilot AI review requested due to automatic review settings October 22, 2025 18:13

Copilot AI left a comment

Pull Request Overview

This PR replaces the manually deployed Traefik ingress controller with the AKS-managed NGINX Web App Routing addon to streamline the deployment process. The change enables a built-in AKS feature for ingress management with integrated Azure Key Vault and Private DNS Zone support.

Key Changes:

  • Enabled the built-in Web App Routing addon with NGINX ingress controller configured for internal load balancing
  • Integrated managed identity for Key Vault certificate access and Private DNS Zone record management
  • Removed all Traefik-related resources and dependencies (deployment manifests, CSI provider configuration, workload identity setup)

Reviewed Changes

Copilot reviewed 18 out of 20 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| workload/traefik.yaml | Removed entire Traefik ingress controller manifest including ServiceAccount, RBAC, ConfigMap, Service, and Deployment |
| workload/02-aspnetapp-ingress.yaml | Updated ingress resource to use nginx-internal class with Key Vault certificate integration and NGINX-specific annotations |
| workload/01-aspnetapp.yaml | Modified pod security context, added health probes, updated affinity rules to target NGINX ingress controller, added NET_BIND_SERVICE capability |
| workload-team/cluster-stamp.bicep | Enabled webAppRouting profile, configured DNS zone integration, removed podmi-ingress-controller resources, added role assignments for web app routing managed identity, upgraded API version to 2025-07-02-preview |
| cluster-manifests/a0008/nginx-internal.yaml | Added new NginxIngressController custom resource defining internal load balancer configuration |
| cluster-manifests/a0008/ingress-network-policy.yaml | Updated network policy to allow traffic from app-routing-system namespace and nginx-internal ingress controller pods |
| workload-team/modules/policies.bicep | Removed Traefik-related policy violation comments and updated resource limit comments |
| docs/* | Updated documentation to reflect NGINX ingress controller, removed Traefik installation steps, updated navigation links |
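
To make the ingress change summarized for `workload/02-aspnetapp-ingress.yaml` more concrete, here is a minimal sketch of an Ingress that targets the `nginx-internal` class and asks the add-on to source its TLS certificate from Key Vault. The object name, host, backend service, and certificate URI are placeholders, and the `keyvault-<ingress name>` secret naming follows the add-on's documented convention as I understand it; treat all of these as assumptions rather than the PR's actual manifest.

```yaml
# Hypothetical sketch, not the manifest from this PR.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aspnetapp-ingress              # assumed name
  namespace: a0008
  annotations:
    # URI of the certificate imported into Key Vault (placeholder value).
    kubernetes.azure.com/tls-cert-keyvault-uri: "<key-vault-certificate-uri>"
spec:
  ingressClassName: nginx-internal     # must match the controller's class
  tls:
  - hosts:
    - "<app-host>.aks-ingress.<your-domain>"
    secretName: keyvault-aspnetapp-ingress   # assumed: keyvault-<ingress name>
  rules:
  - host: "<app-host>.aks-ingress.<your-domain>"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: aspnetapp-service    # assumed service name
            port:
              number: 80
```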

Comment on lines 35 to 36
cat nginx-iternal-ingress-controller-tls.crt nginx-iternal-ingress-controller-tls.key > nginx-iternal-ingress-controller-tls.pem
export INGRESS_CONTROLLER_KV_CERT_URI=$(az keyvault certificate import -f nginx-iternal-ingress-controller-tls.pem -n nginx-iternal-ingress-controller-tls --vault-name $KEYVAULT_NAME_AKS_BASELINE --query id -o tsv)

Copilot AI Oct 22, 2025

Corrected spelling of 'iternal' to 'internal' in certificate filename.

Suggested change
- cat nginx-iternal-ingress-controller-tls.crt nginx-iternal-ingress-controller-tls.key > nginx-iternal-ingress-controller-tls.pem
- export INGRESS_CONTROLLER_KV_CERT_URI=$(az keyvault certificate import -f nginx-iternal-ingress-controller-tls.pem -n nginx-iternal-ingress-controller-tls --vault-name $KEYVAULT_NAME_AKS_BASELINE --query id -o tsv)
+ cat nginx-internal-ingress-controller-tls.crt nginx-internal-ingress-controller-tls.key > nginx-internal-ingress-controller-tls.pem
+ export INGRESS_CONTROLLER_KV_CERT_URI=$(az keyvault certificate import -f nginx-internal-ingress-controller-tls.pem -n nginx-internal-ingress-controller-tls --vault-name $KEYVAULT_NAME_AKS_BASELINE --query id -o tsv)

ferantivero (Contributor, Author)

done | addressed from f825b73


```bash
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -out traefik-ingress-internal-aks-ingress-tls.crt -keyout traefik-ingress-internal-aks-ingress-tls.key -subj "/CN=*.aks-ingress.${DOMAIN_NAME_AKS_BASELINE}/O=Contoso AKS Ingress"
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -out nginx-iternal-ingress-controller-tls.crt -keyout nginx-iternal-ingress-controller-tls.key -subj "/CN=*.aks-ingress.${DOMAIN_NAME_AKS_BASELINE}/O=Contoso AKS Ingress"

Copilot AI Oct 22, 2025

Corrected spelling of 'iternal' to 'internal' in certificate filename.

ferantivero (Contributor, Author)

done | addressed from f825b73

az role assignment delete --ids $TEMP_ROLEASSIGNMENT_TO_UPLOAD_CERT
```

## Check internal NGINX ingress controller is up and runnning

Copilot AI Oct 22, 2025

Corrected spelling of 'runnning' to 'running'.

Suggested change
- ## Check internal NGINX ingress controller is up and runnning
+ ## Check internal NGINX ingress controller is up and running

ferantivero (Contributor, Author)

done | addressed from f825b73

securityContext:
runAsUser: 10001
runAsGroup: 3000
securityContext: {}

Copilot AI Oct 22, 2025

The pod-level securityContext is now empty but container-level security settings (runAsNonRoot, runAsUser, runAsGroup) are defined at lines 77-79. Consider whether pod-level fsGroup setting should be retained for consistent group ownership of volumes, or document why it was intentionally removed.

Suggested change
- securityContext: {}
+ securityContext:
+   fsGroup: 3000

ferantivero (Contributor, Author)

done | addressed from 5f0d237

}

// Built-in Azure RBAC role that is applied a Key Vault to grant with metadata, certificates, keys and secrets read privileges. Granted to App Gateway's managed identity.
// Built-in Azure RBAC role that is applied a Key Vault to grant with metadata, certificates, keys and secrets read privileges. Granted to App Gateway's managed identity and our web app routing profile's managed identiy.

Copilot AI Oct 22, 2025

Corrected spelling of 'identiy' to 'identity'.

ferantivero (Contributor, Author)

done | addressed from f825b73

}

// Built-in Azure RBAC role that is applied to a Key Vault to grant with secrets content read privileges. Granted to both Key Vault and our workload's identity.
// Built-in Azure RBAC role that is applied to a Key Vault to grant with secrets content read privileges. Granted to our web app routing profile's managed identiy.

Copilot AI Oct 22, 2025

Corrected spelling of 'identiy' to 'identity'.

ferantivero (Contributor, Author)

done | addressed from f825b73

scope: subscription()
}

// Built-in Azure RBAC role that is applied to a Private DNS Zone to grant with contributor privileges. Granted our web app routing profile's managed identiy.

Copilot AI Oct 22, 2025

Corrected spelling of 'identiy' to 'identity'.

ferantivero (Contributor, Author)

done | addressed from f825b73

highlights:

- CIS Benchmarks and most cluster-hardening guides recommend non-root
  users plus fsGroup to uphold the principle of least privilege while
  keeping volumes writable.
- fsGroup is not redundant, since the workload is not configured as read-only.
- sharing one value between fsGroup and runAsGroup (the primary process GID)
  is fine for simple cases: the process can read and write volume files
  without additional permission adjustments, because its own group is
  enough to access its volumes. In other words, the same group ID governs
  both “who I am” and “what I can write.”
Co-authored-by: John Downs <john@johndowns.co.nz>
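
The fsGroup and runAsGroup reasoning in the commit message above can be sketched as a pod spec fragment. The IDs (10001 and 3000) are the ones quoted in the review thread; the pod name, image placeholder, and `emptyDir` volume are assumptions added only to make the fragment complete, and this is not the workload manifest from the PR.

```yaml
# Hypothetical sketch illustrating a shared fsGroup / runAsGroup value.
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo                   # assumed name
spec:
  securityContext:
    fsGroup: 3000                      # group ownership applied to mounted volumes
  containers:
  - name: app
    image: "<your-image>"
    securityContext:
      runAsNonRoot: true
      runAsUser: 10001
      runAsGroup: 3000                 # primary process GID, same value as fsGroup
      allowPrivilegeEscalation: false
    volumeMounts:
    - name: scratch
      mountPath: /tmp/scratch          # writable by GID 3000 through fsGroup
  volumes:
  - name: scratch
    emptyDir: {}
```

Because the process's primary group and the volume's owning group match, files the process creates on the volume stay accessible to it without extra permission changes.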