Fix all services failing in AKS: add missing secrets, fix image refs #530

Merged
aurelianware merged 1 commit into main from claude/fix-portal-rollout-timeout-eTHyI
Mar 20, 2026

@aurelianware
Owner

Root cause: nearly all services were stuck at 0/N replicas because the deploy workflow created only 3 K8s secrets, while the services reference 9 distinct secrets.

Secrets added:

  • mongodb-secret (from Cosmos DB connection string, MongoDB API)
  • cosmos-db-secret (endpoint + key for Cosmos DB SDK services)
  • cosmos-config (same, used by trading-partner-service)
  • redis-secret (for claims-service, benefit-plan-service)
  • kafka-secret (for claims-scrubbing-service)
  • azure-storage-secret (for appeals, claims-scrubbing)
  • azure-ad-config: added missing Audience key

Also fixed:

  • sed replacement now handles ghcr.io image refs (6 services)
  • trading-partner-service containerPort 80 → 8080 (matches health probes)

Required GitHub secrets to add:
COSMOS_DB_ENDPOINT, COSMOS_DB_KEY, REDIS_CONNECTION_STRING, KAFKA_SASL_USERNAME, KAFKA_SASL_PASSWORD, AZURE_STORAGE_CONNECTION_STRING, AZURE_AD_AUDIENCE

https://claude.ai/code/session_01A95Uah18uxLJpuAR5HShNS

@github-actions

Code Coverage

| Package | Line Rate | Branch Rate | Health |
| --- | --- | --- | --- |
| CloudHealthOffice.Portal | 13% | 3% | |
| CloudHealthOffice.Portal | 13% | 3% | |
| Summary | 13% (2498 / 18662) | 3% (174 / 5968) | |

@aurelianware aurelianware merged commit 30c13bc into main Mar 20, 2026
61 checks passed
@aurelianware aurelianware deleted the claude/fix-portal-rollout-timeout-eTHyI branch March 20, 2026 20:21
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the AKS deployment workflow and a service manifest to prevent pods from staying at 0/N replicas by ensuring required Kubernetes Secrets exist, normalizing image references to the built ACR images, and aligning the trading-partner-service container port with its probes.

Changes:

  • Add creation of several missing Kubernetes Secrets (MongoDB/Cosmos/Redis/Kafka/Azure Storage) and include Azure AD Audience in the existing Azure AD secret.
  • Update sed substitutions so service manifests using ghcr.io/... image refs are rewritten to ${ACR}/... during deploy.
  • Fix trading-partner-service container port to 8080.
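The image-reference rewrite mentioned in the list above could look roughly like the sketch below. The workflow's actual sed expression is not quoted in this conversation, so the helper name `rewrite_image_refs`, the assumed `ghcr.io/<owner>/<image>:<tag>` layout, and the sample registry name are illustrative assumptions rather than the merged code.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite_image_refs is a hypothetical helper, not the workflow's code.
# Assumes manifests reference images as ghcr.io/<owner>/<image>:<tag> and that the
# ACR login server (for example, example.azurecr.io) is passed as the first argument.
rewrite_image_refs() {
  local acr="$1"
  # On "image:" lines, replace "ghcr.io/<owner>/" with "<acr>/".
  # Every other line passes through unchanged.
  sed -E "s#(image:[[:space:]]*)ghcr\.io/[^/]+/#\1${acr}/#"
}
```

A deploy step might then pipe each manifest through it, for example `rewrite_image_refs "$ACR" < deployment.yaml | kubectl apply -f -`, assuming `ACR` holds the registry login server.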

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/services/trading-partner-service/k8s/trading-partner-service-deployment.yaml | Updates container port to 8080 to match health probes/targetPort. |
| .github/workflows/deploy-azure-aks.yml | Creates additional Secrets required by services, adds Audience to Azure AD secret, and rewrites GHCR image refs to ACR during manifest apply. |

Comment on lines +328 to +333
- name: Create MongoDB secret (Cosmos DB MongoDB API)
  run: |
    kubectl create secret generic mongodb-secret \
      --from-literal=connectionString="${{ secrets.COSMOS_DB_CONNECTION_STRING }}" \
      -n ${{ env.NAMESPACE }} \
      --dry-run=client -o yaml | kubectl apply -f -

Copilot AI Mar 20, 2026

The workflow is interpolating GitHub secrets directly into the shell script (e.g., "${{ secrets.COSMOS_DB_CONNECTION_STRING }}"). If any secret contains characters like $( or backticks, bash will evaluate them during script parsing, and even benign characters can cause quoting/escaping issues. Prefer passing secrets via the step env: block and referencing them as normal shell variables when building kubectl create secret commands (this avoids re-parsing secret contents by the shell).
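One way to apply this suggestion is sketched below. It assumes the step gains an `env:` block that maps the secret and namespace into environment variables (shown in the comment); the function name `apply_mongodb_secret` is hypothetical, and the kubectl invocation is otherwise the same as in the lines under review.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes the workflow step maps values into the environment, e.g.:
#   env:
#     COSMOS_DB_CONNECTION_STRING: ${{ secrets.COSMOS_DB_CONNECTION_STRING }}
#     NAMESPACE: ${{ env.NAMESPACE }}
# apply_mongodb_secret is a hypothetical helper name.
apply_mongodb_secret() {
  # The secret is read from the environment at run time, so its contents are
  # never parsed as shell source: "$(" or backticks inside it stay literal.
  kubectl create secret generic mongodb-secret \
    --from-literal=connectionString="${COSMOS_DB_CONNECTION_STRING}" \
    -n "${NAMESPACE}" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

Because the GitHub expression is no longer pasted into the script text, the script that bash parses is identical on every run regardless of what the secret contains.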

Comment on lines +335 to +341
- name: Create Cosmos DB endpoint/key secret
  run: |
    kubectl create secret generic cosmos-db-secret \
      --from-literal=endpoint="${{ secrets.COSMOS_DB_ENDPOINT }}" \
      --from-literal=key="${{ secrets.COSMOS_DB_KEY }}" \
      -n ${{ env.NAMESPACE }} \
      --dry-run=client -o yaml | kubectl apply -f -

Copilot AI Mar 20, 2026

These secret-creation steps will still succeed if the required GitHub secrets are missing (GitHub expressions become empty strings), resulting in Kubernetes Secrets with empty values and hard-to-diagnose runtime failures. Add an explicit validation/guard in the run block (or a dedicated step) to fail the job when required values like COSMOS_DB_ENDPOINT/COSMOS_DB_KEY are unset or empty before calling kubectl create secret.
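A guard of the kind described here might be sketched as follows. The helper name `require_env` is hypothetical, and the sketch assumes the secrets have already been mapped into environment variables of the same names through the step's `env:` block.

```shell
#!/usr/bin/env bash
# Sketch only: require_env is a hypothetical helper, not part of the workflow.
# Returns non-zero, and names each offender on stderr, if any listed
# environment variable is unset or empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion; the ":-" default keeps
    # "set -u" from aborting on an unset variable.
    if [ -z "${!name:-}" ]; then
      echo "Required value ${name} is unset or empty" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_env COSMOS_DB_ENDPOINT COSMOS_DB_KEY || exit 1` ahead of `kubectl create secret` would fail the job early instead of producing a Kubernetes Secret with empty values.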
