Claude/fix portal rollout timeout e t hy i #532

Merged
aurelianware merged 5 commits into main from claude/fix-portal-rollout-timeout-eTHyI
Mar 21, 2026

Conversation

@aurelianware
Owner

No description provided.

claude added 4 commits March 21, 2026 02:10
Lower both the deployment replicas and HPA minReplicas to 1 to avoid
overwhelming a free-tier CosmosDB with concurrent connections.

https://claude.ai/code/session_01A95Uah18uxLJpuAR5HShNS

Scale down reference-data-service from 2 to 1 replicas (and HPA min
from 2 to 1), and portal HPA min from 2 to 1 to reduce resource
usage during initial rollout.

https://claude.ai/code/session_01A95Uah18uxLJpuAR5HShNS

encounter-service already had a MongoDB repository implementation
(EncounterRepositoryMongo) with conditional logic in Program.cs.
Updated its k8s deployment to provide MongoDb__ConnectionString
from mongodb-secret instead of CosmosDB endpoint/key, so the
service will use the MongoDB code path.

claims-scrubbing-service was fully refactored from @azure/cosmos
SDK to the mongodb driver:
- Replaced CosmosClient with MongoClient
- Replaced Container queries with MongoDB Collection find/insertOne
- Replaced Cosmos TTL with MongoDB expireAt field
- Updated health check to use db.command({ ping: 1 })
- Updated config types, env vars, and k8s deployment

Both services now use the shared mongodb-secret, eliminating the
need for a separate CosmosDB NoSQL account.

https://claude.ai/code/session_01A95Uah18uxLJpuAR5HShNS
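
The health-check change described in the commit message above can be sketched as follows. This is a minimal illustration, not the code from the PR: `Pingable`, `HealthStatus`, and `checkMongoHealth` are hypothetical names, and `Pingable` is a structural stand-in for the driver's `Db` object so the sketch compiles without the `mongodb` package installed. The only driver call assumed is `db.command({ ping: 1 })`, which the commit message names.

```typescript
// Structural stand-in for the MongoDB driver's Db (assumption: only the
// command() method is needed for a ping-based health check).
interface Pingable {
  command(cmd: { ping: 1 }): Promise<unknown>;
}

// Hypothetical health payload; the service's real type lives in types.ts.
interface HealthStatus {
  mongoDb: "connected" | "disconnected";
  error?: string;
}

// Reports "connected" when the ping round-trips, and "disconnected" with the
// error message when the server cannot be reached.
async function checkMongoHealth(db: Pingable): Promise<HealthStatus> {
  try {
    await db.command({ ping: 1 });
    return { mongoDb: "connected" };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { mongoDb: "disconnected", error: message };
  }
}
```

In the real service the argument would be the `Db` handle obtained from the connected `MongoClient`; catching the error keeps the health endpoint responsive even when the database is down.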
Copilot AI review requested due to automatic review settings March 21, 2026 03:27
@github-actions

Code Coverage

Package | Line Rate | Branch Rate | Health
CloudHealthOffice.Portal | 13% | 3% |
CloudHealthOffice.Portal | 13% | 3% |
Summary | 13% (2498 / 18662) | 3% (174 / 5968) |

@aurelianware aurelianware merged commit 6064394 into main Mar 21, 2026
59 checks passed
@aurelianware aurelianware deleted the claude/fix-portal-rollout-timeout-eTHyI branch March 21, 2026 03:31
Contributor

Copilot AI left a comment

Pull request overview

This PR updates Kubernetes deployment settings to reduce baseline replica counts and migrates the claims-scrubbing service configuration/runtime from Cosmos DB to MongoDB (plus aligns encounter-service’s k8s manifest with MongoDB settings).

Changes:

  • Reduce replicas / HPA minReplicas for portal, encounter-service, and reference-data-service.
  • Migrate claims-scrubbing-service from Cosmos DB to MongoDB (types/config, runtime client, k8s env vars, dependency swap).
  • Update encounter-service k8s to remove Cosmos DB config and use mongodb-secret.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.

File | Description
src/services/reference-data-service/k8s/reference-data-service-deployment.yaml | Lowers Deployment replicas and HPA minReplicas.
src/services/encounter-service/k8s/encounter-service-deployment.yaml | Switches manifest from Cosmos DB env vars to MongoDB and lowers replica counts.
src/services/claims-scrubbing-service/src/types.ts | Renames config/health types from Cosmos DB to MongoDB.
src/services/claims-scrubbing-service/src/index.ts | Loads MongoDB config from env vars and gates initialization on MongoDB connection string.
src/services/claims-scrubbing-service/src/claims-scrubber.ts | Replaces Cosmos DB client usage with MongoDB driver for rules/audit storage + health check.
src/services/claims-scrubbing-service/package.json | Removes @azure/cosmos, adds mongodb dependency.
src/services/claims-scrubbing-service/k8s/claims-scrubbing-service-deployment.yaml | Updates ConfigMap/env vars from COSMOS_* to MONGODB_*.
src/portal/CloudHealthOffice.Portal/k8s/portal-deployment.yaml | Lowers portal HPA minReplicas.

ASPNETCORE_ENVIRONMENT: "Production"
- CosmosDb__DatabaseName: "CloudHealthOffice"
- CosmosDb__ContainerName: "Encounters"
+ MongoDb__DatabaseName: "CloudHealthOffice"

Copilot AI Mar 21, 2026

MongoDb__DatabaseName is set to "CloudHealthOffice" here, while most other service deployments use "cloudhealthoffice". Because MongoDB database names are distinct by string, this can cause the encounter-service to read/write from a different database than the rest of the platform. Align the DB name with the shared convention unless intentional.

Suggested change
- MongoDb__DatabaseName: "CloudHealthOffice"
+ MongoDb__DatabaseName: "cloudhealthoffice"

Copilot uses AI. Check for mistakes.
Comment on lines 19 to 21
spec:
- replicas: 3
+ replicas: 1
selector:

Copilot AI Mar 21, 2026

This reduces the Deployment replica count from 3 to 1 (and HPA minReplicas to 1). If this manifest is used for production, a single replica reduces availability during node drain/rollouts and can cause brief downtime. If the goal is to avoid rollout timeouts due to capacity, consider keeping minReplicas >= 2 and adjusting resources/rollout strategy instead, or document that this manifest is for dev/staging.

Comment on lines 92 to 94
spec:
- replicas: 2
+ replicas: 1
selector:

Copilot AI Mar 21, 2026

This lowers replicas and HPA minReplicas from 2 to 1. For a production deployment this removes redundancy and increases risk of downtime during rollouts or node maintenance. If the goal is to address rollout timeouts/capacity, consider keeping minReplicas at 2 and adjusting resource requests/limits or rollout parameters instead, or document that this manifest targets non-prod environments.

kind: Deployment
name: portal
- minReplicas: 2 # Safe with Redis-backed session + token cache
+ minReplicas: 1 # Safe with Redis-backed session + token cache

Copilot AI Mar 21, 2026

Reducing portal HPA minReplicas from 2 to 1 removes redundancy; a single replica means any rollout/node disruption causes full portal downtime. If this change is to mitigate rollout timeouts due to cluster capacity, consider keeping minReplicas >= 2 for production and addressing the scheduling issue via resources/cluster sizing/rollout strategy, or document that this manifest is intended for smaller non-prod clusters.

Suggested change
- minReplicas: 1 # Safe with Redis-backed session + token cache
+ minReplicas: 2 # Production redundancy; use separate non-prod manifest if running a single replica

- auditContainerName: process.env.COSMOS_AUDIT_CONTAINER || 'validation-audit',
+ mongoDb: {
+ connectionString: process.env.MONGODB_CONNECTION_STRING || '',
+ databaseName: process.env.MONGODB_DATABASE || 'CloudHealthOffice',

Copilot AI Mar 21, 2026

Defaulting mongoDb.databaseName to "CloudHealthOffice" (and setting the ConfigMap to the same) is inconsistent with most other service manifests, which use "cloudhealthoffice". MongoDB treats database names as distinct strings, so this can split data across two DBs. Consider standardizing the default/configured DB name to the shared value used elsewhere.

Suggested change
- databaseName: process.env.MONGODB_DATABASE || 'CloudHealthOffice',
+ databaseName: process.env.MONGODB_DATABASE || 'cloudhealthoffice',
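
One way to guard against this kind of drift is to resolve the database name in a single place and flag values that differ from the shared name only by case. The sketch below is illustrative only: `SHARED_DATABASE_NAME`, `DatabaseNameResult`, and `resolveDatabaseName` are hypothetical names, and the environment is passed in as a plain object rather than read from `process.env` so the function stays easy to test. Depending on the server, a name that differs only in case may be treated as a separate database or rejected outright; either outcome is avoided by agreeing on one canonical spelling.

```typescript
// Shared convention reported in the review comment (used by most services).
const SHARED_DATABASE_NAME = "cloudhealthoffice";

// Hypothetical result type: the name to use plus an optional warning to log.
interface DatabaseNameResult {
  name: string;
  warning?: string;
}

// Falls back to the shared name when MONGODB_DATABASE is unset, and warns when
// the configured value matches the shared name except for letter case.
function resolveDatabaseName(
  env: Record<string, string | undefined>
): DatabaseNameResult {
  const configured = (env["MONGODB_DATABASE"] ?? "").trim();
  if (configured === "") {
    return { name: SHARED_DATABASE_NAME };
  }
  const caseOnlyMismatch =
    configured !== SHARED_DATABASE_NAME &&
    configured.toLowerCase() === SHARED_DATABASE_NAME;
  if (caseOnlyMismatch) {
    return {
      name: configured,
      warning: `MONGODB_DATABASE "${configured}" differs from "${SHARED_DATABASE_NAME}" only by case`,
    };
  }
  return { name: configured };
}
```

The function deliberately returns the configured value unchanged and leaves the decision to the caller, since silently rewriting a name could hide an intentional separation.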

Comment on lines +402 to +406
timestamp: new Date(),
+ expireAt: new Date(Date.now() + 90 * 24 * 60 * 60 * 1000), // 90 days TTL
};

- await this.auditContainer.items.create(auditRecord);
+ await this.auditCollection.insertOne(auditRecord);

Copilot AI Mar 21, 2026

expireAt is being written for a 90-day TTL, but MongoDB will only delete documents automatically if the collection has a TTL index on that field. Consider creating a TTL index on expireAt during initialization (or provisioning it explicitly), otherwise audit records will accumulate indefinitely.
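
A minimal sketch of the TTL index the comment asks for, assuming it would be created once during service initialization. `TtlIndexable`, `ensureAuditTtlIndex`, `auditExpiry`, and the index name are hypothetical; `TtlIndexable` is a structural stand-in for the driver's `Collection` so the sketch compiles without the `mongodb` package. The driver call relied on is `createIndex` with the `expireAfterSeconds` option.

```typescript
// Structural stand-in for the driver's Collection (assumption: only
// createIndex() is needed here).
interface TtlIndexable {
  createIndex(
    keys: Record<string, 1 | -1>,
    options: { expireAfterSeconds: number; name?: string }
  ): Promise<string>;
}

const AUDIT_RETENTION_MS = 90 * 24 * 60 * 60 * 1000; // 90 days, as in the diff

// expireAfterSeconds: 0 tells MongoDB to remove each document once the clock
// passes the Date stored in its expireAt field.
async function ensureAuditTtlIndex(audit: TtlIndexable): Promise<string> {
  return audit.createIndex(
    { expireAt: 1 },
    { expireAfterSeconds: 0, name: "audit_expireAt_ttl" } // hypothetical name
  );
}

// Mirrors the expireAt computation written to each audit record.
function auditExpiry(now: Date = new Date()): Date {
  return new Date(now.getTime() + AUDIT_RETENTION_MS);
}
```

Creating an index that already exists with the same keys and options is a no-op, so calling `ensureAuditTtlIndex(this.auditCollection)` on every startup should be safe. Deletion is performed by a background task on the server, so documents may outlive `expireAt` by a short interval.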

Comment on lines +101 to 102
+ minReplicas: 1
maxReplicas: 20

Copilot AI Mar 21, 2026

Lowering HPA minReplicas to 1 reduces baseline availability and can introduce downtime during maintenance/rollouts. If this is intended as a capacity/rollout-timeout mitigation, consider adding a note about environment scope (dev/staging) or keeping minReplicas >= 2 for production HA.

- COSMOS_DATABASE: "CloudHealthOffice"
- COSMOS_RULES_CONTAINER: "ScrubRules"
- COSMOS_AUDIT_CONTAINER: "ScrubAudit"
+ MONGODB_DATABASE: "CloudHealthOffice"

Copilot AI Mar 21, 2026

MONGODB_DATABASE is set to "CloudHealthOffice" here, but most other services use "cloudhealthoffice". If these services are expected to share a single Mongo database, this will create/use a different DB and lead to missing data. Align the configured DB name with the rest of the stack unless the separation is intentional and documented.

Suggested change
- MONGODB_DATABASE: "CloudHealthOffice"
+ MONGODB_DATABASE: "cloudhealthoffice"

"@azure/identity": "^4.13.0",
"@azure/keyvault-secrets": "^4.9.0",
"@azure/storage-blob": "^12.30.0",
+ "mongodb": "^6.12.0",

Copilot AI Mar 21, 2026

package.json adds mongodb and removes @azure/cosmos, but the repo’s package-lock.json still contains @azure/cosmos and does not include mongodb. This will likely break npm ci/reproducible builds. Regenerate and commit the updated lockfile so dependencies match package.json.

Suggested change
- "mongodb": "^6.12.0",
+ "@azure/cosmos": "^4.0.0",

Comment on lines 20 to 57
import { autoConfigureKeyVault, commonSecretMappings } from './keyvault-config';

// Auto-configure Key Vault on startup (before loading configuration)
// Note: This is a module-level flag suitable for Node.js single-threaded execution
// The service is designed to be initialized once on startup, not concurrently
let keyVaultConfigured = false;

// Default configuration (can be overridden by environment variables)
// Configuration is loaded after Key Vault auto-configuration
function getConfig(): ClaimsScrubberConfig {
return {
kafka: {
bootstrapServers: process.env.KAFKA_BOOTSTRAP_SERVERS || 'localhost:9092',
clientId: process.env.KAFKA_CLIENT_ID || 'claims-scrubber',
inboundTopic: process.env.INBOUND_CLAIMS_TOPIC || 'claims-inbound',
cleanClaimsTopic: process.env.CLEAN_CLAIMS_TOPIC || 'claims-adjudication',
flaggedClaimsTopic: process.env.FLAGGED_CLAIMS_TOPIC || 'claims-work-queue',
rejectedClaimsTopic: process.env.REJECTED_CLAIMS_TOPIC || 'claims-rejected',
consumerGroupId: process.env.KAFKA_CONSUMER_GROUP || 'claims-scrubber-group',
sasl: process.env.KAFKA_SASL_USERNAME ? {
mechanism: (process.env.KAFKA_SASL_MECHANISM || 'scram-sha-512') as 'plain' | 'scram-sha-256' | 'scram-sha-512',
username: process.env.KAFKA_SASL_USERNAME,
password: process.env.KAFKA_SASL_PASSWORD || '',
} : undefined,
ssl: process.env.KAFKA_SSL === 'true',
},
storage: {
accountName: process.env.STORAGE_ACCOUNT_NAME,
connectionString: process.env.STORAGE_CONNECTION_STRING,
containerName: process.env.CLAIMS_CONTAINER || 'claims-archive',
archivePathPattern: '{claimType}/{status}/{yyyy}/{MM}/{dd}',
},
- cosmosDb: {
- endpoint: process.env.COSMOS_ENDPOINT || '',
- databaseName: process.env.COSMOS_DATABASE || 'claims-scrubbing',
- rulesContainerName: process.env.COSMOS_RULES_CONTAINER || 'validation-rules',
- auditContainerName: process.env.COSMOS_AUDIT_CONTAINER || 'validation-audit',
+ mongoDb: {
+ connectionString: process.env.MONGODB_CONNECTION_STRING || '',
+ databaseName: process.env.MONGODB_DATABASE || 'CloudHealthOffice',
+ rulesCollectionName: process.env.MONGODB_RULES_COLLECTION || 'ScrubRules',
+ auditCollectionName: process.env.MONGODB_AUDIT_COLLECTION || 'ScrubAudit',
},

Copilot AI Mar 21, 2026

The service now reads MONGODB_CONNECTION_STRING (and other MONGODB_* vars), but startup still calls autoConfigureKeyVault(commonSecretMappings) and commonSecretMappings currently only maps Cosmos/Storage/Kafka secrets. In environments relying on Key Vault auto-configuration, MongoDB won’t be configured and svc.initialize() will be skipped. Add MongoDB secret mappings (or a Mongo-specific mapping list) so Key Vault can populate MONGODB_* variables.
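
The missing mapping could look roughly like the sketch below. Everything here is an assumption made for illustration: the PR does not show the shape of `commonSecretMappings` or the signature of `autoConfigureKeyVault`, so `SecretMapping`, `mongoSecretMappings`, `applySecretMappings`, and the secret name `mongodb-connection-string` are all hypothetical. The secret lookup is injected as a function so the sketch does not depend on `@azure/keyvault-secrets`.

```typescript
// Hypothetical mapping shape: one Key Vault secret feeds one env var.
interface SecretMapping {
  secretName: string;
  envVar: string;
}

// Hypothetical MongoDB entry to sit alongside the Cosmos/Storage/Kafka ones.
const mongoSecretMappings: SecretMapping[] = [
  { secretName: "mongodb-connection-string", envVar: "MONGODB_CONNECTION_STRING" },
];

// Copies secrets into env for every mapping whose variable is not already set,
// and returns the names of the variables it populated.
async function applySecretMappings(
  mappings: SecretMapping[],
  getSecret: (name: string) => Promise<string | undefined>,
  env: Record<string, string | undefined>
): Promise<string[]> {
  const populated: string[] = [];
  for (const { secretName, envVar } of mappings) {
    if (env[envVar]) continue; // an explicitly configured value wins
    const value = await getSecret(secretName);
    if (value) {
      env[envVar] = value;
      populated.push(envVar);
    }
  }
  return populated;
}
```

With a mapping like this in place before `getConfig()` runs, `MONGODB_CONNECTION_STRING` would be non-empty in Key Vault-only environments, so the gate on the connection string would no longer skip `svc.initialize()`.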
