
feat: cron to enqueue inactive organizations for deletion #2418

Open
wilsonrivera wants to merge 30 commits into main from wilson/eng-7753-delete-inactive-organizations

Conversation

@wilsonrivera
Contributor

@wilsonrivera wilsonrivera commented Dec 16, 2025

This PR introduces a script that enqueues inactive organizations for deletion and sends their owners a message so they can prevent the deletion before the time runs out.

Currently, only organizations that have been inactive for more than 3 months and have a single member are considered for deletion.

Summary by CodeRabbit

  • New Features

    • Admins are emailed when an organization’s deletion is queued.
    • Automated detection and queuing of inactive single‑user organizations for scheduled deletion.
  • Improvements

    • Deletion flow now runs inside a transaction with audit logging and optional notifications.
    • Per‑request configurable deletion delay supported.
  • Chores

    • Background queue orchestration expanded and job scheduling wired end‑to‑end.
    • Logging metadata keys standardized; raw-body plugin configuration adjusted; dependency updates applied.

Checklist

  • I have discussed my proposed changes in an issue and have received approval to proceed.
  • I have followed the coding standards of the project.
  • Tests or benchmarks have been added or updated.
  • Documentation has been updated on https://github.com/wundergraph/cosmo-docs.
  • I have read the Contributors Guide.

@coderabbitai
Contributor

coderabbitai bot commented Dec 16, 2025

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds two BullMQ queues/workers (notify deletion queued, queue inactive-org deletions), wires them into build-server with mailer/keycloak/db/redis, makes org-deletion scheduling configurable, wraps deleteOrganization flow in a DB transaction, normalizes stalled-job log keys, and updates bullmq/ioredis versions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| New notification worker & queue: controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts | Adds NotifyOrganizationDeletionQueuedInput, NotifyOrganizationDeletionQueuedQueue, and createNotifyOrganizationDeletionQueuedWorker; worker loads org and admins, formats timestamps, and sends deletion-queued emails via mailer. |
| New inactive-orgs queue & worker: controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts | Adds QueueInactiveOrganizationsDeletionInput, QueueInactiveOrganizationsDeletionQueue, and createQueueInactiveOrganizationsDeletionWorker; finds inactive single-user orgs, validates via audit logs/sessions/Keycloak, and enqueues deletion + notification jobs (includes scheduleJob). |
| Build server wiring: controlplane/src/core/build-server.ts | Instantiates new queues with fastify.redisForQueue, registers both new workers (passing redisConnection, db, logger, mailer, realm, keycloak, deleteOrganizationQueue, notifyOrganizationDeletionQueuedQueue), calls queueInactiveOrganizationsDeletionQueue.scheduleJob(), updates raw-body plugin options, and simplifies metrics import. |
| Repository, deletion scheduling API: controlplane/src/core/repositories/OrganizationRepository.ts | queueOrganizationDeletion signature updated to accept optional deleteDelayInDays?, removed transaction wrapper, computes delay from input or default, and returns the queued job result. |
| Transactional delete flow: controlplane/src/core/bufservices/organization/deleteOrganization.ts | Wraps delete flow in a DB transaction, constructs repositories with tx, enforces admin/min-org-count checks, records audit log, queues deletion with computed dates, and optionally sends notification emails. |
| Stalled-job log key normalization: controlplane/src/core/workers/* (controlplane/src/core/workers/CacheWarmerWorker.ts, .../DeactivateOrganizationWorker.ts, .../DeleteOrganizationAuditLogsWorker.ts, .../DeleteOrganizationWorker.ts, .../DeleteUserQueue.ts, .../ReactivateOrganizationWorker.ts) | Replaces joinId with jobId in stalled-event log payloads across multiple workers; no behavioral changes. |
| Dependency updates: controlplane/package.json | Pins/updates bullmq to 5.66.4 and ioredis to 5.8.2 (caret removed). |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: cron to enqueue inactive organizations for deletion' accurately and concisely summarizes the main feature being added: a scheduled cron job to queue inactive organizations for deletion. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (5)
controlplane/src/bin/db-cleanup.ts (2)

88-101: Remove @ts-ignore and simplify chunk logic.

The @ts-ignore suppresses a type error for a condition that's always false (MAX_DEGREE_OF_PARALLELISM === 1 when it's defined as 5). Consider removing the special case or making it configurable if the single-threaded path is needed.

 function chunkArray<T>(data: T[]): T[][] {
-  // @ts-ignore
-  if (MAX_DEGREE_OF_PARALLELISM === 1) {
-    return [data];
-  }
-
   const chunks: T[][] = [];
   const organizationsPerChunk = Math.ceil(ORGANIZATIONS_PER_BUCKET / MAX_DEGREE_OF_PARALLELISM);
   for (let i = 0; i < data.length; i += organizationsPerChunk) {
     chunks.push(data.slice(i, i + organizationsPerChunk));
   }
-
   return chunks;
 }
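
A standalone version of the simplified helper might look like the sketch below. MAX_DEGREE_OF_PARALLELISM = 5 comes from the comment above; ORGANIZATIONS_PER_BUCKET = 100 is an assumed value chosen only for illustration, since the real constant is not shown in this thread.

```typescript
// Value taken from the review comment above.
const MAX_DEGREE_OF_PARALLELISM = 5;
// Assumed for illustration; the actual value in db-cleanup.ts is not shown here.
const ORGANIZATIONS_PER_BUCKET = 100;

// Splits a bucket into chunks so that a full bucket is spread across
// MAX_DEGREE_OF_PARALLELISM chunks.
function chunkArray<T>(data: T[]): T[][] {
  const chunks: T[][] = [];
  const organizationsPerChunk = Math.ceil(ORGANIZATIONS_PER_BUCKET / MAX_DEGREE_OF_PARALLELISM);
  for (let i = 0; i < data.length; i += organizationsPerChunk) {
    chunks.push(data.slice(i, i + organizationsPerChunk));
  }
  return chunks;
}

// 45 items with a chunk size of 20 yield chunks of 20, 20, and 5 items.
const sizes = chunkArray(Array.from({ length: 45 }, (_, i) => i)).map((c) => c.length);
```

With these assumed constants the chunk size is 20, so a partially filled bucket produces fewer than five chunks rather than five smaller ones.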

117-117: Consider using the pino logger consistently.

A pino logger is created on line 62, but the script uses console.log/console.error for output (lines 117, 134, 138, 141, 156, 189). For consistency with the rest of the codebase and better structured logging, consider using the pino logger throughout.

Also applies to: 138-138, 141-141, 156-156, 189-189

controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (2)

86-87: Use getOrganizationAdmins instead of filtering all members.

OrganizationRepository has a getOrganizationAdmins method that directly returns admin members. This avoids loading RBAC data for all members just to filter.

-    const organizationMembers = await orgRepo.getMembers({ organizationID: org.id });
-    const orgAdmins = organizationMembers.filter((m) => m.rbac.isOrganizationAdmin);
+    const orgAdmins = await orgRepo.getOrganizationAdmins({ organizationID: org.id });

100-100: Avoid direct process.env access; pass webBaseUrl via configuration.

The worker accesses process.env.WEB_BASE_URL directly, which is inconsistent with other workers that receive configuration through their input options. This also makes testing harder.

+// In createNotifyOrganizationDeletionQueuedWorker input:
+  webBaseUrl: string;

 // In handler:
-        restoreLink: `${process.env.WEB_BASE_URL}/${org.slug}/settings`,
+        restoreLink: `${this.input.webBaseUrl}/${org.slug}/settings`,
controlplane/src/core/build-server.ts (1)

406-411: Pass webBaseUrl to the worker for consistency.

The worker uses process.env.WEB_BASE_URL directly, but opts.auth.webBaseUrl is available here. Pass it to maintain consistency with how other components receive configuration.

   createNotifyOrganizationDeletionQueuedWorker({
     redisConnection: fastify.redisForWorker,
     db: fastify.db,
     logger,
     mailer: mailerClient,
+    webBaseUrl: opts.auth.webBaseUrl,
   }),
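
The configuration-injection idea from the two comments above can be sketched in isolation. NotifyWorkerInput and buildRestoreLink are hypothetical names, and the trailing-slash trimming is an extra safeguard that is not part of the suggested diff.

```typescript
// Hypothetical input shape; only webBaseUrl matters for this sketch.
interface NotifyWorkerInput {
  webBaseUrl: string;
}

// Builds the settings link from injected configuration instead of process.env.
function buildRestoreLink(input: NotifyWorkerInput, orgSlug: string): string {
  // Trim trailing slashes so the result never contains "//" before the slug.
  const base = input.webBaseUrl.replace(/\/+$/, '');
  return `${base}/${orgSlug}/settings`;
}

// Example with a placeholder base URL.
const link = buildRestoreLink({ webBaseUrl: 'https://cosmo.example.com/' }, 'acme');
```

Because the base URL arrives as a plain argument, a test can call the function directly without mutating process.env.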
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between caaf2bd and 4804e55.

📒 Files selected for processing (5)
  • controlplane/src/bin/db-cleanup.ts (1 hunks)
  • controlplane/src/core/build-server.ts (4 hunks)
  • controlplane/src/core/repositories/OrganizationRepository.ts (2 hunks)
  • controlplane/src/core/routes.ts (2 hunks)
  • controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-29T10:28:04.846Z
Learnt from: JivusAyrus
Repo: wundergraph/cosmo PR: 2156
File: controlplane/src/core/repositories/SubgraphRepository.ts:1749-1751
Timestamp: 2025-08-29T10:28:04.846Z
Learning: In the controlplane codebase, authentication and authorization checks (including organization scoping) are handled at the service layer in files like unlinkSubgraph.ts before calling repository methods. Repository methods like unlinkSubgraph() in SubgraphRepository.ts can focus purely on data operations without redundant security checks.

Applied to files:

  • controlplane/src/bin/db-cleanup.ts
🧬 Code graph analysis (5)
controlplane/src/core/repositories/OrganizationRepository.ts (1)
controlplane/src/core/constants.ts (1)
  • delayForManualOrgDeletionInDays (10-10)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (3)
controlplane/src/core/workers/Worker.ts (2)
  • IQueue (3-7)
  • IWorker (9-11)
controlplane/src/core/services/Mailer.ts (1)
  • Mailer (13-101)
controlplane/src/core/repositories/OrganizationRepository.ts (1)
  • OrganizationRepository (50-1681)
controlplane/src/bin/db-cleanup.ts (5)
controlplane/src/core/plugins/redis.ts (1)
  • createRedisConnections (29-86)
controlplane/src/core/workers/DeleteOrganizationWorker.ts (1)
  • DeleteOrganizationQueue (20-62)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1)
  • NotifyOrganizationDeletionQueuedQueue (18-60)
controlplane/src/db/schema.ts (2)
  • organizations (1266-1289)
  • auditLogs (1936-1972)
controlplane/src/core/repositories/OrganizationRepository.ts (1)
  • OrganizationRepository (50-1681)
controlplane/src/core/routes.ts (1)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1)
  • NotifyOrganizationDeletionQueuedQueue (18-60)
controlplane/src/core/build-server.ts (1)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (2)
  • NotifyOrganizationDeletionQueuedQueue (18-60)
  • createNotifyOrganizationDeletionQueuedWorker (112-134)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build_test
  • GitHub Check: build_push_image
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (6)
controlplane/src/core/repositories/OrganizationRepository.ts (1)

942-970: LGTM! Clean extension of deletion scheduling.

The optional deleteDelayInDays parameter provides flexibility for different deletion workflows while maintaining backward compatibility by falling back to delayForManualOrgDeletionInDays.
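
The fallback described above can be illustrated with a small sketch. computeDeletionDates is a hypothetical helper, and the default of 3 days is an assumed placeholder; the real value of delayForManualOrgDeletionInDays is defined in controlplane/src/core/constants.ts and is not shown in this thread.

```typescript
// Assumed placeholder value; see controlplane/src/core/constants.ts for the real one.
const delayForManualOrgDeletionInDays = 3;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: uses the caller's delay when given, otherwise the default.
function computeDeletionDates(queuedAt: Date, deleteDelayInDays?: number) {
  const delayInDays = deleteDelayInDays ?? delayForManualOrgDeletionInDays;
  const delayInMs = delayInDays * MS_PER_DAY;
  return { queuedAt, deletesAt: new Date(queuedAt.getTime() + delayInMs), delayInMs };
}

const queuedAt = new Date('2025-12-16T00:00:00Z');
const manual = computeDeletionDates(queuedAt); // falls back to the default delay
const automated = computeDeletionDates(queuedAt, 30); // caller-supplied delay
```

Using `??` rather than `||` means an explicit delay of 0 days is respected instead of silently falling back to the default.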

controlplane/src/core/routes.ts (1)

24-24: LGTM! Consistent queue wiring.

The new queue follows the established pattern for queue registration in RouterOptions.

Also applies to: 52-52

controlplane/src/bin/db-cleanup.ts (2)

114-116: Clarify the startOfMonth usage for inactivity threshold.

Using startOfMonth(subDays(now, MIN_INACTIVITY_DAYS)) creates a threshold that varies depending on the current day of the month. For example, running on Dec 16 vs Dec 1 yields different threshold dates. Was this intentional for batch alignment, or should it simply be subDays(now, MIN_INACTIVITY_DAYS)?
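
To make the effect of snapping concrete, the sketch below computes how many days of inactivity the snapped threshold actually requires on a given run date. MIN_INACTIVITY_DAYS = 90 is an assumption (the PR description only says "more than 3 months"), and subDaysUtc and startOfMonthUtc are UTC stand-ins written for this sketch rather than the helpers the script imports.

```typescript
// Assumed value for illustration.
const MIN_INACTIVITY_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// UTC stand-ins for the subDays and startOfMonth helpers named above.
function subDaysUtc(date: Date, days: number): Date {
  return new Date(date.getTime() - days * MS_PER_DAY);
}

function startOfMonthUtc(date: Date): Date {
  return new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), 1));
}

// Threshold as written in the script: subtract the window, then snap to the month start.
function snappedThreshold(now: Date): Date {
  return startOfMonthUtc(subDaysUtc(now, MIN_INACTIVITY_DAYS));
}

// Days of inactivity the snapped threshold effectively requires on a given run date.
function effectiveWindowDays(now: Date): number {
  return Math.round((now.getTime() - snappedThreshold(now).getTime()) / MS_PER_DAY);
}

const onDec16 = effectiveWindowDays(new Date('2025-12-16T00:00:00Z')); // 106
const onDec1 = effectiveWindowDays(new Date('2025-12-01T00:00:00Z')); // 91
```

Under this assumed 90-day window, both run dates snap to September 1, so what varies is the effective window: 106 days on Dec 16 versus 91 days on Dec 1. With plain subDays the window would be 90 days on every run.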


49-54: The review comment is incorrect. The code properly handles Redis TLS configuration:

  1. redis.host is never undefined due to the default value process.env.REDIS_HOST || 'localhost' in get-config.ts, making the ! assertion on line 50 redundant but harmless.

  2. The tls property is correctly handled as optional. In get-config.ts, it's conditionally set to an object only when TLS environment variables are present, otherwise undefined. The RedisPluginOptions interface in redis.ts properly defines tls as optional with all its properties optional.

  3. createRedisConnections safely checks each TLS property before use (lines 45, 49, 54 in redis.ts), validating file paths before reading.

No validation is needed because TLS configuration is properly optional and defensively implemented.

Likely an incorrect or invalid review comment.

controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1)

18-60: LGTM! Queue implementation follows established patterns.

The queue configuration with exponential backoff and job retention is consistent with other queues in the codebase.

controlplane/src/core/build-server.ts (1)

401-412: LGTM! Queue and worker wiring follows established patterns.

The new notification queue and worker are registered consistently with other queues in the build server, using the same connection patterns and passing the mailer client appropriately.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (2)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (2)

76-78: Throwing error when mailer not configured causes unnecessary retries.

When the mailer is not configured, throwing an error triggers the retry mechanism (6 attempts with exponential backoff). This will cause the job to fail repeatedly for approximately 1.3 hours before giving up, wasting resources.

Consider logging a warning and returning early instead:

     if (!this.input.mailer) {
-      throw new Error('Mailer service not configured');
+      this.input.logger.warn('Mailer service not configured, skipping notification');
+      return;
     }
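
The cost of retrying can be estimated with a short calculation. This sketch assumes an exponential strategy that doubles the delay on every retry; the base delay of 150 seconds is a guess chosen only because it lands near the figure quoted above, and the queue's actual backoff settings are not shown in this thread.

```typescript
// Delay before retry number `retry` (1-based), doubling on every retry.
function exponentialBackoffMs(retry: number, baseDelayMs: number): number {
  return Math.round(Math.pow(2, retry - 1) * baseDelayMs);
}

// Total time spent waiting between tries for a job that fails every time.
function totalRetryWaitMs(attempts: number, baseDelayMs: number): number {
  let total = 0;
  // A job with `attempts` tries has `attempts - 1` retries, each preceded by a delay.
  for (let retry = 1; retry < attempts; retry++) {
    total += exponentialBackoffMs(retry, baseDelayMs);
  }
  return total;
}

// 6 attempts with an assumed 150 s base delay: 150 s * (1 + 2 + 4 + 8 + 16).
const totalMs = totalRetryWaitMs(6, 150000);
const totalHours = totalMs / (60 * 60 * 1000);
```

With these assumed numbers the job waits 4,650 seconds in total, about 1.29 hours, before its final failure. Returning early avoids that wait entirely.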

87-101: Handle case where organization has no admins.

If orgAdmins is empty (line 87), the email will be sent with an empty receiverEmails array (line 95). Consider adding a guard to skip the email or log a warning:

     const orgAdmins = organizationMembers.filter((m) => m.rbac.isOrganizationAdmin);
+
+    if (orgAdmins.length === 0) {
+      this.input.logger.warn({ organizationId: org.id }, 'No admins found for organization, skipping notification');
+      return;
+    }

     const intl = Intl.DateTimeFormat(undefined, {
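
Both guards from this review round can be expressed as one pure decision step, which keeps them easy to test. The types and the decideNotification name below are hypothetical and only loosely mirror the worker's data.

```typescript
// Hypothetical, simplified member shape.
interface Member {
  email: string;
  isOrganizationAdmin: boolean;
}

type Decision =
  | { action: 'send'; receiverEmails: string[] }
  | { action: 'skip'; reason: 'mailer-not-configured' | 'no-admins' };

// Decides whether a notification should go out, without throwing, so that
// a missing mailer or an empty admin list does not trigger job retries.
function decideNotification(members: Member[], mailerConfigured: boolean): Decision {
  if (!mailerConfigured) {
    return { action: 'skip', reason: 'mailer-not-configured' };
  }

  const admins = members.filter((m) => m.isOrganizationAdmin);
  if (admins.length === 0) {
    return { action: 'skip', reason: 'no-admins' };
  }

  return { action: 'send', receiverEmails: admins.map((a) => a.email) };
}

const decision = decideNotification(
  [
    { email: 'owner@example.com', isOrganizationAdmin: true },
    { email: 'dev@example.com', isOrganizationAdmin: false },
  ],
  true,
);
```

The worker would log a warning for either skip reason and return normally, so the job completes instead of being retried.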
🧹 Nitpick comments (3)
controlplane/src/core/workers/ReactivateOrganizationWorker.ts (1)

115-116: LGTM! Typo fix aligns logging with other workers.

The change from joinId to jobId corrects a typo and ensures consistent logging across workers.

Optional: Rename the parameter for clarity

The job parameter in the stalled event callback is actually the job ID string, not a Job object. Consider renaming it to jobId for clarity:

-  worker.on('stalled', (job) => {
-    log.warn({ jobId: job }, `Job stalled`);
+  worker.on('stalled', (jobId) => {
+    log.warn({ jobId }, `Job stalled`);
   });

Based on learnings, this change aligns with similar updates across other workers in the PR.

controlplane/src/core/workers/DeactivateOrganizationWorker.ts (1)

127-129: LGTM! Key name fix improves consistency.

The change from joinId to jobId correctly aligns the log key with the actual value being logged and standardizes the approach across workers.

Optional: Rename parameter for clarity

The callback parameter job receives the jobId string (per BullMQ's stalled event signature), not a Job object. Consider renaming it to jobId for clarity:

-  worker.on('stalled', (job) => {
-    log.warn({ jobId: job }, `Job stalled`);
+  worker.on('stalled', (jobId) => {
+    log.warn({ jobId }, `Job stalled`);
   });
controlplane/src/core/workers/CacheWarmerWorker.ts (1)

133-135: Good typo fix; consider logging just the job ID for consistency.

The key name correction from joinId to jobId is excellent and aligns with the broader pattern across workers.

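For consistency with the error handler on line 112 (which logs jobId: job.id), consider logging just the job ID rather than the entire job object. This keeps logs concise and matches the established pattern in this file. Note that if the stalled callback here receives the job ID string, as described for the other workers in this PR, job.id would be undefined and { jobId: job } is already correct.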

🔎 Optional refinement for consistency:
-  log.warn({ jobId: job }, `Job stalled`);
+  log.warn({ jobId: job.id }, `Job stalled`);
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4804e55 and a0c978a.

📒 Files selected for processing (11)
  • controlplane/src/bin/db-cleanup.ts (1 hunks)
  • controlplane/src/core/bufservices/organization/deleteOrganization.ts (1 hunks)
  • controlplane/src/core/repositories/OrganizationRepository.ts (1 hunks)
  • controlplane/src/core/workers/CacheWarmerWorker.ts (1 hunks)
  • controlplane/src/core/workers/DeactivateOrganizationWorker.ts (1 hunks)
  • controlplane/src/core/workers/DeleteOrganizationAuditLogsWorker.ts (1 hunks)
  • controlplane/src/core/workers/DeleteOrganizationWorker.ts (1 hunks)
  • controlplane/src/core/workers/DeleteUserQueue.ts (1 hunks)
  • controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1 hunks)
  • controlplane/src/core/workers/ReactivateOrganizationWorker.ts (1 hunks)
  • controlplane/test/test-util.ts (3 hunks)
✅ Files skipped from review due to trivial changes (1)
  • controlplane/src/core/workers/DeleteOrganizationAuditLogsWorker.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • controlplane/src/bin/db-cleanup.ts
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-29T10:28:04.846Z
Learnt from: JivusAyrus
Repo: wundergraph/cosmo PR: 2156
File: controlplane/src/core/repositories/SubgraphRepository.ts:1749-1751
Timestamp: 2025-08-29T10:28:04.846Z
Learning: In the controlplane codebase, authentication and authorization checks (including organization scoping) are handled at the service layer in files like unlinkSubgraph.ts before calling repository methods. Repository methods like unlinkSubgraph() in SubgraphRepository.ts can focus purely on data operations without redundant security checks.

Applied to files:

  • controlplane/src/core/bufservices/organization/deleteOrganization.ts
📚 Learning: 2025-07-01T13:53:54.146Z
Learnt from: wilsonrivera
Repo: wundergraph/cosmo PR: 1919
File: controlplane/src/core/repositories/OrganizationGroupRepository.ts:193-224
Timestamp: 2025-07-01T13:53:54.146Z
Learning: In the Cosmo codebase, database transactions are typically managed at the service layer (e.g., in buf services like deleteOrganizationGroup.ts), where repositories are instantiated with the transaction handle and all operations within those repositories are automatically part of the same transaction.

Applied to files:

  • controlplane/src/core/bufservices/organization/deleteOrganization.ts
🧬 Code graph analysis (2)
controlplane/src/core/repositories/OrganizationRepository.ts (2)
controlplane/src/core/workers/DeleteOrganizationWorker.ts (1)
  • DeleteOrganizationQueue (20-62)
controlplane/src/core/constants.ts (1)
  • delayForManualOrgDeletionInDays (10-10)
controlplane/test/test-util.ts (1)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1)
  • NotifyOrganizationDeletionQueuedQueue (18-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
controlplane/src/core/workers/DeleteUserQueue.ts (1)

143-143: LGTM: Logging key corrected.

The change from joinId to jobId correctly aligns the log metadata with the actual job identifier, consistent with similar corrections across other workers in this PR.

controlplane/src/core/workers/DeleteOrganizationWorker.ts (1)

154-154: LGTM: Logging key corrected.

The change from joinId to jobId correctly aligns the log metadata with the actual job identifier, matching the pattern applied across other workers.

controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1)

128-128: LGTM: Logging key is correct.

The stalled event handler correctly uses jobId for the log metadata.

controlplane/src/core/bufservices/organization/deleteOrganization.ts (1)

28-140: Transaction scope correctly established.

The transaction wrapper correctly ensures all repository operations execute atomically. The repository instantiation with the transaction handle (lines 29-31) follows the established pattern for this codebase.

However, note that queueOrganizationDeletion (lines 100-104) performs a queue operation (addJob) that cannot be rolled back if a subsequent operation within this transaction fails. This is acceptable for notification-type queues but worth documenting.

Based on learnings, the transactional approach at the service layer is appropriate for coordinating multiple repository operations.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
controlplane/src/bin/db-cleanup.ts (1)

149-161: Critical: Nested transaction and Redis atomicity issues remain unaddressed.

This is the same issue previously flagged: the outer transaction at line 151 wraps processChunkOfOrganizations, but within that function, orgRepo.queueOrganizationDeletion (line 199) starts its own transaction, creating nested transactions. Additionally, the Redis notification job (lines 207-211) is enqueued outside the DB transaction boundary, risking inconsistency if the transaction fails after the job is enqueued.

Required fixes:

  1. Refactor orgRepo.queueOrganizationDeletion to accept and use the transaction context (tx) passed from the outer transaction, avoiding nested transactions
  2. Move Redis job enqueueing inside the repository method or implement an outbox pattern to ensure the notification is only sent if the DB transaction commits successfully
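
One way to approximate the second fix without a full outbox is to enqueue only after the transaction has committed. The sketch below uses hypothetical interfaces (Tx, RunInTransaction, queueDeletions) in place of the real database and queue clients.

```typescript
// Hypothetical transaction handle; stands in for the real DB transaction.
interface Tx {
  markQueuedForDeletion(orgId: string): Promise<void>;
}

type RunInTransaction = <T>(work: (tx: Tx) => Promise<T>) => Promise<T>;
type EnqueueNotification = (orgId: string) => Promise<void>;

async function queueDeletions(
  orgIds: string[],
  runInTransaction: RunInTransaction,
  enqueueNotification: EnqueueNotification,
): Promise<number> {
  // Phase 1: database writes only. Nothing is sent to Redis yet, so a
  // rollback leaves no orphaned jobs behind.
  const committed = await runInTransaction(async (tx) => {
    const marked: string[] = [];
    for (const id of orgIds) {
      await tx.markQueuedForDeletion(id);
      marked.push(id);
    }
    return marked;
  });

  // Phase 2: reached only if the transaction committed successfully.
  for (const id of committed) {
    await enqueueNotification(id);
  }
  return committed.length;
}
```

This ordering prevents notifications for rolled-back changes, but a crash between the two phases can still drop a notification. An outbox table written inside the transaction would close that remaining gap.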

Run the following script to verify the transaction handling in OrganizationRepository.queueOrganizationDeletion:

#!/bin/bash
# Check if queueOrganizationDeletion starts its own transaction
ast-grep --pattern $'queueOrganizationDeletion($$$) {
  $$$
  transaction($$$)
  $$$
}'
🧹 Nitpick comments (1)
controlplane/src/bin/db-cleanup.ts (1)

56-72: Consider removing unused redisWorker connection.

The script connects and pings both redisQueue and redisWorker, but only redisQueue is used to initialize the queue instances. The redisWorker connection appears unnecessary for this script.

🔎 Proposed simplification
-  const { redisQueue, redisWorker } = await createRedisConnections({
+  const { redisQueue } = await createRedisConnections({
     host: redis.host!,
     port: Number(redis.port),
     password: redis.password,
     tls: redis.tls,
   });
 
   await redisQueue.connect();
-  await redisWorker.connect();
-  await redisWorker.ping();
   await redisQueue.ping();
 
   // ... rest of code ...
 
   } finally {
     redisQueue.disconnect();
-    redisWorker.disconnect();
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0c978a and e82d01a.

📒 Files selected for processing (3)
  • controlplane/src/bin/db-cleanup.ts (1 hunks)
  • controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1 hunks)
  • controlplane/test/test-util.ts (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • controlplane/test/test-util.ts
  • controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-29T10:28:04.846Z
Learnt from: JivusAyrus
Repo: wundergraph/cosmo PR: 2156
File: controlplane/src/core/repositories/SubgraphRepository.ts:1749-1751
Timestamp: 2025-08-29T10:28:04.846Z
Learning: In the controlplane codebase, authentication and authorization checks (including organization scoping) are handled at the service layer in files like unlinkSubgraph.ts before calling repository methods. Repository methods like unlinkSubgraph() in SubgraphRepository.ts can focus purely on data operations without redundant security checks.

Applied to files:

  • controlplane/src/bin/db-cleanup.ts
📚 Learning: 2025-07-01T13:53:54.146Z
Learnt from: wilsonrivera
Repo: wundergraph/cosmo PR: 1919
File: controlplane/src/core/repositories/OrganizationGroupRepository.ts:193-224
Timestamp: 2025-07-01T13:53:54.146Z
Learning: In the Cosmo codebase, database transactions are typically managed at the service layer (e.g., in buf services like deleteOrganizationGroup.ts), where repositories are instantiated with the transaction handle and all operations within those repositories are automatically part of the same transaction.

Applied to files:

  • controlplane/src/bin/db-cleanup.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
controlplane/src/bin/db-cleanup.ts (2)

128-135: LGTM! Proper filtering for inactive organizations.

The WHERE clause correctly filters for organizations that:

  • Are not already queued for deletion
  • Are not deactivated (avoiding duplicates)
  • Were created before the inactivity threshold
  • Have no billing plan or are on the developer plan

This addresses the previous review concerns about excluding deactivated organizations and checking billing plans.
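
The four conditions listed above can be restated as an in-memory predicate. The OrgRow field names and the 'developer' plan identifier are assumptions made for this sketch; the script itself expresses the filter as a SQL WHERE clause.

```typescript
// Hypothetical row shape; field names are illustrative, not the real schema.
interface OrgRow {
  id: string;
  createdAt: Date;
  queuedForDeletionAt: Date | null;
  isDeactivated: boolean;
  plan: string | null;
}

// Mirrors the four conditions of the WHERE clause described above.
function isDeletionCandidate(org: OrgRow, inactivityThreshold: Date): boolean {
  const notQueued = org.queuedForDeletionAt === null;
  const notDeactivated = !org.isDeactivated;
  const oldEnough = org.createdAt.getTime() < inactivityThreshold.getTime();
  // 'developer' is an assumed identifier for the developer plan.
  const freeTier = org.plan === null || org.plan === 'developer';
  return notQueued && notDeactivated && oldEnough && freeTier;
}

const threshold = new Date('2025-09-01T00:00:00Z');
const candidate: OrgRow = {
  id: 'org-1',
  createdAt: new Date('2025-01-10T00:00:00Z'),
  queuedForDeletionAt: null,
  isDeactivated: false,
  plan: null,
};
const eligible = isDeletionCandidate(candidate, threshold);
```

Passing this filter only makes an organization a candidate; the audit-log activity check still decides whether it is actually enqueued.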


184-194: LGTM! Proper inactivity verification.

The audit log check correctly verifies whether an organization has had any activity within the inactivity window. Organizations with recent activity are appropriately skipped, ensuring only truly inactive organizations are enqueued for deletion.

@codecov

codecov bot commented Dec 18, 2025

Codecov Report

❌ Patch coverage is 15.89041% with 307 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.95%. Comparing base (0b0d42d) to head (7439a22).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| controlplane/src/bin/delete-inactive-orgs.ts | 0.00% | 139 Missing and 1 partial ⚠️ |
| ...ore/bufservices/organization/deleteOrganization.ts | 0.00% | 89 Missing ⚠️ |
| .../workers/NotifyOrganizationDeletionQueuedWorker.ts | 30.09% | 72 Missing ⚠️ |
| controlplane/src/core/workers/CacheWarmerWorker.ts | 0.00% | 1 Missing ⚠️ |
| ...e/src/core/workers/DeactivateOrganizationWorker.ts | 0.00% | 1 Missing ⚠️ |
| .../core/workers/DeleteOrganizationAuditLogsWorker.ts | 0.00% | 1 Missing ⚠️ |
| ...plane/src/core/workers/DeleteOrganizationWorker.ts | 0.00% | 1 Missing ⚠️ |
| controlplane/src/core/workers/DeleteUserQueue.ts | 0.00% | 1 Missing ⚠️ |
| ...e/src/core/workers/ReactivateOrganizationWorker.ts | 0.00% | 1 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (15.89%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2418       +/-   ##
===========================================
- Coverage   64.45%   46.95%   -17.50%     
===========================================
  Files         306     1058      +752     
  Lines       43621   143916   +100295     
  Branches     4690     9649     +4959     
===========================================
+ Hits        28114    67572    +39458     
- Misses      15485    74595    +59110     
- Partials       22     1749     +1727     
Files with missing lines Coverage Δ
controlplane/src/core/build-server.ts 74.13% <100.00%> (+0.56%) ⬆️
...ne/src/core/repositories/OrganizationRepository.ts 77.34% <100.00%> (-0.04%) ⬇️
controlplane/src/core/workers/CacheWarmerWorker.ts 0.00% <0.00%> (ø)
...e/src/core/workers/DeactivateOrganizationWorker.ts 54.73% <0.00%> (ø)
.../core/workers/DeleteOrganizationAuditLogsWorker.ts 87.50% <0.00%> (ø)
...plane/src/core/workers/DeleteOrganizationWorker.ts 90.26% <0.00%> (ø)
controlplane/src/core/workers/DeleteUserQueue.ts 47.52% <0.00%> (ø)
...e/src/core/workers/ReactivateOrganizationWorker.ts 80.89% <0.00%> (ø)
.../workers/NotifyOrganizationDeletionQueuedWorker.ts 30.09% <30.09%> (ø)
...ore/bufservices/organization/deleteOrganization.ts 1.86% <0.00%> (-0.04%) ⬇️
... and 1 more

... and 750 files with indirect coverage changes


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
controlplane/src/bin/db-cleanup.ts (2)

122-122: Remove unused userId field from query.

The userId field is selected but never used in processChunkOfOrganizations. Consider removing it to slightly reduce query overhead.

🔎 Proposed fix
     .select({
       id: schema.organizations.id,
       slug: schema.organizations.slug,
-      userId: schema.organizations.createdBy,
       plan: schema.organizationBilling.plan,
     })

And update the type signature on line 173:

-  organizations: { id: string; slug: string; userId: string | null }[];
+  organizations: { id: string; slug: string }[];

182-182: Reuse the parent logger for consistency.

Creating a new pino() logger here instead of reusing the logger from line 62 leads to inconsistent logging across the script. The queue constructors (lines 67-68) receive the parent logger, but the repository receives a separate instance.

🔎 Proposed fix

Pass the logger as a parameter to processChunkOfOrganizations:

 async function processChunkOfOrganizations({
   organizations,
   db,
   inactivityThreshold,
   deleteOrganizationQueue,
   notifyOrganizationDeletionQueuedQueue,
+  logger,
 }: {
   organizations: { id: string; slug: string; userId: string | null }[];
   db: PostgresJsDatabase<typeof schema>;
   inactivityThreshold: Date;
   deleteOrganizationQueue: DeleteOrganizationQueue;
   notifyOrganizationDeletionQueuedQueue: NotifyOrganizationDeletionQueuedQueue;
+  logger: pino.Logger;
 }) {
   const queuedAt = new Date();
   const deletesAt = addDays(queuedAt, DELAY_FOR_ORG_DELETION_IN_DAYS);

-  const orgRepo = new OrganizationRepository(pino(), db, undefined);
+  const orgRepo = new OrganizationRepository(logger, db, undefined);

And update the call site around line 152:

         return processChunkOfOrganizations({
           organizations: chunk,
           db: tx,
           inactivityThreshold,
           deleteOrganizationQueue,
           notifyOrganizationDeletionQueuedQueue,
+          logger,
         });
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e82d01a and 67c0a22.

📒 Files selected for processing (1)
  • controlplane/src/bin/db-cleanup.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-29T10:28:04.846Z
Learnt from: JivusAyrus
Repo: wundergraph/cosmo PR: 2156
File: controlplane/src/core/repositories/SubgraphRepository.ts:1749-1751
Timestamp: 2025-08-29T10:28:04.846Z
Learning: In the controlplane codebase, authentication and authorization checks (including organization scoping) are handled at the service layer in files like unlinkSubgraph.ts before calling repository methods. Repository methods like unlinkSubgraph() in SubgraphRepository.ts can focus purely on data operations without redundant security checks.

Applied to files:

  • controlplane/src/bin/db-cleanup.ts
📚 Learning: 2025-07-01T13:53:54.146Z
Learnt from: wilsonrivera
Repo: wundergraph/cosmo PR: 1919
File: controlplane/src/core/repositories/OrganizationGroupRepository.ts:193-224
Timestamp: 2025-07-01T13:53:54.146Z
Learning: In the Cosmo codebase, database transactions are typically managed at the service layer (e.g., in buf services like deleteOrganizationGroup.ts), where repositories are instantiated with the transaction handle and all operations within those repositories are automatically part of the same transaction.

Applied to files:

  • controlplane/src/bin/db-cleanup.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: build_test
  • GitHub Check: build_push_image
🔇 Additional comments (2)
controlplane/src/bin/db-cleanup.ts (2)

17-28: Well-defined configuration constants.

The constants are clearly documented and provide sensible defaults:

  • 3-month inactivity threshold gives adequate time before considering deletion
  • 7-day deletion delay provides a reasonable recovery window
  • Parallelism and batch size are appropriate for a background cleanup script

128-138: LGTM! Query correctly filters target organizations.

The selection criteria are well-implemented:

  • Excludes already-queued and deactivated organizations (avoiding duplicates)
  • Properly scopes to free-tier plans (developer or null)
  • Correctly identifies single-member organizations via HAVING clause

This addresses the plan-check concern and prevents re-queueing deactivated organizations.
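The selection criteria listed above can be restated as a plain predicate. This is a minimal sketch over in-memory rows, not the actual Drizzle query: the `OrgRow` shape and the `isDeletionCandidate` name are hypothetical, and only the rules themselves (not already queued, not deactivated, plan is `developer` or null, exactly one member) come from the review comment.

```typescript
// Hypothetical row shape, for illustration only; the real query works on joined tables.
interface OrgRow {
  slug: string;
  plan: string | null; // null when the organization has no billing row
  memberCount: number;
  queuedForDeletion: boolean;
  deactivated: boolean;
}

// True when the row matches every criterion named in the review comment.
function isDeletionCandidate(org: OrgRow): boolean {
  const isFreeTier = org.plan === null || org.plan === 'developer';
  return !org.queuedForDeletion && !org.deactivated && isFreeTier && org.memberCount === 1;
}

const sampleRows: OrgRow[] = [
  { slug: 'solo-free', plan: null, memberCount: 1, queuedForDeletion: false, deactivated: false },
  { slug: 'solo-paid', plan: 'enterprise', memberCount: 1, queuedForDeletion: false, deactivated: false },
  { slug: 'team-free', plan: 'developer', memberCount: 3, queuedForDeletion: false, deactivated: false },
  { slug: 'already-queued', plan: 'developer', memberCount: 1, queuedForDeletion: true, deactivated: false },
];

const candidateSlugs = sampleRows.filter(isDeletionCandidate).map((row) => row.slug);
```

With the sample rows, only `solo-free` passes every check; each of the other rows fails exactly one criterion.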

@wilsonrivera wilsonrivera requested review from a team as code owners January 7, 2026 02:25
@wilsonrivera wilsonrivera requested a review from Noroth January 7, 2026 02:25
@github-actions

github-actions bot commented Jan 7, 2026

Router image scan failed

❌ Security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-10e0ea72bf32f183b6ed4fe1f10d6ce2a5d55f3d

Please check the security vulnerabilities found in the PR.

If you believe this is a false positive, please add the vulnerability to the .trivyignore file and re-run the scan.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts:
- Around line 129-133: In QueueInactiveOrganizationsDeletionWorker, the audit
activity check is currently bypassed because the `continue` is commented out;
restore the original behavior by uncommenting the `continue` inside the
conditional that checks `if (auditLogs.length > 0 && auditLogs[0].count > 0)` so
that organizations with recent audit activity are skipped from deletion
processing, and optionally add a short debug log (using the existing logger)
inside that branch to indicate the org was skipped due to recent audit logs.
- Around line 180-202: The query currently selects
schema.organizations.createdBy as userId but that field may not correspond to
the lone remaining member; update retrieveOrganizationsWithSingleUser to join
the organizationsMembers table (alias it, e.g. om) and select om.userId as the
remaining member ID instead of schema.organizations.createdBy, ensuring the join
used in the FROM/innerJoin includes om and that om.userId is included in the
GROUP BY (or aggregated) so the HAVING COUNT(...) = 1 still works and returns
the actual remaining member.
🧹 Nitpick comments (4)
controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts (2)

22-22: Convert empty interface to a type alias.

Per static analysis, an empty interface is equivalent to {}. Use a type alias for clarity and to follow best practices.

🔎 Proposed fix
-export interface QueueInactiveOrganizationsDeletionInput {}
+export type QueueInactiveOrganizationsDeletionInput = Record<string, never>;

215-222: Consider reducing concurrency for scheduled job.

This worker processes a single scheduled job hourly. A concurrency of 10 is excessive and won't provide any benefit. Consider setting concurrency: 1 to match the expected workload.

🔎 Proposed fix
   const worker = new Worker<QueueInactiveOrganizationsDeletionInput>(
     QueueName,
     (job) => new QueueInactiveOrganizationsDeletionWorker(input).handler(job),
     {
       connection: input.redisConnection,
-      concurrency: 10,
+      concurrency: 1,
     },
   );
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (2)

105-105: Avoid direct process.env access; inject via config.

Accessing process.env.WEB_BASE_URL directly breaks the configuration injection pattern used elsewhere. Consider passing webBaseUrl through the worker's input options for consistency and testability.
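One way to follow this suggestion is to carry the base URL in the worker's options and build links from there. The sketch below is only illustrative: `NotifyWorkerOptions`, `buildOrganizationUrl`, and the URL shape are hypothetical names and assumptions, not the worker's actual API.

```typescript
// Hypothetical options type; the real worker input has other fields as well.
interface NotifyWorkerOptions {
  webBaseUrl: string;
}

// Builds a link from injected configuration instead of reading process.env.
// The "<base>/<slug>" path layout is an assumption made for this example.
function buildOrganizationUrl(opts: NotifyWorkerOptions, slug: string): string {
  // Strip trailing slashes so that joining never produces a double slash.
  const base = opts.webBaseUrl.replace(/\/+$/, '');
  return base + '/' + encodeURIComponent(slug);
}

const exampleUrl = buildOrganizationUrl({ webBaseUrl: 'https://cosmo.example.test/' }, 'acme');
```

Because the value arrives as an argument, a test can pass any base URL without touching environment variables.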


94-97: Specify a consistent locale for date formatting.

Using Intl.DateTimeFormat(undefined, ...) relies on the system's default locale, which can produce inconsistent date formats across environments. Consider using a fixed locale (e.g., 'en-US') or allowing the locale to be configured.

🔎 Proposed fix
-      const intl = Intl.DateTimeFormat(undefined, {
+      const intl = Intl.DateTimeFormat('en-US', {
         dateStyle: 'medium',
         timeStyle: 'short',
       });
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67c0a22 and d7cd912.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (4)
  • controlplane/package.json
  • controlplane/src/core/build-server.ts
  • controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts
  • controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts
🧰 Additional context used
🧬 Code graph analysis (3)
controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts (5)
controlplane/src/core/workers/Worker.ts (1)
  • IQueue (3-7)
controlplane/src/core/repositories/OrganizationRepository.ts (1)
  • OrganizationRepository (50-1679)
controlplane/src/core/workers/DeleteOrganizationWorker.ts (1)
  • DeleteOrganizationQueue (20-62)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (1)
  • NotifyOrganizationDeletionQueuedQueue (18-60)
controlplane/src/db/schema.ts (1)
  • auditLogs (1936-1972)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (4)
controlplane/src/core/workers/Worker.ts (2)
  • IQueue (3-7)
  • IWorker (9-11)
controlplane/src/core/routes.ts (1)
  • opts (62-67)
controlplane/src/core/services/Mailer.ts (1)
  • Mailer (13-101)
controlplane/src/core/repositories/OrganizationRepository.ts (1)
  • OrganizationRepository (50-1679)
controlplane/src/core/build-server.ts (2)
controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (2)
  • NotifyOrganizationDeletionQueuedQueue (18-60)
  • createNotifyOrganizationDeletionQueuedWorker (117-139)
controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts (2)
  • QueueInactiveOrganizationsDeletionQueue (24-80)
  • createQueueInactiveOrganizationsDeletionWorker (205-231)
🪛 Biome (2.1.2)
controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts

[error] 22-22: An empty interface is equivalent to {}.

Safe fix: Use a type alias instead.

(lint/suspicious/noEmptyInterface)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: build_test
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: integration_test (./events)
  • GitHub Check: build_push_image
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: build_test
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: build_push_image
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
🔇 Additional comments (8)
controlplane/src/core/workers/QueueInactiveOrganizationsDeletionWorker.ts (2)

24-80: LGTM!

The queue implementation correctly uses upsertJobScheduler for scheduled execution. The hourly cron pattern '0 0 * * * *' (BullMQ format with seconds) is appropriate for periodic cleanup checks. The disabled addJob/getJob methods make sense since this queue only supports scheduled jobs.
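For readers unfamiliar with the six-field form, the pattern can be broken down as follows. This is a small illustrative helper, not part of BullMQ; `splitSixFieldCron` and `CronFields` are hypothetical names, and the field order (seconds first) is taken from the comment above.

```typescript
// Named view of a six-field cron pattern (seconds first).
interface CronFields {
  second: string;
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
}

// Splits a six-field pattern into its named parts; throws on any other field count.
function splitSixFieldCron(pattern: string): CronFields {
  const parts = pattern.trim().split(/\s+/);
  if (parts.length !== 6) {
    throw new Error('expected 6 fields, got ' + parts.length);
  }
  return {
    second: parts[0],
    minute: parts[1],
    hour: parts[2],
    dayOfMonth: parts[3],
    month: parts[4],
    dayOfWeek: parts[5],
  };
}

const hourlyFields = splitSixFieldCron('0 0 * * * *');
```

Here `second` and `minute` are both `0` while `hour` is a wildcard, which is why the job fires once at the top of every hour.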


100-103: Verify the startOfMonth normalization is intentional.

Using startOfMonth(subDays(now, MIN_INACTIVITY_DAYS)) can extend the inactivity window beyond 90 days. For example, on Jan 15, this would result in Oct 1 (~106 days ago) instead of Oct 17. If the intent is exactly 90 days, use subDays(now, MIN_INACTIVITY_DAYS) directly.
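The Jan 15 example can be checked with plain `Date` arithmetic. This sketch uses hand-written UTC helpers rather than date-fns so that it stays self-contained; `subDaysUtc`, `startOfMonthUtc`, and `daysBetween` are hypothetical names, and `MIN_INACTIVITY_DAYS = 90` is assumed from the "90 days" mentioned above.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const MIN_INACTIVITY_DAYS = 90; // assumed value, taken from the comment above

// Subtracts whole days from a timestamp (UTC, so no DST effects).
function subDaysUtc(date: Date, days: number): Date {
  return new Date(date.getTime() - days * MS_PER_DAY);
}

// Returns midnight UTC on the first day of the date's month.
function startOfMonthUtc(date: Date): Date {
  return new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), 1));
}

// Whole days between two timestamps.
function daysBetween(later: Date, earlier: Date): number {
  return Math.round((later.getTime() - earlier.getTime()) / MS_PER_DAY);
}

const runDate = new Date(Date.UTC(2026, 0, 15)); // Jan 15
const exactThreshold = subDaysUtc(runDate, MIN_INACTIVITY_DAYS); // Oct 17
const normalizedThreshold = startOfMonthUtc(exactThreshold); // Oct 1

const exactWindowDays = daysBetween(runDate, exactThreshold); // 90
const normalizedWindowDays = daysBetween(runDate, normalizedThreshold); // 106
```

The normalized threshold is 16 days earlier than the exact one, so an organization last active on Oct 10 would count as inactive under the exact rule but not under the normalized one.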

controlplane/src/core/workers/NotifyOrganizationDeletionQueuedWorker.ts (2)

18-60: LGTM!

The queue implementation follows the established pattern from other workers in the codebase, with appropriate retry settings and error handling.


117-139: LGTM!

The worker factory follows the established pattern with appropriate error and stall event handlers. The high concurrency (100) is reasonable for notification jobs that may be batched.

controlplane/src/core/build-server.ts (3)

56-63: LGTM!

The imports for the new queue and worker modules are correctly added.


405-434: LGTM!

The wiring of the new NotifyOrganizationDeletionQueuedQueue and QueueInactiveOrganizationsDeletionQueue follows the established pattern. Dependencies are correctly injected, workers are pushed to bullWorkers for graceful shutdown, and scheduleJob() is called to register the recurring cron job.


437-441: LGTM!

The raw-body plugin configuration with global: false and encoding: 'utf8' is appropriate for webhook-specific usage.

controlplane/package.json (1)

65-78: No breaking changes or security advisories with bullmq 5.66.4 and ioredis 5.8.2.

Both versions are valid and have no known security vulnerabilities. Since these are patch releases within the same major version (v5), no breaking changes are expected. The Redis configuration already correctly uses maxRetriesPerRequest: null as required for BullMQ v5 workers.

…organizations' into wilson/eng-7753-delete-inactive-organizations
constructor(log: pino.Logger, conn: ConnectionOptions) {
  this.logger = log.child({ queue: QueueName });
  this.queue = new Queue<QueueInactiveOrganizationsDeletionInput>(QueueName, {
    connection: conn,
Contributor

A few questions:

  1. After max attempts, do we have a dead letter queue?
  2. How do we ensure rate limiting?
  3. Is the whole deletion process idempotent? In other words, can we replay the workflow at any failed stage?

Contributor Author

  1. Not currently
  2. I'm relying on BullMQ to avoid sending all the emails at once.
  3. The deletion queuing is idempotent. The deletion process itself relies on the organization deletion job, which would throw an error (which just gets logged) and retry when the organization has already been deleted. We can update that so that, instead of throwing, it just returns quietly.
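
The "return quietly" idea in the last point could look roughly like the sketch below. It uses an in-memory record as a stand-in for the database, and every name in it (`organizationsById`, `deleteOrganizationIdempotent`, `DeleteResult`) is hypothetical; the actual deletion job is not shown in this thread.

```typescript
type DeleteResult = 'deleted' | 'already-deleted';

// In-memory stand-in for the organizations table (illustration only).
const organizationsById: Record<string, { slug: string }> = {
  'org-1': { slug: 'acme' },
};

// Deletes the organization if it exists; a replay on a missing organization
// reports success instead of throwing, so the job is not retried needlessly.
function deleteOrganizationIdempotent(orgId: string): DeleteResult {
  if (!Object.prototype.hasOwnProperty.call(organizationsById, orgId)) {
    return 'already-deleted';
  }
  delete organizationsById[orgId];
  return 'deleted';
}

const firstRun = deleteOrganizationIdempotent('org-1');
const replayedRun = deleteOrganizationIdempotent('org-1');
```

Returning a distinct result for the replay keeps the outcome observable in logs while letting the job complete normally.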

@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Feb 11, 2026
@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Feb 27, 2026
@github-actions github-actions bot removed the monorepo label Mar 11, 2026
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@controlplane/src/core/repositories/OrganizationRepository.ts`:
- Line 976: The calculation of deleteAt uses the logical OR which treats 0 as
falsy and ignores an explicit zero-day delay; change the expression that
computes deleteAt to use the nullish coalescing operator so that
input.deleteDelayInDays === 0 is honored (i.e., use input.deleteDelayInDays ??
delayForManualOrgDeletionInDays when calling addDays(now, ...)); update the line
that calls addDays(now, ...) and ensure references to input.deleteDelayInDays
and delayForManualOrgDeletionInDays are used with ?? instead of || so explicit
zero values are respected.
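
The difference between the two operators can be seen in isolation. This is a minimal sketch: the default of 3 days and the two `resolveDelay…` helpers are assumptions made for the example, and only the identifiers `deleteDelayInDays` and `delayForManualOrgDeletionInDays` come from the comment above.

```typescript
// Assumed default for illustration; the real constant's value is not shown in this thread.
const delayForManualOrgDeletionInDays = 3;

// Logical OR: any falsy input, including an explicit 0, falls back to the default.
function resolveDelayWithOr(deleteDelayInDays?: number): number {
  return deleteDelayInDays || delayForManualOrgDeletionInDays;
}

// Nullish coalescing: only null or undefined fall back to the default.
function resolveDelayWithNullish(deleteDelayInDays?: number): number {
  return deleteDelayInDays ?? delayForManualOrgDeletionInDays;
}

const orResultForZero = resolveDelayWithOr(0); // falls back to the default
const nullishResultForZero = resolveDelayWithNullish(0); // keeps the explicit 0
```

Both helpers behave the same for `undefined` and for positive values; they differ only when the caller passes `0`.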

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7ed0f611-b44e-4fba-80fc-879ec3e46652

📥 Commits

Reviewing files that changed from the base of the PR and between 70b005c and 37f0fb7.

📒 Files selected for processing (3)
  • controlplane/package.json
  • controlplane/src/core/build-server.ts
  • controlplane/src/core/repositories/OrganizationRepository.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • controlplane/package.json

@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Mar 26, 2026
@wilsonrivera wilsonrivera requested a review from pepol as a code owner March 30, 2026 17:35