feat(metrics): add observable gauge for batch queue worker length #2848

ericallam · 2026-01-08T12:40:06Z

No description provided.

changeset-bot · 2026-01-08T12:40:10Z

⚠️ No Changeset found

Latest commit: d548cb6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-01-08T12:40:18Z

Walkthrough

This change modifies the batch queue module to expose a new observable metric for tracking worker queue length. It adjusts imports to include Logger and ObservableGauge types while removing unused RedisOptions, reorders exported type declarations, and introduces a workerQueueLengthGauge field that observes the batch worker queue length through a callback mechanism reading from workerQueueManager.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The pull request description is entirely missing; no information was provided by the author about testing, changelog, or any required template sections.	Add a complete pull request description following the repository template, including testing steps, changelog entry, and checklist confirmation.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: adding an observable gauge metric for monitoring batch queue worker length.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5b2ccb and d548cb6.

📒 Files selected for processing (1)

internal-packages/run-engine/src/batch-queue/index.ts

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

internal-packages/run-engine/src/batch-queue/index.ts

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

internal-packages/run-engine/src/batch-queue/index.ts

**/*.{js,ts,jsx,tsx,json,md,css,scss}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier

Files:

internal-packages/run-engine/src/batch-queue/index.ts

**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: OpenTelemetry metric attributes must have low cardinality - use only enums, booleans, bounded error codes, or bounded shard IDs as attribute values
Do not use high-cardinality attributes in OpenTelemetry metrics: avoid UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)

Files:

internal-packages/run-engine/src/batch-queue/index.ts

🧠 Learnings (11)

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `runs.subscribeToBatch()` to subscribe to changes for all runs in a batch

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:26:58.661Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Use `trigger.dev/redis-worker` for background job and worker system needs in the webapp and run engine

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:26:37.432Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.trigger()` to trigger multiple different tasks at once from backend code

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerAndWait()` to batch trigger multiple different tasks and wait for results

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTaskAndWait()` to batch trigger tasks by passing task instances and wait for results

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.batchTrigger()` to trigger multiple runs of a single task with different payloads

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `batch.triggerByTask()` to batch trigger tasks by passing task instances for static task sets

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTriggerAndWait()` to batch trigger tasks and wait for all results from a parent task

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

📚 Learning: 2025-11-27T16:27:35.304Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Control concurrency using the `queue` property with `concurrencyLimit` option

Applied to files:

internal-packages/run-engine/src/batch-queue/index.ts

🧬 Code graph analysis (1)

internal-packages/run-engine/src/batch-queue/index.ts (2)

internal-packages/tracing/src/index.ts (1)

ObservableGauge (19-19)

packages/redis-worker/src/fair-queue/telemetry.ts (1)

meter (375-429)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)

GitHub Check: Cursor Bugbot
GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
GitHub Check: typecheck / typecheck
GitHub Check: claude-review
GitHub Check: Analyze (javascript-typescript)

🔇 Additional comments (2)

internal-packages/run-engine/src/batch-queue/index.ts (2)

1-21: LGTM! Import changes align with the new observable gauge feature.

The import adjustments correctly add the necessary types (ObservableGauge, Logger) and organize the redis-worker imports. All imports are properly used throughout the implementation.

87-87: LGTM! Observable gauge implementation follows established patterns.

The implementation correctly:

Declares the gauge field at the class level

Creates the observable gauge with appropriate description and unit

Registers an async callback that observes the worker queue length via workerQueueManager.getLength()

Avoids high-cardinality attributes (no IDs, unbounded values, etc.) in compliance with the coding guidelines

Follows the same pattern as the reference implementation in packages/redis-worker/src/fair-queue/telemetry.ts

The callback execution is safe: the arrow function captures this, and the OpenTelemetry SDK invokes the callback asynchronously after the constructor completes, ensuring workerQueueManager is initialized. Error handling is implicitly managed by the OpenTelemetry SDK, consistent with the codebase's established patterns.

Also applies to: 595-603

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

claude · 2026-01-08T12:41:38Z

PR Review: feat(metrics): add observable gauge for batch queue worker length

Summary

This PR adds an observable gauge metric (batch_queue.worker_queue.length) to track the number of items waiting in the batch worker queue. This is a valuable addition for monitoring queue backlog and identifying processing bottlenecks.

Code Quality ✅

Strengths:

Follows existing patterns in the codebase (see similar implementations in run-queue/index.ts and packages/redis-worker/src/worker.ts)
Clean integration with the existing metrics initialization method
Proper metric naming with descriptive description and unit

Minor issues:

Typo in PR title: "guage" should be "gauge" - consider fixing for clarity in commit history
Import reorganization: The diff shows alphabetical reordering of imports and export statements. While this follows good practices, it's tangential to the core change. Consider keeping import changes minimal if they're not strictly necessary (to reduce noise in diffs).

Potential Issues ⚠️

Async callback in observable gauge: The callback passed to addCallback is async, which is generally fine with OpenTelemetry's API, but there are a few considerations:
- If this.workerQueueManager.getLength() throws or takes too long, the metric observation could fail silently or delay other metric collections
- No error handling wraps the Redis call - consider adding try/catch to prevent failures from affecting the metrics collection loop
Recommendation: Add error handling similar to patterns elsewhere:
```
this.workerQueueLengthGauge.addCallback(async (observableResult) => {
  try {
    const length = await this.workerQueueManager.getLength(BATCH_WORKER_QUEUE_ID);
    observableResult.observe(length);
  } catch (error) {
    this.logger.debug("Failed to collect worker queue length metric", { error });
    // Optionally observe 0 or skip
  }
});
```
Lifecycle consideration: The gauge callback references this.workerQueueManager. If the BatchQueue is closed (via close()), subsequent metric collection attempts will try to call getLength on a closed manager. While this is unlikely to cause issues (the metric collector should stop before resources are cleaned up), it's worth verifying this works correctly in practice.

Performance Considerations ✅

Observable gauges are pull-based, so the Redis call only happens when metrics are scraped (typically every 15-60 seconds)
The getLength call is a simple LLEN Redis command, which is O(1) and very fast
No performance concerns with this implementation

Security Concerns ✅

No security issues identified
The change only adds read-only metric collection

Test Coverage 📊

The existing test file (internal-packages/run-engine/src/batch-queue/tests/index.test.ts) doesn't pass a meter option, so metrics aren't initialized in tests
The metric itself is difficult to unit test without mocking the OpenTelemetry meter
Recommendation: Consider adding an integration test that verifies the metric is observable, or at minimum verify manually that the metric appears in your telemetry system

Overall Assessment

This is a clean, well-implemented addition that follows existing patterns in the codebase. The minor suggestions above are improvements rather than blockers. The main actionable item is adding error handling to the callback to make the metrics collection more resilient.

Verdict: Approve with minor suggestions 👍

vibe-kanban-cloud · 2026-01-08T12:42:42Z

Review Complete

Your review story is ready!

View Story

Comment !reviewfast on this PR to re-generate the story.

feat(metrics): add observable guage for batch queue worker length

d548cb6

nicktrn approved these changes Jan 8, 2026

View reviewed changes

myftija approved these changes Jan 8, 2026

View reviewed changes

ericallam changed the title ~~feat(metrics): add observable guage for batch queue worker length~~ feat(metrics): add observable gauge for batch queue worker length Jan 8, 2026

ericallam merged commit 36b0762 into main Jan 8, 2026
35 checks passed

ericallam deleted the ea-branch-113 branch January 8, 2026 12:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat(metrics): add observable gauge for batch queue worker length #2848

feat(metrics): add observable gauge for batch queue worker length #2848

Uh oh!

ericallam commented Jan 8, 2026

Uh oh!

changeset-bot bot commented Jan 8, 2026

Uh oh!

coderabbitai bot commented Jan 8, 2026 •

edited

Loading

Uh oh!

claude bot commented Jan 8, 2026

Uh oh!

vibe-kanban-cloud bot commented Jan 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

feat(metrics): add observable gauge for batch queue worker length #2848

feat(metrics): add observable gauge for batch queue worker length #2848

Uh oh!

Conversation

ericallam commented Jan 8, 2026

Uh oh!

changeset-bot bot commented Jan 8, 2026

⚠️ No Changeset found

Uh oh!

coderabbitai bot commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Uh oh!

claude bot commented Jan 8, 2026

PR Review: feat(metrics): add observable gauge for batch queue worker length

Summary

Code Quality ✅

Potential Issues ⚠️

Performance Considerations ✅

Security Concerns ✅

Test Coverage 📊

Overall Assessment

Uh oh!

vibe-kanban-cloud bot commented Jan 8, 2026

Review Complete

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

coderabbitai bot commented Jan 8, 2026 •

edited

Loading