feat: implement global FIFO queue for Evals runs #7971

roomote · 2025-09-14T14:39:20Z

Summary

This PR implements a global FIFO queue for evaluation runs as requested in #7966. The implementation ensures only one run executes at a time, with additional runs queued automatically.

Changes

Queue Management (Redis-based)

Added Redis queue management functions in packages/evals/src/cli/redis.ts:
- evals:run-queue (LIST) for FIFO queue of run IDs
- evals:active-run (STRING) for currently executing run with TTL for crash safety
- evals:dispatcher:lock (STRING) for serializing dispatch operations
- Functions for enqueue, dequeue, queue position, and active run management
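
The queue operations listed above could look roughly like the following. This is a minimal sketch, not the code in redis.ts: it assumes enqueue appends at the tail (RPUSH) and dequeue pops from the head (LPOP), the function names are hypothetical, and a small in-memory `ListStore` stands in for the Redis client so the example runs on its own.

```typescript
// Hypothetical in-memory stand-in for a Redis LIST; not the PR's redis.ts.
class ListStore {
	private items: string[] = []

	// RPUSH: append at the tail and return the new length.
	rPush(value: string): number {
		this.items.push(value)
		return this.items.length
	}

	// LPOP: remove and return the head, or null when the list is empty.
	lPop(): string | null {
		return this.items.shift() ?? null
	}

	// LPOS: zero-based index of the first match, or null when absent.
	lPos(value: string): number | null {
		const index = this.items.indexOf(value)
		return index === -1 ? null : index
	}

	// LREM with count 0: remove every occurrence and return how many were removed.
	lRem(value: string): number {
		const before = this.items.length
		this.items = this.items.filter((item) => item !== value)
		return before - this.items.length
	}
}

const queue = new ListStore() // plays the role of evals:run-queue

// Assumed names; the PR only says functions for these operations exist.
const enqueueRun = (runId: number): number => queue.rPush(runId.toString())

const dequeueRun = (): number | null => {
	const head = queue.lPop()
	return head === null ? null : parseInt(head, 10)
}

// One-based position, suitable for a "Queued (#N)" label.
const getQueuePosition = (runId: number): number | null => {
	const index = queue.lPos(runId.toString())
	return index === null ? null : index + 1
}

// Cancelling a queued run simply removes its ID from the list.
const cancelQueuedRun = (runId: number): boolean => queue.lRem(runId.toString()) > 0
```

With a real client, each of these would be an async call against the evals:run-queue key.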

Run Creation & Dispatch

Modified createRun() in apps/web-evals/src/actions/runs.ts:
- Runs are now enqueued instead of immediately spawned
- Added dispatchNextRun() function that handles queue processing
- Implemented distributed locking to prevent race conditions
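
To make the dispatch flow concrete, here is a hedged sketch of one way the steps could fit together: take the dispatcher lock, bail out if a run is already active, dequeue the head, mark it active, then spawn. It is not the PR's dispatchNextRun(); the in-memory `strings` map and `runQueue` array are hypothetical stand-ins for the Redis keys, TTLs are omitted, and restoring a failed run to the head of the queue is a choice made for this sketch.

```typescript
// Hypothetical in-memory stand-ins for the Redis keys named above.
const strings = new Map<string, string>() // evals:active-run, evals:dispatcher:lock
const runQueue: string[] = [] // evals:run-queue, index 0 is the head

// Mimics SET key value NX: succeeds only if the key is absent. TTL is omitted here.
const setIfAbsent = (key: string, value: string): boolean => {
	if (strings.has(key)) return false
	strings.set(key, value)
	return true
}

// Returns the run ID that was started, or null if nothing was dispatched.
function dispatchNextRun(spawn: (runId: number) => void): number | null {
	// Serialize dispatchers: only one may inspect and mutate the queue at a time.
	if (!setIfAbsent("evals:dispatcher:lock", "1")) return null

	try {
		// A run is already executing, so leave the queue untouched.
		if (strings.has("evals:active-run")) return null

		const head = runQueue.shift()
		if (head === undefined) return null

		strings.set("evals:active-run", head)
		const runId = parseInt(head, 10)

		try {
			spawn(runId)
		} catch {
			// Spawn failed: clear the active marker and put the run back at the head.
			strings.delete("evals:active-run")
			runQueue.unshift(head)
			return null
		}
		return runId
	} finally {
		// Always release the dispatcher lock, whichever path was taken.
		strings.delete("evals:dispatcher:lock")
	}
}
```

Because everything here runs in one process, the lock is trivially safe; against real Redis the dequeue and set-active steps are separate round trips and need the lock (or a transaction) to stay consistent.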

Auto-advance Mechanism

Updated runEvals() in packages/evals/src/cli/runEvals.ts:
- Clears active run status on completion
- Automatically dispatches next queued run
- Preserves per-run task concurrency via PQueue

UI Updates

- Added Status column to runs list showing:
  - Running: Active run with heartbeat
  - Queued (#N): Position in queue
  - Completed: Finished runs
- Added cancel button for queued runs
- Real-time status updates (5-second polling)
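
The status shown in the new column can be thought of as a pure function of the run, the active run ID, and the queue contents. The sketch below is an assumption about how that mapping might work, not the component's actual code: `deriveStatus`, `statusLabel`, and `RunLike` are hypothetical names, completion is inferred from `taskMetricsId` as the polling code in run.tsx does, and the order of the checks is a guess.

```typescript
type RunStatus =
	| { kind: "running" }
	| { kind: "queued"; position: number }
	| { kind: "completed" }
	| { kind: "unknown" }

// Only the fields this sketch needs; the real run type has more.
interface RunLike {
	id: number
	taskMetricsId: number | null
}

function deriveStatus(run: RunLike, activeRunId: number | null, queuedIds: number[]): RunStatus {
	// A run with task metrics attached is treated as finished.
	if (run.taskMetricsId !== null) return { kind: "completed" }
	if (activeRunId === run.id) return { kind: "running" }

	// Queue positions are one-based for display.
	const index = queuedIds.indexOf(run.id)
	if (index !== -1) return { kind: "queued", position: index + 1 }

	return { kind: "unknown" }
}

function statusLabel(status: RunStatus): string {
	switch (status.kind) {
		case "running":
			return "Running"
		case "queued":
			return `Queued (#${status.position})`
		case "completed":
			return "Completed"
		case "unknown":
			return "Unknown"
	}
}
```

Keeping the derivation separate from rendering also makes it straightforward to route the labels through a translation function later.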

Key Features

✅ Global FIFO queue - Only one run executes at a time
✅ Automatic queue advancement - Next run starts when current completes
✅ Crash safety - TTL on active run and dispatcher lock
✅ Race condition handling - Distributed locking pattern
✅ Minimal UI changes - Status column and cancel button
✅ Preserved concurrency - Per-run task parallelism unchanged

Testing

- Type checking passes (pnpm check-types)
- Linting passes (pnpm lint)
- Manual testing recommended for queue behavior

Notes

- No database migration required (Redis-only implementation)
- Future enhancement: Add runs.status column for analytics
- Test coverage for new queue functions should be added in follow-up

Closes #7966

cc @hannesrudolph

Important

Implement a global FIFO queue for evaluation runs using Redis, ensuring single execution at a time with UI updates for real-time status.

Queue Management (Redis-based):
- Added Redis queue management functions in redis.ts for FIFO queue (evals:run-queue), active run (evals:active-run), and dispatcher lock (evals:dispatcher:lock).
- Functions for enqueue, dequeue, queue position, and active run management.
Run Creation & Dispatch:
- Modified createRun() in runs.ts to enqueue runs instead of immediate execution.
- Added dispatchNextRun() in runs.ts and runEvals.ts for queue processing with distributed locking.
Auto-advance Mechanism:
- Updated runEvals() in runEvals.ts to clear active run status and dispatch next run on completion.
UI Updates:
- Added Status column in run.tsx and runs.tsx to show run status (Running, Queued, Completed).
- Added cancel button for queued runs in run.tsx.
- Real-time status updates with 5-second polling in run.tsx.

This description was generated automatically for b58ce4e. You can customize this summary. It will automatically update as commits are pushed.

- Add Redis-based queue management with run-queue, active-run, and dispatcher lock
- Modify createRun to enqueue runs instead of immediate spawning
- Implement auto-advance mechanism when runs complete
- Add UI status column showing Running/Queued/Completed states
- Add queue position display for queued runs
- Add cancel button for queued runs
- Preserve per-run task concurrency via PQueue

Addresses issue #7966

ellipsis-dev · 2025-09-14T14:41:19Z

apps/web-evals/src/components/home/run.tsx

+		}
+	}, [run.id])
+
+	const getStatusBadge = () => {


User-facing strings (e.g. 'Loading...', 'Running', 'Queued', 'Completed', 'Unknown') are hardcoded. Consider using the i18n translation function to support multiple languages.

This comment was generated because it violated a code review rule: irule_C0ez7Rji6ANcGkkX.

roomote

I reviewed my own code and found bugs I created 5 minutes ago. Classic.

roomote · 2025-09-14T14:43:21Z

apps/web-evals/src/actions/runs.ts

+		if (setActive !== "OK") {
+			// Another process may have set an active run, put this run back in the queue
+			console.log("Failed to set active run, requeueing")
+			await redis.lPush(getRunQueueKey(), runId.toString())


Is this intentional? Using lPush here means the run gets added to the front of the queue (LIFO) instead of the back (FIFO). This breaks the FIFO ordering when requeueing. Should this be rPush to maintain FIFO order?

Suggested change

-			await redis.lPush(getRunQueueKey(), runId.toString())
+			await redis.rPush(getRunQueueKey(), runId.toString())

roomote · 2025-09-14T14:43:21Z

apps/web-evals/src/actions/runs.ts

+			console.error(`Failed to spawn controller for run ${runId}:`, error)
+			// Clear active run and requeue on spawn failure
+			await redis.del(getActiveRunKey())
+			await redis.lPush(getRunQueueKey(), runId.toString())


Similar issue here - using lPush for requeueing on spawn failure breaks FIFO order. Consider using rPush to maintain the queue order:

Suggested change

-			await redis.lPush(getRunQueueKey(), runId.toString())
+			await redis.rPush(getRunQueueKey(), runId.toString())
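
Which end of the list counts as the "front" depends on how the queue is consumed, and the excerpts in this thread do not show the enqueue and dequeue calls. The sketch below assumes runs are enqueued with rPush and dequeued with lPop, and uses a plain array in place of a Redis list to compare the two requeue choices; the helper names are hypothetical.

```typescript
// Plain-array model of a Redis LIST; index 0 is the head (left end).
const lPush = (list: string[], value: string): void => {
	list.unshift(value)
}
const rPush = (list: string[], value: string): void => {
	list.push(value)
}
const lPop = (list: string[]): string | undefined => list.shift()

// Assumption: producers rPush and the dispatcher lPops, so the head is served next.
// Simulates "dequeue the head, fail to start it, then requeue it".
function requeueAfterFailedDispatch(
	initial: string[],
	requeue: (list: string[], value: string) => void,
): string[] {
	const list = [...initial]
	const runId = lPop(list)
	if (runId !== undefined) requeue(list, runId)
	return list
}

const withLPush = requeueAfterFailedDispatch(["1", "2", "3"], lPush) // ["1", "2", "3"]
const withRPush = requeueAfterFailedDispatch(["1", "2", "3"], rPush) // ["2", "3", "1"]
```

Under that assumption, lPush restores the original order while rPush sends the oldest run to the back. If the PR instead enqueues with lPush and dequeues with rPop, the conclusion reverses, so the right fix depends on the direction the queue actually uses.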

roomote · 2025-09-14T14:43:21Z

packages/evals/src/cli/runEvals.ts

+		// Set as active run with generous TTL (1 hour)
+		const setActive = await setActiveRun(nextRunId, 3600)
+
+		if (!setActive) {


If setting the active run fails here, the run has already been dequeued but isn't being requeued. This could cause the run to be lost. Consider adding error recovery:

Suggested change

-		if (!setActive) {
+		if (!setActive) {
+			// This shouldn't happen but handle it gracefully
+			logger.error(`Failed to set run ${nextRunId} as active, requeueing`)
+			// Requeue the run at the front since it was just dequeued
+			const redis = await redisClient()
+			await redis.lPush(getRunQueueKey(), nextRunId.toString())
+			return
+		}

roomote · 2025-09-14T14:43:21Z

apps/web-evals/src/components/home/run.tsx

+
+		fetchStatus()
+		// Refresh status every 5 seconds for non-completed runs
+		const interval = !run.taskMetricsId ? setInterval(fetchStatus, 5000) : null


Could we clear the interval immediately when run.taskMetricsId becomes truthy to prevent potential memory leaks? The current logic might continue polling briefly after completion:

Suggested change

-		const interval = !run.taskMetricsId ? setInterval(fetchStatus, 5000) : null
+		// Refresh status every 5 seconds for non-completed runs
+		let interval: NodeJS.Timeout | null = null
+		if (!run.taskMetricsId) {
+			interval = setInterval(() => {
+				// Check if run completed during interval
+				if (run.taskMetricsId && interval) {
+					clearInterval(interval)
+					interval = null
+				} else {
+					fetchStatus()
+				}
+			}, 5000)
+		}
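
One caveat with checking `run.taskMetricsId` inside the interval callback: in a React effect the callback closes over the `run` value from the render that created it, so if the effect's dependency array is only `[run.id]` the check may never observe completion. A common alternative is to list `run.taskMetricsId` as a dependency and rely on the effect's cleanup function to clear the interval. The sketch below shows that shape without React; `startStatusPolling` and the injected `Timers` interface are hypothetical and exist only so the example runs standalone.

```typescript
// Timer functions are injected so the sketch runs without a browser or React.
interface Timers {
	setInterval: (callback: () => void, ms: number) => number
	clearInterval: (handle: number) => void
}

// Fetches once, then polls unless the run is already complete.
// Returns a cleanup function. In a React effect this would be the effect's
// return value, with run.taskMetricsId listed in the dependency array so the
// effect re-runs (and the old interval is cleared) when the run completes.
function startStatusPolling(
	isCompleted: boolean,
	fetchStatus: () => void,
	timers: Timers,
	intervalMs = 5000,
): () => void {
	fetchStatus()
	if (isCompleted) return () => {}

	const handle = timers.setInterval(fetchStatus, intervalMs)
	return () => timers.clearInterval(handle)
}
```

In a browser, `window.setInterval` and `window.clearInterval` could be passed for `timers`, wrapped in arrow functions.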

roomote · 2025-09-14T14:43:21Z

apps/web-evals/src/actions/runs.ts

+	childProcess.unref()
+}
+
+export async function dispatchNextRun() {


The dispatch logic is duplicated between this file and packages/evals/src/cli/runEvals.ts. Could we extract this to a shared module to avoid maintenance issues and ensure consistency? This would make future updates easier and reduce the risk of the implementations diverging.

roomote · 2025-09-14T14:43:22Z

packages/evals/src/cli/redis.ts

+	return activeRunId ? parseInt(activeRunId, 10) : null
+}
+
+export const setActiveRun = async (runId: number, ttlSeconds: number = 3600): Promise<boolean> => {


The default TTL of 3600 seconds (1 hour) is hardcoded here and in multiple other places. Consider defining this as a constant at the module level for easier configuration:

Suggested change

-export const setActiveRun = async (runId: number, ttlSeconds: number = 3600): Promise<boolean> => {
+const DEFAULT_ACTIVE_RUN_TTL = 3600 // 1 hour in seconds
+export const setActiveRun = async (runId: number, ttlSeconds: number = DEFAULT_ACTIVE_RUN_TTL): Promise<boolean> => {

hannesrudolph

Suggestions:
1) Dispatcher lock: use a tokenized lock (store random token as value; release only if token matches) and increase/renew TTL to cover spawn time; see packages/evals/src/cli/runEvals.ts (https://github.com/RooCodeInc/Roo-Code/blob/b58ce4eecc598c5c554cfaab8d1a5c61743c7772/packages/evals/src/cli/runEvals.ts).
2) Atomicity: make dequeue -> setActive -> spawn atomic (WATCH/MULTI or Lua); consider BLMOVE/BRPOPLPUSH; see apps/web-evals/src/actions/runs.ts (https://github.com/RooCodeInc/Roo-Code/blob/b58ce4eecc598c5c554cfaab8d1a5c61743c7772/apps/web-evals/src/actions/runs.ts).
3) Active-run TTL: refresh alongside heartbeat so TTL cannot expire mid-run.
4) UI: avoid window.location.reload in cancel flow; prefer router.refresh or revalidatePath; see apps/web-evals/src/components/home/run.tsx.
5) Observability: add logs/metrics around dispatch decisions and lock acquisition.
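
The tokenized lock in the first suggestion can be illustrated as follows. This is a sketch under stated assumptions rather than a proposed patch: the in-memory `store` is a hypothetical stand-in for Redis string keys with expiry, the function names are invented, and the current time is passed in explicitly so the behaviour is deterministic.

```typescript
// Hypothetical in-memory stand-in for Redis string keys with expiry.
interface Entry {
	value: string
	expiresAt: number
}
const store = new Map<string, Entry>()

const LOCK_KEY = "evals:dispatcher:lock"

// Mimics SET key token NX PX ttl: succeeds only if the key is absent or expired.
function acquireLock(token: string, ttlMs: number, now: number): boolean {
	const current = store.get(LOCK_KEY)
	if (current && current.expiresAt > now) return false
	store.set(LOCK_KEY, { value: token, expiresAt: now + ttlMs })
	return true
}

// Extends the TTL, but only for the current owner (e.g. during a slow spawn).
function renewLock(token: string, ttlMs: number, now: number): boolean {
	const current = store.get(LOCK_KEY)
	if (!current || current.expiresAt <= now || current.value !== token) return false
	current.expiresAt = now + ttlMs
	return true
}

// Releases only if the caller still owns the lock. Against real Redis this
// compare-and-delete has to be atomic, which is typically done with a Lua script.
function releaseLock(token: string, now: number): boolean {
	const current = store.get(LOCK_KEY)
	if (!current || current.expiresAt <= now || current.value !== token) return false
	store.delete(LOCK_KEY)
	return true
}
```

The token check is what prevents a dispatcher whose lock has already expired from deleting a lock that a second dispatcher has since acquired. The same renew-if-owner idea applies to refreshing the active-run TTL alongside the heartbeat.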

hannesrudolph · 2025-09-14T19:27:36Z

Superseded by #7981: feat: global FIFO queue for Evals runs (#7966). Continuing discussion in #7981.

roomote bot requested review from cte, jr and mrubens as code owners September 14, 2025 14:39

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Sep 14, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Sep 14, 2025

github-project-automation bot moved this to Triage in Roo Code Roadmap Sep 14, 2025

dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files.) and enhancement (New feature or request) labels Sep 14, 2025

ellipsis-dev bot reviewed Sep 14, 2025

roomote bot commented Sep 14, 2025

roomote bot mentioned this pull request Sep 14, 2025

[ENHANCEMENT] Global FIFO queue for Evals runs (1 at a time) #7966

Closed

hannesrudolph added the Issue/PR - Triage label (New issue. Needs quick review to confirm validity and assign labels.) Sep 14, 2025

hannesrudolph reviewed Sep 14, 2025

This was referenced Sep 14, 2025

feat: implement global FIFO queue for Evals runs #7967

Closed

feat: global FIFO queue for Evals runs (#7966) #7981

Closed

hannesrudolph closed this Sep 14, 2025

github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Sep 14, 2025

github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 14, 2025
