Heya - I'm running long-running jobs (jobTimeoutMs is set to 2 hours). When calling worker.close() to gracefully shut down the worker, I've noticed that if a) there are jobs in progress and b) those jobs do not finish within gracefulTimeoutMs, then after a restart the group lock is not released and the job stalls. I see error messages like the following:
⚠️ Blocking found group but reserve failed: group=connection:1 (reserve took 1ms)
Here's a minimal example that can be used to reproduce the issue:
import { Queue, Worker } from 'groupmq';
import Redis from 'ioredis';

const QUEUE_NAME = 'repro-queue';
const GROUP_ID = 'test-group';

const redis = new Redis("redis://127.0.0.1:6379");

const queue = new Queue({
  redis,
  namespace: QUEUE_NAME,
  jobTimeoutMs: 1000 * 60 * 5,
  logger: true
});

const worker = new Worker({
  queue,
  handler: async (job) => {
    console.log('Processing job', job.id);
    // Long running job (10 minutes)
    await new Promise((resolve) => setTimeout(resolve, 1000 * 60 * 10));
  },
  logger: true
});

worker.run();

process.on('SIGINT', async () => {
  console.log('exiting...');
  // Graceful timeout less than the job duration
  await worker.close(5 * 1000);
  await queue.close();
  process.exit(0);
});

const job = await queue.add({
  groupId: GROUP_ID,
  data: {}
});
console.log('Job added', job.id);

Steps:
- Run the above script and wait until you see "Processing job ..." in the console log
- Interrupt the script with Ctrl+C
- Start the script again
- You should see output such as the following:
⚠️ [groupmq:repro-queue] Blocking found group but reserve failed: group=test-group (reserve took 1ms)
⚠️ [repro-queue] STUCK WORKER ALERT: No activity for 180s
[repro-queue] 📊 Status Report:
[repro-queue] 🔢 Jobs Processed: 0
[repro-queue] ⏱️ Last Job: nevers ago
[repro-queue] 🚫 Consecutive Empty Reserves: 47
[repro-queue] 📞 Total Blocking Calls: 47
[repro-queue] 📈 Queue Stats: Active=1, Waiting=1, Delayed=0, Groups=test-group
[repro-queue] 🔄 Currently Processing: 0 jobs
[repro-queue] Fetching job (call #48, queue: 0/1)...
[groupmq:repro-queue] Starting blocking operation (timeout: 5s, consecutive empty: 0)
[groupmq:repro-queue] Blocking result: group=test-group, score=58779507660002 (took 1ms)
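
To illustrate what I suspect is going on, here is a toy in-memory model of a per-group lock that is taken on reserve and never released on shutdown. This is only a sketch of my guess, not groupmq's actual implementation: `GroupLocks`, `reserve`, and `release` are made-up names, and the idea that the lock expires after a TTL equal to jobTimeoutMs is an assumption on my part.

```typescript
// Toy in-memory model of a per-group lock. All names here are hypothetical;
// this is NOT groupmq's implementation.
class GroupLocks {
  // groupId -> timestamp (ms) at which the lock expires
  private expiry = new Map<string, number>();

  // Try to take the lock for a group; fails while an unexpired lock exists.
  reserve(groupId: string, now: number, ttlMs: number): boolean {
    const expiresAt = this.expiry.get(groupId);
    if (expiresAt !== undefined && expiresAt > now) return false;
    this.expiry.set(groupId, now + ttlMs);
    return true;
  }

  // Release the lock, as a worker would when a job completes or fails.
  release(groupId: string): void {
    this.expiry.delete(groupId);
  }
}

// Same value as jobTimeoutMs in the repro; using it as the lock TTL is an assumption.
const JOB_TIMEOUT_MS = 1000 * 60 * 5;
const locks = new GroupLocks();

// First run: the worker reserves the group, then exits without releasing.
const firstRun = locks.reserve('test-group', 0, JOB_TIMEOUT_MS);

// Restart 10 seconds later: the stale lock is still held, so reserve fails.
const afterRestart = locks.reserve('test-group', 10_000, JOB_TIMEOUT_MS);

// Only once the assumed TTL has elapsed can the group be reserved again.
const afterExpiry = locks.reserve('test-group', JOB_TIMEOUT_MS + 1, JOB_TIMEOUT_MS);

console.log({ firstRun, afterRestart, afterExpiry });
```

In this model the restarted worker keeps finding the group but cannot reserve it, which is at least consistent with the "Blocking found group but reserve failed" warnings and the Active=1, Waiting=1 stats above. If the lock were released (or the job re-queued) when the graceful timeout elapses, the second reserve would succeed right away.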