Intermittent stalled jobs with Bull 4.x when using concurrency > 1, related to ioredis minor version drift #2802

@andressalro

Description

We hit an intermittent issue with Bull 4.x when concurrency > 1: jobs are marked as stalled but never transition to failed, and the workers appear to stop processing jobs altogether.
The same codebase works correctly with concurrency = 1.

After extensive comparison, the issue was traced to a minor version difference in ioredis, even though the Bull version and Node.js version were identical.

This issue may be relevant for other users running Bull 4.x in production environments (Kubernetes / Docker / Redis clusters).


Environment

Bull: 4.16.5
Node.js: v20.19.0


Observed Behavior

With concurrency = 1:
- Jobs are processed normally
- No stalled events

With concurrency = 3:
- Jobs enter active
- After some time, jobs are marked as stalled
- No failed event is emitted
- The worker stops processing further jobs ("silent stall")
- No application-level errors are logged

This behavior is intermittent but reproducible under load.
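
For anyone trying to confirm the same symptom, here is a minimal sketch of how the "stalled but never failed" pattern could be surfaced from queue events. It relies only on Bull's documented `stalled`, `completed` and `failed` queue events; the helper name `trackSilentStalls` and the `QueueLike` / `JobLike` shapes are hypothetical and not part of Bull or of our codebase.

```typescript
// Minimal shape this sketch needs. A Bull Queue should satisfy it because it
// is an EventEmitter; QueueLike and JobLike are hypothetical names.
interface QueueLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

interface JobLike {
  id: string | number;
}

// Hypothetical helper: remembers jobs that emitted "stalled" and forgets them
// once they later emit "completed" or "failed".
function trackSilentStalls(queue: QueueLike) {
  const stalledAt = new Map<string, number>(); // job id -> time of last stall

  queue.on("stalled", (job: JobLike) => {
    stalledAt.set(String(job.id), Date.now());
  });

  const settle = (job: JobLike) => {
    stalledAt.delete(String(job.id));
  };
  queue.on("completed", settle);
  queue.on("failed", settle);

  return {
    // Ids of jobs that stalled at least `olderThanMs` ago and never settled.
    unresolved(olderThanMs: number = 0): string[] {
      const now = Date.now();
      return Array.from(stalledAt.entries())
        .filter(([, at]) => now - at >= olderThanMs)
        .map(([id]) => id);
    },
  };
}
```

In a worker this would be attached to the real queue (`const tracker = trackSilentStalls(queue)`) and polled, for example logging `tracker.unresolved(60000)` on an interval. A non-empty result that keeps growing would match the silent stall described above.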


Key Finding

Two independent workers, built from the same monorepo and running the same Bull and Node.js versions, exhibited different runtime behavior due to a minor version difference in the Redis client (ioredis).

| Worker | Bull | ioredis |
|---|---|---|
| Worker A (stable) | 4.16.5 | 5.8.2 |
| Worker B (unstable) | 4.16.5 | 5.9.0 |

The issue was not related to application logic, Redis infrastructure, or Bull configuration, but rather to dependency resolution drift.
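
To illustrate how this kind of drift can go unnoticed, the sketch below compares the resolved versions of two builds and shows that a single caret range can admit both ioredis versions. The versions are the ones from the table above; `findDrift`, `satisfiesCaret` and the `^5.0.0` range are assumptions for illustration only (the range is not taken from Bull's `package.json`), and the caret check is deliberately simplified (major version >= 1, no prerelease tags).

```typescript
type Resolved = Record<string, string>; // package name -> installed version

interface Drift {
  name: string;
  a: string;
  b: string;
}

// Hypothetical helper: lists packages present in both builds whose installed
// versions differ.
function findDrift(a: Resolved, b: Resolved): Drift[] {
  const out: Drift[] = [];
  for (const name of Object.keys(a)) {
    if (name in b && a[name] !== b[name]) {
      out.push({ name, a: a[name], b: b[name] });
    }
  }
  return out;
}

function parseVersion(v: string): [number, number, number] {
  const [major, minor, patch] = v.split(".").map(Number);
  return [major, minor, patch];
}

// Simplified caret check: same major version, and version >= base.
function satisfiesCaret(version: string, base: string): boolean {
  const [vMajor, vMinor, vPatch] = parseVersion(version);
  const [bMajor, bMinor, bPatch] = parseVersion(base);
  if (vMajor !== bMajor) return false;
  if (vMinor !== bMinor) return vMinor > bMinor;
  return vPatch >= bPatch;
}

// Versions as reported in this issue.
const workerA: Resolved = { bull: "4.16.5", ioredis: "5.8.2" };
const workerB: Resolved = { bull: "4.16.5", ioredis: "5.9.0" };

const drift = findDrift(workerA, workerB);
console.log(drift);

// Both versions fall inside the same (hypothetical) caret range, so two
// installs at different times can legitimately resolve differently.
console.log(satisfiesCaret("5.8.2", "5.0.0"), satisfiesCaret("5.9.0", "5.0.0"));
```

Here `findDrift` reports only `ioredis`, since the Bull versions are identical. Running a comparison like this against each worker's lockfile or installed `node_modules` in CI is one way to catch the mismatch before it reaches production.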
