Skip to content

[GOWS] - GOWS gRPC Connection Drops Due to Slow Event Processing (Backlog) #1759

@bergpinheiro

Description

@bergpinheiro

GOWS gRPC Connection Drops Due to Slow Event Processing (Backlog)

Description

When processing multiple messages with media attachments, the GOWS engine starts dropping events due to slow listener processing. This eventually causes gRPC connection failures (code 14 UNAVAILABLE) and session disconnections. The issue occurs because media downloads block event processing, creating a backlog that the GOWS server cannot handle.

Environment

  • Engine: GOWS
  • Container: Docker (hostname: d5121aedd1ea)
  • Affected Sessions: Multiple sessions simultaneously
  • Trigger: High volume of messages with media attachments

Symptoms

  1. GOWS logs show repeated "Dropping event for slow listener" warnings
  2. Multiple gRPC stream errors with code 14 UNAVAILABLE: Connection dropped
  3. Sessions disconnect and require restart
  4. Occurs specifically when processing messages with media attachments
  5. System receives SIGTERM during shutdown, but connections are already broken

Timeline of Events

  1. 10:36 PM - GOWS starts dropping events for slow listener (session 1_10_550000...)
  2. 10:36 PM - Media download processing observed for message ACED3D1B1E9E7E2A6373C2A2193FA815
  3. 10:38 PM - Multiple gRPC streams fail with "14 UNAVAILABLE: Connection dropped" (sessions 1_15, 1_43, 1_10 1_27)
  4. 10:38 PM - System receives SIGTERM and initiates graceful shutdown
  5. 10:38 PM - All sessions stop with errors during shutdown

Logs

Initial Warning (10:36PM)

2025/12/29 10:36PM 40 pid=7 hostname=d5121aedd1ea engine=gows msg=[gRPC] Dropping event for slow listener 17dd12a5-bfce-4378-80a5-46b2981144c2, session 1_10_550000...
2025/12/29 10:36PM 40 pid=7 hostname=d5121aedd1ea engine=gows msg=[gRPC] Dropping event for slow listener 17dd12a5-bfce-4378-80a5-46b2981144c2, session 1_10_550000...
2025/12/29 10:36PM 40 pid=7 hostname=d5121aedd1ea engine=gows msg=[gRPC] Dropping event for slow listener 17dd12a5-bfce-4378-80a5-46b2981144c2, session 1_10_550000...
2025/12/29 10:36PM 40 pid=7 hostname=d5121aedd1ea engine=gows msg=[gRPC] Dropping event for slow listener 17dd12a5-bfce-4378-80a5-46b2981144c2, session 1_10_550000...

Media Processing

2025/12/29 10:36PM 40 pid=7 hostname=d5121aedd1ea engine=gows msg=[Session/4_24_5585992134070/Client] Unavailable message ACED3D1B1E9E7E2A6373C2A2193FA815 from 24XXXXXXXXXXXXX@lid (type: "")
2025/12/29 10:36PM 30 pid=7 hostname=d5121aedd1ea reqId=236379 session=1_10_550000... name=MediaManager msg=The message ACED3D1B1E9E7E2A6373C2A2193FA815 has media, downloading it...
2025/12/29 10:36PM 30 pid=7 hostname=d5121aedd1ea reqId=236379 session=1_10_550000... name=MediaManager msg=The media from 'ACED3D1B1E9E7E2A6373C2A2193FA815' has been saved.

gRPC Connection Failures (10:38PM)

2025/12/29 10:38PM 50 pid=7 hostname=d5121aedd1ea reqId=222906 session=1_11_550000... grpc=stream id=e93b937b68ca4fba9e4e60b8a11dd98f err={"type":"Error","message":"14 UNAVAILABLE: Connection dropped","stack":"Error: 14 UNAVAILABLE: Connection dropped\n    at callErrorFromStatus (/app/node_modules/@grpc/grpc-js/build/src/call.js:32:19)\n    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:359:73)\n    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:327:181)\n    at /app/node_modules/@grpc/grpc-js/build/src/resolving-call.js:135:78\n    at process.processTicksAndRejections (node:internal/process/task_queues:84:11)","code":14,"details":"Connection dropped","metadata":{}} msg=Stream error

2025/12/29 10:38PM 50 pid=7 hostname=d5121aedd1ea reqId=224072 session=1_12_550000... grpc=stream id=d75c3d9e533941cc8fa3578949ec0cd5 err={"type":"Error","message":"14 UNAVAILABLE: Connection dropped","stack":"Error: 14 UNAVAILABLE: Connection dropped\n    at callErrorFromStatus (/app/node_modules/@grpc/grpc-js/build/src/call.js:32:19)\n    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:359:73)\n    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:327:181)\n    at /app/node_modules/@grpc/grpc-js/build/src/resolving-call.js:135:78\n    at process.processTicksAndRejections (node:internal/process/task_queues:84:11)","code":14,"details":"Connection dropped","metadata":{}} msg=Stream error

2025/12/29 10:38PM 50 pid=7 hostname=d5121aedd1ea reqId=236379 session=1_44_550000..... grpc=stream id=f4a11fbcdd124fbfaf558bcf7b59941f err={"type":"Error","message":"14 UNAVAILABLE: Connection dropped","stack":"Error: 14 UNAVAILABLE: Connection dropped\n    at callErrorFromStatus (/app/node_modules/@grpc/grpc-js/build/src/call.js:32:19)\n    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:359:73)\n    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:327:181)\n    at /app/node_modules/@grpc/grpc-js/build/src/resolving-call.js:135:78\n    at process.processTicksAndRejections (node:internal/process/task_queues:84:11)","code":14,"details":"Connection dropped","metadata":{}} msg=Stream error

Shutdown Sequence

2025/12/29 10:38PM 30 pid=7 hostname=d5121aedd1ea name=Bootstrap msg=SIGTERM received
2025/12/29 10:38PM 30 pid=7 hostname=d5121aedd1ea name=SessionManager msg=Stopping all sessions...
2025/12/29 10:38PM 40 pid=7 hostname=d5121aedd1ea name=SessionManager session=1_11_550000.... msg=Error while stopping session
2025/12/29 10:38PM 30 pid=7 hostname=d5121aedd1ea name=SessionManager session=1_11_550000.... msg=Session has been stopped.
...
2025/12/29 10:38PM 30 pid=7 hostname=d5121aedd1ea name=SessionManager msg=All sessions have been stopped.
2025/12/29 10:38PM 30 pid=7 hostname=d5121aedd1ea engine=gows msg=Stopping GOWS subprocess...
2025/12/29 10:38PM 30 pid=7 hostname=d5121aedd1ea engine=gows msg=GOWS subprocess stopped

Root Cause Analysis

The issue is caused by unbounded concurrency in message processing at src/core/engines/gows/session.gows.core.ts (lines 481, 493):

messagesFromMe$ = messagesFromMe$.pipe(
  mergeMap((msg) => this.processIncomingMessage(msg, true)),
  // No concurrency limit!
);

Why This Happens

  1. Multiple messages with media arrive simultaneously
  2. Each message triggers processIncomingMessage(msg, true) which downloads media synchronously
  3. Media downloads can take several seconds (timeout: 10 minutes per download)
  4. mergeMap without concurrency limit allows unlimited parallel downloads
  5. Event processing queue fills up faster than it can be consumed
  6. GOWS detects the slow listener and starts dropping events
  7. After ~2 minutes of backlog, gRPC connections fail with UNAVAILABLE errors

Steps to Reproduce

  1. Set up WAHA with GOWS engine
  2. Have multiple active sessions receiving messages
  3. Send/receive multiple messages with media attachments simultaneously (e.g., images, videos)
  4. Observe GOWS logs for "Dropping event for slow listener"
  5. Shortly after, gRPC connections will fail

Expected Behavior

  • Event processing should keep up with incoming messages
  • Media downloads should not block the event stream
  • gRPC connections should remain stable

Actual Behavior

  • Events are dropped due to slow processing
  • gRPC connections fail with code 14 UNAVAILABLE
  • Sessions disconnect and need to be restarted

Waha info:

GOWS (2025.11.2 PLUS)

patron:PRO

Metadata

Metadata

Assignees

No one assigned

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions