Clarification on DLQ usage and handling of user-caused delivery failures in SQS setups #571

We’re running Outpost with SQS and noticed some DLQ behavior we’d like to understand better.

### Context

Our SQS DLQ regularly receives messages that correspond to **normal user-caused delivery failures** (e.g., invalid webhook URLs or endpoints returning 4xx). These aren’t system issues; they are expected delivery failures that Outpost retries and tracks.

However, because the worker never reaches a “successful processing” state for these events, SQS eventually moves them to the DLQ.
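To make the mechanism concrete, here is a minimal simulation of how we understand the SQS redrive policy: a message that is received more than `maxReceiveCount` times without being deleted gets moved to the DLQ. This is plain Python with no AWS calls; `SimulatedQueue`, `run_worker`, and the handler are hypothetical names for illustration, not Outpost code, and the visibility timeout is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)  # compare messages by identity, not by field values
class Message:
    body: str
    receive_count: int = 0


@dataclass
class SimulatedQueue:
    """Toy stand-in for an SQS queue with a redrive policy (assumption: no visibility timeout)."""
    max_receive_count: int
    messages: List[Message] = field(default_factory=list)
    dlq: List[Message] = field(default_factory=list)

    def receive(self) -> Optional[Message]:
        # Each receive bumps the count; once it exceeds the limit,
        # the message is moved to the DLQ instead of being handed out.
        while self.messages:
            msg = self.messages[0]
            msg.receive_count += 1
            if msg.receive_count > self.max_receive_count:
                self.dlq.append(self.messages.pop(0))
                continue
            return msg
        return None

    def delete(self, msg: Message) -> None:
        # Deleting is the "ack": the message will never be redelivered.
        self.messages.remove(msg)


def run_worker(queue: SimulatedQueue, handler: Callable[[Message], bool]) -> None:
    # The worker only deletes a message when the handler reports success.
    while (msg := queue.receive()) is not None:
        if handler(msg):
            queue.delete(msg)
        # On failure the message stays in the queue and is received again.


if __name__ == "__main__":
    q = SimulatedQueue(max_receive_count=3)
    q.messages = [Message("ok"), Message("bad-url")]
    run_worker(q, lambda m: m.body == "ok")
    print([m.body for m in q.dlq])
```

In this sketch the `bad-url` message is never deleted, so it ends up in the DLQ even though nothing in the pipeline itself is broken, which matches what we observe.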

### The question

We think the DLQ should represent **pipeline-level failures** (parsing errors, crashes, internal outages), while user misconfigurations should be treated as handled delivery failures within Outpost.

So we want to understand:

1. Is it expected that user-level delivery failures end up in the DLQ?
2. Should these failures be acknowledged as “processed” from the queue’s perspective, even if the webhook call failed?
3. Is there a recommended approach to avoid mixing user misconfigurations with actual Outpost/system failures in the DLQ?
4. Would enabling failure alerts (`ALERT_CALLBACK_URL`) change how messages are acknowledged?
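To illustrate the behavior we have in mind for questions 2 and 3, here is a rough sketch of an acknowledgement policy. All names (`Outcome`, `classify`, `should_ack`) are hypothetical and not taken from Outpost; the split between destination and system failures is our assumption, and where 5xx responses or timeouts belong is a judgment call we are not sure about.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    DELIVERED = "delivered"
    DESTINATION_FAILURE = "destination_failure"  # e.g. invalid URL, 4xx response
    SYSTEM_FAILURE = "system_failure"            # e.g. parsing error, crash, internal outage


def classify(status_code: Optional[int], internal_error: bool) -> Outcome:
    """Map a delivery attempt to an outcome.

    status_code is None when no HTTP response was received
    (for example, the destination URL could not be resolved).
    internal_error is True when the worker itself failed.
    """
    if internal_error:
        return Outcome.SYSTEM_FAILURE
    if status_code is not None and 200 <= status_code < 300:
        return Outcome.DELIVERED
    return Outcome.DESTINATION_FAILURE


def should_ack(outcome: Outcome) -> bool:
    # Ack (delete from the queue) whenever the pipeline did its job,
    # even if the destination rejected the webhook. Only system failures
    # are left for redelivery and, eventually, the DLQ.
    return outcome is not Outcome.SYSTEM_FAILURE


if __name__ == "__main__":
    for status, internal in [(200, False), (404, False), (None, False), (None, True)]:
        outcome = classify(status, internal)
        print(status, internal, outcome.value, should_ack(outcome))
```

With a policy like this, destination failures would still be retried and tracked by Outpost's own delivery logic, but the queue message would count as processed.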

### Goal

We would like the DLQ to signal **real processing issues**, not just tenants misconfiguring their webhook destinations. Any clarification or guidance would be greatly appreciated!

Thanks for the great project!
