Skip to content

Duplicate message_ids within the same channel #346

@sagium

Description

@sagium

Description

We had investigated a weird issue happening on the channel level. On several channels we have identified that it is possible to have duplicate message ids.

The result of this is something like

# example
[
  {global_id: 1, message_id: 1, channel: "channel_name", data: ""},
  {global_id: 2, message_id: 2, channel: "channel_name", data: ""},
  {global_id: 3, message_id: 3, channel: "channel_name", data: ""},
  {global_id: 57, message_id: 3, channel: "channel_name", data: ""},
  {global_id: 4, message_id: 4, channel: "channel_name", data: ""},
  {global_id: 58, message_id: 4, channel: "channel_name", data: ""},
  {global_id: 59, message_id: 5, channel: "channel_name", data: ""}
]

To reach that state you have to reset the backlog_id_key (message_id) key without resetting the backlog_key (channel backlog) itself.

The only way message_id gets reset as far as I know is through expiration inside the transactional LUA script but in many cases we didn't expect this to expire at all since new messages where added daily to the channel (default expiration is 7.days).

The code that affects us the most right now is the following
https://github.com/discourse/message_bus/blob/main/lib/message_bus/backends/redis.rb#L228-L238

since it gets the first of the items having the same score without taking into account the global_id order. In the usual case you will always have 1 item there, but in our case it happens to be 2, so it's very possible that we can consume stale data.

Questions

  • Are there other ways to reset or expire just the backlog_id_key in message bus channel lifecycle? That could explain this inconsistent state of the channel?
  • Did you ever face anything similar? The LUA script is nearly identical through the gem versions.
  • Is it reasonable to handle this edge case and always return the last item for a message_id based on the global_id on the redis.zrangebyscore backlog_key, message_id, message_id with a PR?

Versions

redis: 6.x
redis gem: 4.8.1
message_bus: 3.3.8 (we are upgrading to 4.3.8 since having duplicates results in infinite loops on the old version 🙃 )

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions