CHATHISTORY: clarify behaviour when messages have no consistent total ordering #540

@spb

The chathistory spec should be more explicit about what behaviour is guaranteed when using a msgid for before/after selection and crossing multiple servers. There is a SHOULD recommendation that message ordering is consistent within the lifetime of a connection, but this is not sufficient to cover important use cases.

The example use case here is that a client is disconnected from a server that shuts down, reconnects to another server in the network, and uses 'LATEST * msgid=' to populate new messages since the last one it saw. A naive implementation could produce duplicate messages, missing messages, or both in this situation.
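
To make the failure mode concrete, here is a minimal sketch of it. It is a toy model, not any real server: the message ids, the two server orderings, and the helper names `latest_after` and `splice` are all made up for illustration, and each "server" is just a list holding the same messages in its own local order.

```python
def latest_after(server_order, last_seen):
    """Naive selection: everything this server has stored after last_seen."""
    return server_order[server_order.index(last_seen) + 1:]


def splice(seen, server_order):
    """Fetch naively from the new server and report what goes wrong."""
    fetched = latest_after(server_order, seen[-1])
    duplicates = [m for m in fetched if m in seen]
    missing = [m for m in server_order if m not in seen and m not in fetched]
    return fetched, duplicates, missing


# m2 and m3 crossed on the network, so the two servers disagree on their order.
server_a = ["m1", "m2", "m3", "m4"]
server_b = ["m1", "m3", "m2", "m4"]

# Case 1: the client was on server A and last saw m2, then reconnects to B.
fetched_1, dup_1, miss_1 = splice(server_a[:2], server_b)

# Case 2: the client was on server A and last saw m3, then reconnects to B.
fetched_2, dup_2, miss_2 = splice(server_a[:3], server_b)
```

In case 1 the client silently never receives `m3`; in case 2 it receives `m2` a second time. Neither is detectable from the response alone.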

I see three broad approaches to this problem:

  1. Add words to the spec to clarify that messages since the last ID seen from a different connection may be missing messages or include duplicates. This is easiest for servers but means that clients can't splice together partial histories received from different connections.
  2. Allow duplicate messages, but require implementations to include all messages which are not guaranteed to have been delivered before the given msgid. Leave it to the implementation to work out how to do this.
  3. Develop and specify an explicit client-mediated synchronisation mechanism in the protocol. This would probably take the form of an additional message tag whose value encodes the complete information about which messages have been seen in some implementation-specific manner.

Personally I'd favour option 2:

  • I think the possibility of having what looks like a complete history but which is (silently and undetectably) missing messages is unacceptable (especially given prior experience with Matrix, which has suffered from this problem regularly).
  • A correct client implementation in option 1 (or the current state of the spec) would have to completely discard its existing channel history on every reconnection, and repopulate it fully based on the ordering which is consistent within the lifetime of the new connection. This seems unnecessarily wasteful.
  • Option 3 adds significant complexity for both clients and servers, and would be a possibility in my view only if server authors think that 2 is impractical.
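
For a sense of what option 2 would ask of clients, here is a rough sketch of the client side. It assumes msgids are unique and identical across servers, that the server has returned a superset of the undelivered messages, and that appending new messages in fetch order is acceptable; `splice_history` and the dict shape are hypothetical, not taken from any existing client.

```python
def splice_history(local, fetched):
    """Append fetched messages to local history, dropping msgids already held."""
    known = {msg["msgid"] for msg in local}
    merged = list(local)
    for msg in fetched:
        if msg["msgid"] not in known:
            known.add(msg["msgid"])
            merged.append(msg)
    return merged


# History kept from the old connection.
local = [{"msgid": "m1"}, {"msgid": "m2"}, {"msgid": "m3"}]

# Response from the new server: may repeat messages, but must not omit any.
fetched = [{"msgid": "m2"}, {"msgid": "m4"}]

merged = splice_history(local, fetched)
merged_ids = [msg["msgid"] for msg in merged]
```

The client keeps its existing history and only pays for a set lookup per fetched message, which is the saving over discarding and refetching everything on each reconnection.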
