From 0e6997d14b25ca06ac56f51b61bafe0a373c8ac2 Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Thu, 23 Feb 2023 16:45:45 +0000 Subject: [PATCH 1/8] MSC3970: Scope transaction IDs to devices --- proposals/3970-device-scope-txnid.md | 198 +++++++++++++++++++++++++++ 1 file changed, 198 insertions(+) create mode 100644 proposals/3970-device-scope-txnid.md diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md new file mode 100644 index 00000000000..2aba213badd --- /dev/null +++ b/proposals/3970-device-scope-txnid.md @@ -0,0 +1,198 @@ +# MSC3970: Scope transaction IDs to devices + +Transaction identifiers in the Client-Server API are currently scoped to the concept of a "client session" which, when +refresh tokens are used, can span a sequence of access tokens. + +The spec [reads](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers): + +> The scope of a transaction ID is a “client session”, where that session is identified by a particular access token. +> When refreshing an access token, the transaction ID’s scope is retained. This means that if a client with token `A` +> uses `TXN1` as their transaction ID, refreshes the token to `B`, and uses `TXN1` again it’ll be assumed to be a +> duplicate request and ignored. If the client logs out and back in between the `A` and `B` tokens, `TXN1` could be used +> once for each. + +The transaction IDs appear in two parts of the Client-Server API spec: + +1. As a identifier to allow the homeserver to make some `PUT` endpoints [idempotent](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers) +2. An unsigned field in the event data model for a client to tell if it sent an event or not. a.k.a. solving the ["local echo"](https://spec.matrix.org/v1.6/client-server-api/#local-echo) problem + +For reference, the `PUT` endpoitns that have the a `{txnId}` param are: + +- [`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid) +- [`PUT /_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid) +- [`PUT /_matrix/client/v3/sendToDevice/{eventType}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid) + +## Proposal + +It is proposed that the scope of transaction identifiers be changed from a "client session" to a "device". + +A "device" is typically represented by a `device_id` elsewhere in the spec. + +For idempotency, this means the homeserver changing the method of identifying a request from: + +- (`client session`, `HTTP path of request which includes the transaction ID`) + +to: + +- (`device_id`, `HTTP path of request which includes the transaction ID`) + +For local echo, the homeserver would now include the `transaction_id` in the event data when it is serving a sync +request from the same `device_id` as determined from the access token. + +## Potential issues + +### This is technically a breaking change to the spec + +The main "issue" I see with this proposal is that this is technically a breaking change to the spec. + +Because a device ID could have multiple sequences of access tokens associated with it, this proposal would widen the +scope of the transaction ID. + +Therefore it could potentially lead to a request being treated as "new" where before it would have been identified as a +retransmission and deduplicated. + +However, the evidence suggests that nothing would be impacted in reality. + +Some evidences have been collated to support this claim: + +#### 1. Data from the matrix.org Homeserver suggests the change would have no impact + +The matrix.org Homeserver is a reasonable size deployment and could be considered reasonably representative of the +diversity of Matrix client. + +The Synapse homeserver that runs matrix.org maintains a `event_txn_id` table that contains a rolling 24 hour window of +(`user_id`, `token_id`, `room_id`, `txn_id`) tuples. + +Having analysed the contents of the table, it appears that there are no repeated transaction IDs for a given user, token +and room. + +There are other `PUT` endpoints for which transaction IDs are not in the `event_txn_id` table, however because the event + +As such, the widening of the scope from token to device would not have caused any issues during the periods sampled. + +For reference the following is the schema of the `event_txn_id` table: + +```sql + Table "matrix.event_txn_id" + Column | Type | Collation | Nullable | Default +-------------+--------+-----------+----------+--------- + event_id | text | | not null | + room_id | text | | not null | + user_id | text | | not null | + token_id | bigint | | not null | + txn_id | text | | not null | + inserted_ts | bigint | | not null | +Indexes: + "event_txn_id_token_id" btree (token_id) + "event_txn_id_event_id" UNIQUE, btree (event_id) + "event_txn_id_ts" btree (inserted_ts) + "event_txn_id_txn_id" UNIQUE, btree (room_id, user_id, token_id, txn_id) +Foreign-key constraints: + "event_txn_id_event_id_fkey" FOREIGN KEY (event_id) REFERENCES matrix.events(event_id) ON DELETE CASCADE + "event_txn_id_token_id_fkey" FOREIGN KEY (token_id) REFERENCES matrix.access_tokens(id) ON DELETE CASCADE +``` + +And the query to look for repeated transaction IDs: + +```sql +SELECT e1.txn_id, LEFT(e1.user_id, 5) AS user_id, e1.token_id, e2.token_id, e1.inserted_ts, e2.inserted_ts FROM matrix.event_txn_id e1, matrix.event_txn_id e2 WHERE e1.txn_id = e2.txn_id AND e1.event_id <> e2.event_id AND e1.event_id < e2.event_id AND e1.user_id = e2.user_id AND e1.room_id = e2.room_id ORDER BY e1.token_id; + txn_id | user_id | token_id | token_id | inserted_ts | inserted_ts +--------+---------+----------+----------+-------------+------------- +(0 rows) +``` + +#### 2. Conduit homeserver already scopes transaction IDs to devices + +As highlighted by the new Complement [tests](https://github.com/matrix-org/complement/pull/613) the Conduit homeserver +is already scoping transaction IDs to devices. + +I can't find a related issue [listed](https://gitlab.com/famedly/conduit/-/issues), so presumably this non-compliant +behaviour isn't causing a known issue for admins and users of the Conduit homeserver? + +#### 3. Synapse homeserver only checks for retransmits over a 30-60 minute window + +The Synapse homeserver uses an in-memory cache to check for idempotency of requests. The cache is vacated after +[30-60 minutes](https://github.com/matrix-org/synapse/blob/adac949a417d064958039ae0918b97388413c824/synapse/rest/client/transactions.py#L50-L52) +and is not persisted between restarts (or across a cluster). + +This means that for many existing deployments idempotency is only actually enforced over a 30-60 minutes and not +eternally as the spec might suggest. + +#### 4. Synapse homeserver only supports local echo for the previous 24 hours + +The Synapse homeserver only supports local echo (by the presence of `transaction_id` on sync responses) for the previous +24 hours. This is because the `event_txn_id` table is only kept for 24 hours. + +Again, this suggests that in reality the local echo semantics are not preserved eternally as the spec might suggest. + +### Is the "device" concept the right level of abstraction to use? + +One way to look at it is that device is already widely used in the end-to-end encryption parts of the spec and so why +isn't it suitable for this use case too? + +### What about two clients masquerading as a single device ID? + +I don't know if this actually works in practice. If this was a concern then it could be mitigated by clarifying in the +spec that if a client wishes to submit requests using the same `device_id` as another client session that it should +choose transaction identifiers that are unique to that client session. + +## Alternatives + +### Do nothing + +We could leave the transaction ID scope as is. + +However, it makes it difficult to implement a features like [MSC3861: Matrix architecture change to delegate authentication via OIDC](https://github.com/matrix-org/matrix-spec-proposals/pull/3861) +as the concept of a "client session" doesn't really exist in OIDC. + +As noted above, at least one homeserver implementation is also not implementing the spec as it is today. + +It also turns out that the current implementation of refresh tokens in Synapse breaks the transaction ID semantics +already and needs to be [fixed](https://github.com/matrix-org/synapse/issues/15141). + +### Make a backwards compatible change + +A backwards compatible alternative could be something like: + +1. For idempotency have clients opt-in to a new scope of transaction ID, but support the current semantics too for compatibility +2. Have clients opt-in (e.g. request param on the sync endpoint) to receiving transaction ID for all events in the sync response and make the client responsible for identifying which messages they sent + +The disadvantage of this is that we create a load of extra maintenance work to support both semantics for a period of +time for (empirically) no gain in return. + +## Security considerations + +A malicious client can adopt an existing device ID of a user. This could possibly allow some kind of denial of service attack. + +However, if such an attack where possible it would be possible to do so without this MSC as device IDs are crucial to +the implementation of end-to-end encryption. + +## Other recommendations + +I'm not suggesting that these recommendations are address in this proposal, but more notes for future proposals or spec clarifications. + +### Clarification on idempotency semantics + +I have separately prepared a [spec PR](https://github.com/matrix-org/matrix-spec/pull/1449) to clarify some of the +idempotency semantics that doesn't modify the spec but is useful to understand the context of this proposal. + +### Clarification on transaction ID time scope + +I also suggest that the spec be clarified over what time periods the transaction ID is scoped for such that clients can +be aware. This cloud simply be to say that the time period is not defined and so may vary by implementation. + +### Suggest a lease naive transaction ID format + +I think we should sense to update the recommendation on the format of transaction IDs from: + +> how is not specified; a monotonically increasing integer is recommended + +To something more unique (in my research I have found some clients use a seconds since epoch which doesn't seem ideal) + +## Unstable prefix + +None needed. + +## Dependencies + +None. From 03ac931e8f02d168a6a297ec4d501091c93db731 Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Tue, 28 Feb 2023 18:02:44 +0000 Subject: [PATCH 2/8] Line length --- proposals/3970-device-scope-txnid.md | 143 ++++++++++++++++----------- 1 file changed, 87 insertions(+), 56 deletions(-) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 2aba213badd..15289908443 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -1,22 +1,28 @@ # MSC3970: Scope transaction IDs to devices -Transaction identifiers in the Client-Server API are currently scoped to the concept of a "client session" which, when -refresh tokens are used, can span a sequence of access tokens. +Transaction identifiers in the Client-Server API are currently scoped to the +concept of a "client session" which, when refresh tokens are used, can span a +sequence of access tokens. The spec [reads](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers): -> The scope of a transaction ID is a “client session”, where that session is identified by a particular access token. -> When refreshing an access token, the transaction ID’s scope is retained. This means that if a client with token `A` -> uses `TXN1` as their transaction ID, refreshes the token to `B`, and uses `TXN1` again it’ll be assumed to be a -> duplicate request and ignored. If the client logs out and back in between the `A` and `B` tokens, `TXN1` could be used +> The scope of a transaction ID is a “client session”, where that session is +> identified by a particular access token. When refreshing an access token, the +> transaction ID’s scope is retained. This means that if a client with token `A` +> uses `TXN1` as their transaction ID, refreshes the token to `B`, and uses +> `TXN1` again it’ll be assumed to be a duplicate request and ignored. If the +> client logs out and back in between the `A` and `B` tokens, `TXN1` could be used > once for each. The transaction IDs appear in two parts of the Client-Server API spec: -1. As a identifier to allow the homeserver to make some `PUT` endpoints [idempotent](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers) -2. An unsigned field in the event data model for a client to tell if it sent an event or not. a.k.a. solving the ["local echo"](https://spec.matrix.org/v1.6/client-server-api/#local-echo) problem +1. As a identifier to allow the homeserver to make some `PUT` endpoints +[idempotent](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers) +2. An unsigned field in the event data model for a client to tell if it sent an +event or not. a.k.a. solving the +["local echo"](https://spec.matrix.org/v1.6/client-server-api/#local-echo) problem -For reference, the `PUT` endpoitns that have the a `{txnId}` param are: +For reference, the `PUT` endpoints that have the a `{txnId}` param are: - [`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid) - [`PUT /_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.6/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid) @@ -24,11 +30,13 @@ For reference, the `PUT` endpoitns that have the a `{txnId}` param are: ## Proposal -It is proposed that the scope of transaction identifiers be changed from a "client session" to a "device". +It is proposed that the scope of transaction identifiers be changed from a +"client session" to a "device". A "device" is typically represented by a `device_id` elsewhere in the spec. -For idempotency, this means the homeserver changing the method of identifying a request from: +For idempotency, this means the homeserver changing the method of identifying a +request from: - (`client session`, `HTTP path of request which includes the transaction ID`) @@ -36,20 +44,22 @@ to: - (`device_id`, `HTTP path of request which includes the transaction ID`) -For local echo, the homeserver would now include the `transaction_id` in the event data when it is serving a sync -request from the same `device_id` as determined from the access token. +For local echo, the homeserver would now include the `transaction_id` in the +event data when it is serving a sync request from the same `device_id` as +determined from the access token. ## Potential issues ### This is technically a breaking change to the spec -The main "issue" I see with this proposal is that this is technically a breaking change to the spec. +The main "issue" I see with this proposal is that this is technically a breaking +change to the spec. -Because a device ID could have multiple sequences of access tokens associated with it, this proposal would widen the -scope of the transaction ID. +Because a device ID could have multiple sequences of access tokens associated +with it, this proposal would widen the scope of the transaction ID. -Therefore it could potentially lead to a request being treated as "new" where before it would have been identified as a -retransmission and deduplicated. +Therefore it could potentially lead to a request being treated as "new" where +before it would have been identified as a retransmission and deduplicated. However, the evidence suggests that nothing would be impacted in reality. @@ -57,18 +67,21 @@ Some evidences have been collated to support this claim: #### 1. Data from the matrix.org Homeserver suggests the change would have no impact -The matrix.org Homeserver is a reasonable size deployment and could be considered reasonably representative of the -diversity of Matrix client. +The matrix.org Homeserver is a reasonable size deployment and could be considered +reasonably representative of the diversity of Matrix client. -The Synapse homeserver that runs matrix.org maintains a `event_txn_id` table that contains a rolling 24 hour window of +The Synapse homeserver that runs matrix.org maintains a `event_txn_id` table +that contains a rolling 24 hour window of (`user_id`, `token_id`, `room_id`, `txn_id`) tuples. -Having analysed the contents of the table, it appears that there are no repeated transaction IDs for a given user, token -and room. +Having analysed the contents of the table, it appears that there are no repeated +transaction IDs for a given user, token and room. -There are other `PUT` endpoints for which transaction IDs are not in the `event_txn_id` table, however because the event +There are other `PUT` endpoints for which transaction IDs are not in the +`event_txn_id` table, however because the event -As such, the widening of the scope from token to device would not have caused any issues during the periods sampled. +As such, the widening of the scope from token to device would not have caused +any issues during the periods sampled. For reference the following is the schema of the `event_txn_id` table: @@ -103,37 +116,43 @@ SELECT e1.txn_id, LEFT(e1.user_id, 5) AS user_id, e1.token_id, e2.token_id, e1.i #### 2. Conduit homeserver already scopes transaction IDs to devices -As highlighted by the new Complement [tests](https://github.com/matrix-org/complement/pull/613) the Conduit homeserver +As highlighted by the new Complement +[tests](https://github.com/matrix-org/complement/pull/613) the Conduit homeserver is already scoping transaction IDs to devices. -I can't find a related issue [listed](https://gitlab.com/famedly/conduit/-/issues), so presumably this non-compliant -behaviour isn't causing a known issue for admins and users of the Conduit homeserver? +I can't find a related issue [listed](https://gitlab.com/famedly/conduit/-/issues), +so presumably this non-compliant behaviour isn't causing a known issue for +admins and users of the Conduit homeserver? #### 3. Synapse homeserver only checks for retransmits over a 30-60 minute window -The Synapse homeserver uses an in-memory cache to check for idempotency of requests. The cache is vacated after +The Synapse homeserver uses an in-memory cache to check for idempotency of +requests. The cache is vacated after [30-60 minutes](https://github.com/matrix-org/synapse/blob/adac949a417d064958039ae0918b97388413c824/synapse/rest/client/transactions.py#L50-L52) and is not persisted between restarts (or across a cluster). -This means that for many existing deployments idempotency is only actually enforced over a 30-60 minutes and not -eternally as the spec might suggest. +This means that for many existing deployments idempotency is only actually +enforced over a 30-60 minutes and not eternally as the spec might suggest. #### 4. Synapse homeserver only supports local echo for the previous 24 hours -The Synapse homeserver only supports local echo (by the presence of `transaction_id` on sync responses) for the previous -24 hours. This is because the `event_txn_id` table is only kept for 24 hours. +The Synapse homeserver only supports local echo (by the presence of +`transaction_id` on sync responses) for the previous 24 hours. This is because +the `event_txn_id` table is only kept for 24 hours. -Again, this suggests that in reality the local echo semantics are not preserved eternally as the spec might suggest. +Again, this suggests that in reality the local echo semantics are not preserved +eternally as the spec might suggest. ### Is the "device" concept the right level of abstraction to use? -One way to look at it is that device is already widely used in the end-to-end encryption parts of the spec and so why -isn't it suitable for this use case too? +One way to look at it is that device is already widely used in the end-to-end +encryption parts of the spec and so why isn't it suitable for this use case too? ### What about two clients masquerading as a single device ID? -I don't know if this actually works in practice. If this was a concern then it could be mitigated by clarifying in the -spec that if a client wishes to submit requests using the same `device_id` as another client session that it should +I don't know if this actually works in practice. If this was a concern then it +could be mitigated by clarifying in the spec that if a client wishes to submit +requests using the same `device_id` as another client session that it should choose transaction identifiers that are unique to that client session. ## Alternatives @@ -142,52 +161,64 @@ choose transaction identifiers that are unique to that client session. We could leave the transaction ID scope as is. -However, it makes it difficult to implement a features like [MSC3861: Matrix architecture change to delegate authentication via OIDC](https://github.com/matrix-org/matrix-spec-proposals/pull/3861) +However, it makes it difficult to implement a features like +[MSC3861: Matrix architecture change to delegate authentication via OIDC](https://github.com/matrix-org/matrix-spec-proposals/pull/3861) as the concept of a "client session" doesn't really exist in OIDC. -As noted above, at least one homeserver implementation is also not implementing the spec as it is today. +As noted above, at least one homeserver implementation is also not implementing +the spec as it is today. -It also turns out that the current implementation of refresh tokens in Synapse breaks the transaction ID semantics -already and needs to be [fixed](https://github.com/matrix-org/synapse/issues/15141). +It also turns out that the current implementation of refresh tokens in Synapse +breaks the transaction ID semantics already and needs to be +[fixed](https://github.com/matrix-org/synapse/issues/15141). ### Make a backwards compatible change A backwards compatible alternative could be something like: -1. For idempotency have clients opt-in to a new scope of transaction ID, but support the current semantics too for compatibility -2. Have clients opt-in (e.g. request param on the sync endpoint) to receiving transaction ID for all events in the sync response and make the client responsible for identifying which messages they sent +1. For idempotency have clients opt-in to a new scope of transaction ID, but +support the current semantics too for compatibility +2. Have clients opt-in (e.g. request param on the sync endpoint) to receiving +transaction ID for all events in the sync response and make the client +responsible for identifying which messages they sent -The disadvantage of this is that we create a load of extra maintenance work to support both semantics for a period of -time for (empirically) no gain in return. +The disadvantage of this is that we create a load of extra maintenance work to +support both semantics for a period of time for (empirically) no gain in return. ## Security considerations -A malicious client can adopt an existing device ID of a user. This could possibly allow some kind of denial of service attack. +A malicious client can adopt an existing device ID of a user. This could +possibly allow some kind of denial of service attack. -However, if such an attack where possible it would be possible to do so without this MSC as device IDs are crucial to -the implementation of end-to-end encryption. +However, if such an attack where possible it would be possible to do so without +this MSC as device IDs are crucial to the implementation of end-to-end encryption. ## Other recommendations -I'm not suggesting that these recommendations are address in this proposal, but more notes for future proposals or spec clarifications. +I'm not suggesting that these recommendations are address in this proposal, but +more notes for future proposals or spec clarifications. ### Clarification on idempotency semantics -I have separately prepared a [spec PR](https://github.com/matrix-org/matrix-spec/pull/1449) to clarify some of the -idempotency semantics that doesn't modify the spec but is useful to understand the context of this proposal. +I have separately prepared a [spec PR](https://github.com/matrix-org/matrix-spec/pull/1449) +to clarify some of the idempotency semantics that doesn't modify the spec but is +useful to understand the context of this proposal. ### Clarification on transaction ID time scope -I also suggest that the spec be clarified over what time periods the transaction ID is scoped for such that clients can -be aware. This cloud simply be to say that the time period is not defined and so may vary by implementation. +I also suggest that the spec be clarified over what time periods the transaction +ID is scoped for such that clients can be aware. This cloud simply be to say +that the time period is not defined and so may vary by implementation. ### Suggest a lease naive transaction ID format -I think we should sense to update the recommendation on the format of transaction IDs from: +I think we should sense to update the recommendation on the format of +transaction IDs from: > how is not specified; a monotonically increasing integer is recommended -To something more unique (in my research I have found some clients use a seconds since epoch which doesn't seem ideal) +To something more unique (in my research I have found some clients use a seconds +since epoch which doesn't seem ideal) ## Unstable prefix From 4b939809d6f8109106bc752342377187ffd24f85 Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Tue, 28 Feb 2023 18:09:20 +0000 Subject: [PATCH 3/8] Correction to claims from Synapse --- proposals/3970-device-scope-txnid.md | 25 +++---------------------- 1 file changed, 3 insertions(+), 22 deletions(-) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 15289908443..057aaba6b70 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -77,10 +77,10 @@ that contains a rolling 24 hour window of Having analysed the contents of the table, it appears that there are no repeated transaction IDs for a given user, token and room. -There are other `PUT` endpoints for which transaction IDs are not in the -`event_txn_id` table, however because the event +n.b. not all `PUT` endpoints contribute to the table but I think the high-volume +ones do -As such, the widening of the scope from token to device would not have caused +As such, the widening of the scope from token to device should not have caused any issues during the periods sampled. For reference the following is the schema of the `event_txn_id` table: @@ -124,25 +124,6 @@ I can't find a related issue [listed](https://gitlab.com/famedly/conduit/-/issue so presumably this non-compliant behaviour isn't causing a known issue for admins and users of the Conduit homeserver? -#### 3. Synapse homeserver only checks for retransmits over a 30-60 minute window - -The Synapse homeserver uses an in-memory cache to check for idempotency of -requests. The cache is vacated after -[30-60 minutes](https://github.com/matrix-org/synapse/blob/adac949a417d064958039ae0918b97388413c824/synapse/rest/client/transactions.py#L50-L52) -and is not persisted between restarts (or across a cluster). - -This means that for many existing deployments idempotency is only actually -enforced over a 30-60 minutes and not eternally as the spec might suggest. - -#### 4. Synapse homeserver only supports local echo for the previous 24 hours - -The Synapse homeserver only supports local echo (by the presence of -`transaction_id` on sync responses) for the previous 24 hours. This is because -the `event_txn_id` table is only kept for 24 hours. - -Again, this suggests that in reality the local echo semantics are not preserved -eternally as the spec might suggest. - ### Is the "device" concept the right level of abstraction to use? One way to look at it is that device is already widely used in the end-to-end From 7b5d34378ad97063f4663a2b698e9e70798a5a61 Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Wed, 1 Mar 2023 13:49:42 +0000 Subject: [PATCH 4/8] Apply suggestions from code review Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> --- proposals/3970-device-scope-txnid.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 057aaba6b70..7daeac76015 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -61,16 +61,15 @@ with it, this proposal would widen the scope of the transaction ID. Therefore it could potentially lead to a request being treated as "new" where before it would have been identified as a retransmission and deduplicated. -However, the evidence suggests that nothing would be impacted in reality. +However, some evidence has been collated which suggests that nothing would be impacted +in reality: -Some evidences have been collated to support this claim: +#### 1. Data from the matrix.org homeserver suggests the change would have no impact -#### 1. Data from the matrix.org Homeserver suggests the change would have no impact +The `matrix.org` homeserver is a reasonable size deployment and could be considered +reasonably representative of the diversity of Matrix clients. -The matrix.org Homeserver is a reasonable size deployment and could be considered -reasonably representative of the diversity of Matrix client. - -The Synapse homeserver that runs matrix.org maintains a `event_txn_id` table +Synapse maintains a `event_txn_id` table that contains a rolling 24 hour window of (`user_id`, `token_id`, `room_id`, `txn_id`) tuples. @@ -80,7 +79,7 @@ transaction IDs for a given user, token and room. n.b. not all `PUT` endpoints contribute to the table but I think the high-volume ones do -As such, the widening of the scope from token to device should not have caused +This suggests that the widening of the scope from token to device would not have caused any issues during the periods sampled. For reference the following is the schema of the `event_txn_id` table: From 340908d58ee301382b971d9573c153b390f6d5ab Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Wed, 1 Mar 2023 14:18:01 +0000 Subject: [PATCH 5/8] Apply suggestions from code review Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> --- proposals/3970-device-scope-txnid.md | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 7daeac76015..84dea32ee51 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -55,11 +55,16 @@ determined from the access token. The main "issue" I see with this proposal is that this is technically a breaking change to the spec. -Because a device ID could have multiple sequences of access tokens associated -with it, this proposal would widen the scope of the transaction ID. +A device ID could have multiple sequences of access tokens associated +with it (since device ID can be specified as a parameter to +[`/login`](https://spec.matrix.org/v1.6/client-server-api/#post_matrixclientv3login)). -Therefore it could potentially lead to a request being treated as "new" where -before it would have been identified as a retransmission and deduplicated. +For example, a "bot" implementation might masquerade as the same "device" despite +calling `/login` on every run by passing the same device ID. Such a device might +also use the same transaction ID on each run. + +Therefore it could potentially lead to a request being treated as a duplicate +where previously it would have been treated as a distinct request. However, some evidence has been collated which suggests that nothing would be impacted in reality: From b861d41e990f8d258f1980164e9a8e4b68f54f20 Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Mon, 3 Apr 2023 15:04:25 -0400 Subject: [PATCH 6/8] Update proposals/3970-device-scope-txnid.md Co-authored-by: Hubert Chathi --- proposals/3970-device-scope-txnid.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 84dea32ee51..1b6036e8dae 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -180,7 +180,7 @@ this MSC as device IDs are crucial to the implementation of end-to-end encryptio ## Other recommendations -I'm not suggesting that these recommendations are address in this proposal, but +I'm not suggesting that these recommendations are addressed in this proposal, but more notes for future proposals or spec clarifications. ### Clarification on idempotency semantics From 82e0883d889e16f544d0c8499a46b493b0442de4 Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Wed, 5 Apr 2023 11:09:15 +0100 Subject: [PATCH 7/8] Make section on recommended transaction ID format make sense --- proposals/3970-device-scope-txnid.md | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 1b6036e8dae..88eda966178 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -56,8 +56,8 @@ The main "issue" I see with this proposal is that this is technically a breaking change to the spec. A device ID could have multiple sequences of access tokens associated -with it (since device ID can be specified as a parameter to -[`/login`](https://spec.matrix.org/v1.6/client-server-api/#post_matrixclientv3login)). +with it (since device ID can be specified as a parameter to +[`/login`](https://spec.matrix.org/v1.6/client-server-api/#post_matrixclientv3login)). For example, a "bot" implementation might masquerade as the same "device" despite calling `/login` on every run by passing the same device ID. Such a device might @@ -195,15 +195,23 @@ I also suggest that the spec be clarified over what time periods the transaction ID is scoped for such that clients can be aware. This cloud simply be to say that the time period is not defined and so may vary by implementation. -### Suggest a lease naive transaction ID format +### Recommend a less naive transaction ID format -I think we should sense to update the recommendation on the format of -transaction IDs from: +Currently the format of a transaction ID is not specified, but a recommendation +is [given](https://spec.matrix.org/v1.6/client-server-api/#transaction-identifiers): -> how is not specified; a monotonically increasing integer is recommended +> After the [previous] request has finished, the {txnId} value should be changed +> (how is not specified; **a monotonically increasing integer is recommended**). -To something more unique (in my research I have found some clients use a seconds -since epoch which doesn't seem ideal) +I think this is an unnecessarily naive recommendation. + +In most clients environments a pseudo-random number generator will be available and so +could be used to generate a UUID/ULID or similar. + +As an aside, in my research I have found some clients use a "seconds since epoch" as a +transaction ID which introduces a limit on the maximum possible event transmission rate +per room to once per second. Perhaps a better recommendation could help prevent the +such behaviour being introduced in future. ## Unstable prefix From 6fc4975c52ec3ed72b9234aa01c81551850c840d Mon Sep 17 00:00:00 2001 From: Hugh Nimmo-Smith Date: Fri, 14 Apr 2023 10:02:00 -0400 Subject: [PATCH 8/8] Make it clearer in the intro section what the problem is --- proposals/3970-device-scope-txnid.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/proposals/3970-device-scope-txnid.md b/proposals/3970-device-scope-txnid.md index 88eda966178..33c6024f6e9 100644 --- a/proposals/3970-device-scope-txnid.md +++ b/proposals/3970-device-scope-txnid.md @@ -14,6 +14,11 @@ The spec [reads](https://spec.matrix.org/v1.6/client-server-api/#transaction-ide > client logs out and back in between the `A` and `B` tokens, `TXN1` could be used > once for each. +The "client session" scope concept described can be complicated to implement. + +This MSC proposes that the scope of a transaction identifier is changed to something +that is easier to implement whilst maintaining required transaction semantics. + The transaction IDs appear in two parts of the Client-Server API spec: 1. As a identifier to allow the homeserver to make some `PUT` endpoints