MSC4345: Server key identity and room membership #4345
Implementation requirements:
- Server (preferably multiple)
- Client (preferably multiple)
- Complement tests
This MSC is inspired by the work of @kegsay in
[MSC4243](https://github.com/matrix-org/matrix-spec-proposals/pull/4243).

## Proposal
I should explain that we allow anyone with `invite` to set participation to `permitted`. But only people with `ban` can modify the participation to `denied` once a server has set their own participation from `permitted` to `accepted`.
We do this so that the proposal works well for private rooms and public rooms.
2. If `participation` is `denied`:
   1. If the `sender`'s power level is greater than or equal to the _ban level_
      and is greater than or equal to the target server's ambient power level, allow.
   2. Otherwise, reject.
3. If `participation` is `permitted`:
   1. If the _target server_'s current participation state is `accepted`, reject.
   2. If the _target server_'s current participation state is `denied`:
      1. If the origin of the current participation state is the target key, reject[^revocation].
      2. If the `sender`'s power level is less than the _ban
         level_ or is less than the target server's ambient power
         level, reject.
   3. If the `sender`'s power level is greater than or equal to
      the _invite level_, allow.
   4. Otherwise, reject.
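
For readers following this thread, the quoted rules can be sketched as one function. This is a minimal sketch, not the MSC's normative algorithm: it assumes the checks on a currently `denied` target nest under the `permitted` branch (one reading of the excerpt), and every name here (`CurrentParticipation`, `auth_participation`, the integer power-level parameters) is hypothetical. The `accepted` transition is not part of the quoted excerpt, so the sketch refuses to guess at it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CurrentParticipation:
    """Hypothetical view of the target server's current participation state."""
    participation: str        # "permitted", "accepted" or "denied"
    set_by_target_key: bool   # True if the target key itself authored this state


def auth_participation(
    new_participation: str,
    sender_pl: int,
    ban_level: int,
    invite_level: int,
    target_ambient_pl: int,
    current: Optional[CurrentParticipation],
) -> bool:
    """Return True to allow the event and False to reject it."""
    # Shared condition: sender meets the ban level and the target's ambient level.
    can_deny = sender_pl >= ban_level and sender_pl >= target_ambient_pl

    if new_participation == "denied":
        return can_deny

    if new_participation == "permitted":
        if current is not None and current.participation == "accepted":
            return False
        if current is not None and current.participation == "denied":
            if current.set_by_target_key:
                return False  # a self-authored denial is treated as revocation
            if not can_deny:
                return False
        return sender_pl >= invite_level

    # Other values (such as "accepted") are outside the quoted excerpt.
    raise ValueError(f"rule not covered by this sketch: {new_participation!r}")
```

Written this way, each branch ends in an explicit allow or reject, which is one way to avoid the dangling-condition reading raised later in the review.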
Reusing the `invite` and `ban` permissions could be inappropriate since this is equivalent to managing server ACLs in Matrix classic.
`invite` is probably ok to reuse since those are equivalent... `ban` isn't though
I don't know about this actually... it might not matter. ACL was always more risky because it can have irreversible effects
`denied` has now been changed to be equivalent to revocation. So there will be disruptive effects for any room version that doesn't incorporate MSC4348 or something like it. I.e., in a version where users are resident to a server key (rather than serverless), those users will all have to be rejoined to a room.
Needs reworking anyways, see #4345 (comment)

Please suggest specific algorithms to make this consistent.

## Potential issues
Server implementers should only allow server admins to revoke keys, and shouldn't let just any of their users in a room do it.
override this, and even if server admins set the server to deny,
the key owner can still revoke the key.

### The `/request_participation` endpoint
The subtext for this endpoint is that the requested server needs to create the participation event with one of its users. There may be objections to this, but in terms of client UI we're close enough with restricted join already in that the event says "bob joined via alice's server". So I'm not sure that argument alone can stand.
Needs reworking, see #4345 (comment)
reproducibility and preemptive access control for servers without the
use of a policy server.

### The `m.server.participation` state event, `state_key: ${origin_server_key}`
We still need to steal more prose from MSC4243 to describe the exact format for the server key and then use that consistently.
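
To make the shape under discussion concrete, here is a hedged sketch of how such a state event might be assembled. Only the event type, the use of the origin server key as `state_key`, the `participation` values, and the `advertised_domain` field come from this thread; the helper name `make_participation_event`, the optionality of `advertised_domain`, and the placeholder key string are assumptions, and the exact server key encoding is deliberately left open because it has not been specified yet.

```python
from typing import Any, Dict, Optional

# Participation values that appear in the discussion.
PARTICIPATION_VALUES = {"permitted", "accepted", "denied"}


def make_participation_event(
    origin_server_key: str,
    participation: str,
    advertised_domain: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a hypothetical m.server.participation state event body."""
    if participation not in PARTICIPATION_VALUES:
        raise ValueError(f"unknown participation value: {participation!r}")
    content: Dict[str, Any] = {"participation": participation}
    if advertised_domain is not None:
        # Unverified claim by the server; other servers check it locally.
        content["advertised_domain"] = advertised_domain
    return {
        "type": "m.server.participation",
        "state_key": origin_server_key,  # key format is not yet specified
        "content": content,
    }


event = make_participation_event(
    "placeholder-server-key", "permitted", advertised_domain="example.org"
)
```

Keeping the key opaque in the sketch means the structure stays valid whichever encoding is eventually borrowed from MSC4243.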
- I found this proposal quite hard to parse due to the addition of unfamiliar terminology which is inadequately described. Some terminology seems to not be described at all, e.g. "target server's ambient power level", where said terminology appears in auth rules.
- I found the auth rule changes hard to parse due to what I suspect is incorrect indentation which reads to me as dangling if statements.
- I think the proposal is actually two proposals, one to handle soft-failure in a more consistent manner, and one to make it the room admin's job to verify domains. I'm unsure why the author is combining them; it just increases the chance of the proposal being slowed down / rejected due to trying to do too much. One part of the proposal could have approval but because the other part doesn't, the whole thing gets blocked.
- It's unclear what purpose `advertised_domain` serves, and how servers and clients are supposed to use it. I believe the intention is that participating servers verify the server key by talking to that `advertised_domain`, but this isn't clear from the proposal.
EDIT: this comment previously mentioned that domain-to-key mappings were controlled either by any server or only by privileged servers, and thus it either was useless (any server could make the mappings) or centralised (only admins could), neither of which are desirable. It turns out that this proposal makes no domain-to-key mappings at all, and allows any server to admit server public keys if they are already joined to the room. As a result of lacking domain-to-key mappings, I'm unsure how this provides any real traceability guarantees, given this is mentioned as a key differentiator with MSC4243.
Related concerns:
`denied`. `participation` is protected from redaction.

A denied server must not be sent a `m.server.participation` event
unless the targeted server is already present within the room. This is
I think you're just stating existing behaviour (we don't tell servers who aren't in the room that we're talking about them)? If so, it's really confusing because it makes it seem like this is a really important part of the property, instead of just some edge case handling?
I wasn't aware that this was existing behaviour. Even if it is, it seems like it is a really important part of the property right?
Update: The MSC has been clarified substantially and more concepts that were described implicitly in the authorization rules have been explained explicitly. There is still work to be done to fix the auth rules, figure out what to do with the soft failure semantics and the introduced terminology for that, and probably a few other concerns.
Matrix rooms. Whereas MSC4243 only does so for individual user
accounts.

However, critically this MSC provides traceability to the origin of
Is this actually true? Nothing is verified in this proposal, so as per my sleeper server example you still have to play whack-a-mole with any malicious server. The only advantage this proposal provides is the implicit invite to public rooms, so you could feasibly do some investigation to see which server is letting in these other servers. It's probably not safe to actually take action against those servers though because that could be weaponised (e.g. I make a domain and valid server, I join via the victim server, then nuke the domain and spam freely, such that when investigators see who invited me they think the victim server is colluding when it isn't).
The exciting part of this proposal to me was the idea that this would provide better traceability to the origin of users, but sadly I just don't see how this proposal does this.
So let's be specific. The sleeper server would have to have the invite permission. So the rooms where this can happen are going to be ones set to invite only AND where it is a choice to have the invite be the default power level (this is true for rooms created with the `private` preset I believe). If someone does this, their key can be revoked, and all the keys that they added / others added. So it is a game of whack-a-mole until you change the power level for invite when you figure out you are being attacked. Traceability exists because we can still see which servers added which keys and deny them. This is not possible at all currently for rooms that do not use join-restricted to ensure that each join event is signed by an existing participant. Whether anyone has a valid domain/invalid/attempts impersonation is not strictly relevant because the denies happen with relation to the server keys and not a domain. And a domain isn't going to be shown to anyone unless they themselves have verified ownership.
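
The traceability argument in this comment can be illustrated with a small graph walk. This is a sketch under assumptions: it supposes a room's participation events have already been reduced to a mapping from each admitted server key to the key of the server that admitted it (the `admitted_by` mapping and the function name `keys_admitted_via` are hypothetical, not part of the MSC), and it only computes which keys a moderator might consider denying after one key is found to be malicious.

```python
from collections import deque
from typing import Dict, List, Set


def keys_admitted_via(root_key: str, admitted_by: Dict[str, str]) -> Set[str]:
    """Return every key admitted directly or transitively through root_key."""
    # Invert the mapping: admitter -> keys it admitted.
    children: Dict[str, List[str]] = {}
    for key, admitter in admitted_by.items():
        children.setdefault(admitter, []).append(key)

    found: Set[str] = set()
    queue = deque([root_key])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child != root_key and child not in found:
                found.add(child)
                queue.append(child)
    return found


# Hypothetical room: A admitted B and D, B admitted C, X admitted E.
admitted_by = {"B": "A", "C": "B", "D": "A", "E": "X"}
suspects = keys_admitted_via("A", admitted_by)
```

Note that this is reactive: it identifies keys after the fact and says nothing about whether the admitting server acted in good faith, which is the concern raised about a victim server being blamed.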
Perhaps we should break away from the `invite` power level though and require each participant to be explicitly granted this permission by a room admin.
I don't think you'd hold the same standard for the invite power level in private rooms though? Since currently this same attack can happen in all matrix rooms where invite is default?
> (e.g. I make a domain and valid server, I join via the victim server, then nuke the domain and spam freely, such that when investigators see who invited me they think the victim server is colluding when it isn't).
@kegsay I don't understand this bit. Why would investigators conclude that spam events are originating from the victim server? What evidence would they use to conclude that?
I never said originating, I said colluding. The victim server "let in" the spam server.

This allows both public and private rooms to benefit from DAG
reproducibility and preemptive access control for servers without the
use of a policy server.
> preemptive access control
I don't think this is true.
How can it be preemptive when any server can admit malicious servers without performing proper domain checks? In other words, how are the following two scenarios materially any different?
- There's a public room with honest servers.
- Many malicious servers join the room.
vs:
- There's a public room with honest servers.
- A malicious server joins the room who behaves correctly.
- Many malicious servers join the room via the first malicious server.
It certainly allows reactive access control as you can find out who let in the bad guys and kick them out, but this is after-the-fact.
Only policy servers can provide preemptive access control because all servers in the room have to trust it to perform the correct authorisation checks.
> How can it be preemptive when any server can admit malicious servers without performing proper domain checks? In other words, how are the following two scenarios materially any different?
They can't, they need the invite power level under the current proposal... please stop reading my proposal in bad faith.
This is unfortunately meta, but: I don't think your proposal is being read in bad faith, @Gnuxie. It's best not to assume bad faith upon mere questions. These things are hard to reason about as it stands and it's quite possible that one may not be able to immediately draw all conclusions that follow from a set of premises. (I'm thinking of myself when I say this, for one.) Please just answer the questions in good faith and let's move on.
In this instance, I don't think Kegan literally meant any server, so we can modify the claim to "any server with an invite power level" without removing the concern. I believe what Kegan is trying to say is that you can still have a situation where such a server deliberately doesn't check the domain -> server key mapping and invites the server key into the room.
You can then notice this happened after the fact, but it contradicts the ability to preemptively deny access to the room for a particular domain. So in other words, this is not preemptive access control wrt to the domain. Is this wrong?
> I believe what Kegan is trying to say is that you can still have a situation where such a server deliberately doesn't check the domain -> server key mapping and invites the server key into the room.
That is fine though. The `advertised_domain` that is provided is not an attestation or proof that the domain has been verified by anyone, at any time. And we don't provide any mechanism for that. That's not what the proposal does or intends to do.
Separate to the idea of servers as keys, we do propose local (to each server) verification of domain ownership in order to allow clients to show domains in UI rather than server keys. None of that information is derived from the DAG other than servers locally testing the `advertised_domain` of accepted participations to attempt local verification of domain ownership, by verifying that the domain advertises a public key associated with the private key that signed their participation event.
This process is described here https://github.com/Gnuxie/matrix-doc/blob/gnuxie/server-key-identity-and-room-membership/proposals/4345-server-key-identity-and-room-membership.md#changes-to-_matrixkeyv3query.
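
As a rough illustration of the local check described in this comment, the sketch below treats key retrieval as an injected function so that no real endpoint or response format is assumed. The names `verified_domain`, `display_user_id` and `fetch_published_keys` are hypothetical; the linked section of the proposal is the authority on how keys are actually queried. The fallback display form follows the `@alice:server-key` example used elsewhere in this thread.

```python
from typing import Callable, Iterable, Optional


def verified_domain(
    server_key: str,
    advertised_domain: Optional[str],
    fetch_published_keys: Callable[[str], Iterable[str]],
) -> Optional[str]:
    """Return the domain only if it currently advertises server_key."""
    if not advertised_domain:
        return None
    try:
        published = set(fetch_published_keys(advertised_domain))
    except Exception:
        # Any lookup failure leaves the domain unverified for this server.
        return None
    return advertised_domain if server_key in published else None


def display_user_id(localpart: str, server_key: str, domain: Optional[str]) -> str:
    """Show the domain when locally verified, otherwise the raw server key."""
    return f"@{localpart}:{domain if domain else server_key}"


def fake_fetch(domain: str) -> Iterable[str]:
    # Stand-in for a real key query; assumed data for illustration only.
    return {"example.org": ["key-1"]}.get(domain, [])


shown = display_user_id(
    "alice", "key-1", verified_domain("key-1", "example.org", fake_fetch)
)
```

Because the result is computed per server and never written to the DAG, two servers may legitimately disagree about whether a domain is verified, which matches the claim that `advertised_domain` is not an attestation.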
> You can then notice this happened after the fact, but it contradicts the ability to preemptively deny access to the room for a particular domain. So in other words, this is not preemptive access control wrt to the domain. Is this wrong?
We don't provide it in regards to a domain, we provide it in regards to a server. Do we still allow political crisis within rooms because people with the relevant permissions who are trusted to be responsible for the room's administration can be malicious? Yes.
Although I don't know about this. Annoyingly this does mean that room admins can't actually invite a server key, and it's not clear that they would be able to do that anyway given that they probably won't know which key a homeserver wants to use in this room. So the idea of permitting a homeserver to participate pre-emptively cannot be expressed through the `m.server.participation` event but through other means (like inviting a user). And this signals that when the homeserver requests participation with a key it should be permitted.
The solution of this problem needs to be worked out through exploration of MSC4348 or a compromising MSC that works closely to the current room model (if such a compromising MSC can even exist under the terms of this proposal, which might have already been written off).
Hm, I'm not sure we need to do all this. An invitation on the member level is enough. Then combined with a new power level for permit we don't actually give anyone with invite permission the ability to add an infinite number of keys. The only issue is that we probably still need a request step so that the joining server has total control over its `unverified_domain` property (from the start, and not just since accept). It means we can also remove the accept step.
The requested state then needs to be traceable to the requesting server and the receiver of the request.
So we'd have to modify `/request_participation` to include something like restricted join's `join_authorised_via_users_server` for the requested server's key. And we'd probably need the whole make_request, send_request handshake because for invite only rooms these will need to be authorized via an invitation.
@@ -0,0 +1,429 @@
# MSC4345: Server key identity and room membership
There is a fundamental fork-in-the-road choice here, when comparing this MSC to ones like MSC4243 (per-user keys). Both MSCs turn the DAG into a self-verifying data structure which is not reliant on DNS. The key difference is how they do this.
- This MSC does this at the server level, introducing new state events to effectively represent "is this server key in the room?".
- MSC4243 does this at the user level, which doesn't need new state events.
Neither MSC verifies the claimed domain for each server key: this is allowed to split brain which results in ugly user IDs appearing on clients (`@alice:server-key` for this MSC or `@user-key:invalid` for MSC4243) which need to be patched up when the domain is verified, so both end up looking like `@alice:domain`.
The fork in the road is ultimately which direction the protocol wants to go:
- If we want to double down on "Matrix is a federation protocol, users must have some chaperone server when sending events" then this proposal fits that ideal better than MSC4243 because it bakes that "chaperone server" semantics explicitly into the protocol with their own set of keys.
- If we want to double down on "Matrix is a peer-to-peer protocol, which happens to currently be between servers but could be directly between users" then MSC4243 fits that ideal better because it promotes user keys to being the only key required to send valid events.
Note that this proposal can support P2P, by adding another layer of keys on top of the server keys (which is what MSC4348 does).
Federation protocols fundamentally need the server to know more information about their users, so it leaks more metadata than peer-to-peer protocols. For example, whilst this proposal can support P2P via MSC4348, it needs to know the PLs of all users on each server to know how to enforce the server participation event (the ambient power level in this MSC). Events also need to expose their sending node information in order to check the signatures on the "chaperone server/node". In contrast, P2P protocols can end up turning servers into store-and-forward nodes for encrypted data and that's it.
> If we want to double down on "Matrix is a federation protocol, users must have some chaperone server when sending events" then this proposal fits that ideal better than MSC4243 because it bakes that "chaperone server" semantics explicitly into the protocol with their own set of keys.
This is not true. Please stop misrepresenting this proposal. #4348 shows how P2P matrix would work within this proposal. And it is a cleaner solution that does not tie accounts to any domain or server, MSC4243 does.
Please also consider that you have had months to develop 4243 behind a placeholder in private. With support and consultation from others. I've written this proposal very quickly solo. Without praise or encouragement.
I still extend an invitation to you to develop this MSC with me collaboratively.
events through their users in both MSCs and we don't intend to change
that in future MSCs in this series either.

### Terminiology
Suggested change: `### Terminiology` → `### Terminology`
Update: auth rules have been cleaned up and references to MSC4349 to explain causal barriers have been made.
This proposal encodes a special auth rule for `denied` participation to
avoid soft failure and the problems discussed in MSC4104.

## Security considerations
Please allow keys to be scoped to a number of events. E.g. 1000 before rotation is required. See also #4353
Probably worth requiring that authorization events have a separate limit to normal events.
Probably worth doing the hard work of specifying keys as capabilities omg.
This is the awesome secure Matrix you can have! But you don't!
This is probably scope creep and should be a follow up MSC, see the-draupnir-project/planning#51
We should add a note in alternatives or issues before closing the thread.
We can't allow room admins to "unban" servers because it might allow them to make use of stolen keys. See security considerations. We're not confident in any other solutions to this.
Make it explicit that this domain is unverified.
context of statements is lost. Where authorization rules are
inconsistent this text takes precedence.

- Reminder: In this MSC _Server_ refers to the controller of an ed25519
We should make this more consistent by using server key controller and homeserver deployment.
Rendered
Signed-off-by: Gnuxie [email protected]