MSC4099: Participation based authorization for servers in the Matrix DAG #4099
Draft: Gnuxie wants to merge 12 commits into matrix-org:main from Gnuxie:gnuxie/server-participation
Commits:
- f80a042 Hmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm (Gnuxie)
- c2507db Now we need to figure out how banning works. (Gnuxie)
- 6339885 General edit before removing capability considerations (Gnuxie)
- fa19c21 Remove capabilities consideration. (Gnuxie)
- 4ff4ffb Rename for MSC number (Gnuxie)
- 1ae7ed4 4099 (Gnuxie)
- 8de04f1 Add issue with receiving a PDU from an unjoined room (Gnuxie)
- df96811 A general edit (Gnuxie)
- 24d814c aspell (Gnuxie)
- d3930aa clobbered myself with aspell (Gnuxie)
- fa09e37 Update the abstract for skim readers (Gnuxie)
- f72a4c8 introduction tweaks (Gnuxie)
              | Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,173 @@ | ||
| # MSC4099: Participation based authorization for servers in the Matrix DAG | ||
|  | ||
| This is a proposal for the representation of servers and their basic responsibilities in the Matrix | ||
| DAG. This MSC does not define or amend a state resolution algorithm, since there are several possible | ||
| routes that can be explored with other MSCs. We make considerations to allow this proposal to be | ||
| implemented on top of the existing `m.room.member`/`m.room.power_levels` centric authorization and | ||
| state resolution algorithms. | ||
|  | ||
| The key merits of this proposal are: | ||
| - Authorization rules are changed so that there is no way for a server to append an event to the DAG | ||
| until they have been explicitly named in a new authorization event, `m.server.participation`. | ||
| - All events have to be authorized with a corresponding `m.server.participation` event for their | ||
| origin server. | ||
| - Room admins and their tools now have the ability to examine joining servers before making a decision | ||
| to permit them to participate in the room. This can be thought of as the equivalent of "knocking for servers"[^knocking]. | ||
| - We envisage that for most rooms, permitting servers to participate will happen quickly and automatically, | ||
| probably before the server even attempts to join the room if they are already well known and trusted within | ||
| the community[^tooling-for-accepting]. | ||
|  | ||
| Additional merits that can be explored as an indirect result of this proposal: | ||
| - A way for servers to preemptively load and cache rooms that their users are likely to join. | ||
| - A way for servers to advertise to other servers about rooms that their users are likely to join, | ||
| so that these rooms can be optionally pre-loaded and cached. | ||
|  | ||
| This is a more specific component and redesign of the general idea of [MSC3953: Server capability DAG](https://github.com/Gnuxie/matrix-doc/blob/gnuxie/capability-dag/proposals/3953-capability-dag.md). | ||
| Review comment: does this replace #3953? Reply: I don't think so, and I don't know yet. Is that alright? | ||
|  | ||
| ## Context | ||
|  | ||
| ### The role of the existing `m.room.member` for a user | ||
|  | ||
| In order to develop ideas about how to represent server membership in the room DAG, | ||
| which is NOT what this proposal does, we need to understand the responsibilities that `m.room.member` | ||
| already has: | ||
|  | ||
| - A representation of the desire for the user's server to be informed of new events. | ||
| - The capability for the server to participate in the room on behalf of the user, | ||
| being used to authorize the user's events. | ||
| - The capability for the user to backfill in relation to visibility. | ||
| + it is unclear to me whether `m.room.history_visibility` restricts a server's ability to backfill or not. | ||
| - A representation of the user's profile and participation information, who they are, why they are in the room, avatar, and displayname. | ||
|  | ||
| ## Proposal | ||
|  | ||
| ### Considerations for amending the `make_join` handshake | ||
|  | ||
| When a joining server is instructed by a client to join a room, the joining server sends an | ||
| EDU, `m.server.knock`, to any available resident server that the joining server is aware of. | ||
|  | ||
| The server then waits until it receives an `m.server.participation` event, which will contain the | ||
| joining server's name within the `state_key`. | ||
| The `m.server.participation` event can be received from any resident server that is participating | ||
| in the room. However, the `m.server.participation` event should only be sent by the room admins. | ||
|  | ||
| When `m.server.participation`'s `participation` field has the value `permitted`, then | ||
| the joining server can begin to use `make_join` and `send_join`. However, `send_join` could be amended | ||
| in another MSC so that a server is able to produce an `m.server.subscription` configuration event, | ||
| rather than an `m.room.member` event for a specific user. This is so that a server can begin the | ||
| process of joining the room in advance of a user accepting or joining the room via a client, | ||
| in order to improve the response time. | ||
|  | ||
| ### The `m.server.knock` EDU | ||
|  | ||
| `m.server.knock` is an EDU to make a client in a resident server aware of the joining server's intent | ||
| to join the room. This client will usually be a room admin. A client can then arbitrarily research | ||
| the reputation of the joining server before deciding whether resident servers of the room should | ||
| accept any PDU whatsoever from the joining server. Currently in room V11 and below, it is not | ||
| possible for room operators to stop a new server from sending multiple PDUs to a room without first | ||
| knowing of, and anticipating a malicious server's existence. This is a fact which has already | ||
| presented major problems in Matrix's history. | ||
|  | ||
| This proposal does not just aim to remove the risk of spam joins for members from the same server, | ||
| but also spam joins from many servers at the same time. While it is seen as technically difficult | ||
| to acquire user accounts from a large number of Matrix homeservers, it is still possible and | ||
| has happened before. For example, servers could become compromised with a common exploit in a server | ||
| implementation. Existing servers that have weak registration requirements could also be exploited, | ||
| and this has happened already in Matrix's past. | ||
|  | ||
| Having an EDU allows us to accept a knock arbitrarily with clients, and more accurately automated bots | ||
| like Draupnir. We can then arbitrarily research the reputation of the server before deciding | ||
| to accept. This also conveniently keeps auth_rules around restricted join rules clean and simple, | ||
| because all logic can be deferred to clients. | ||
|  | ||
| The `m.server.knock` EDU can be treated as idempotent by the receiver, although the effect should | ||
| expire after a duration that is subjective to the receiver. | ||
|  | ||
| ``` | ||
| { | ||
| "content": { | ||
| "room_id": "!example:example.com" | ||
| }, | ||
| "edu_type": "m.server.knock" | ||
| } | ||
| ``` | ||
|  | ||
| ### The `m.server.participation` event, `state_key: ${serverName}` | ||
|  | ||
| This is a capability that allows the server named in the `state_key` to send `m.server.subscription`. | ||
| It is sent to accept the `m.server.knock` EDU. The event can also be used to make a server aware of | ||
| a room's existence, so that it can optionally preload and cache the room before the server's users | ||
| discover it. | ||
|  | ||
| `participation` can be one of `permitted` or `denied`. When `participation` is `permitted`, the server | ||
| is able to join the room. When `participation` is `denied`, the server is not allowed to send | ||
| any PDUs into the room. A denied server must not be sent an `m.server.participation` event unless | ||
| the targeted server is already present within the room, or it has attempted to knock. | ||
| This is to prevent malicious servers being made aware of rooms that they have not yet discovered. | ||
|  | ||
| A `reason` field can be present alongside `participation` in order to explain the reason why | ||
| a server has been `denied`. This reason is to be shown to the knocking, or previously present | ||
| server, so that they can understand what has happened. | ||
|  | ||
| ### The `m.server.subscription` event, `state_key: ${serverName}` | ||
|  | ||
| This is a configuration event that uses the `m.server.participation` capability to manage | ||
| the server's subscription to the event stream. This is NOT an authorization event. | ||
|  | ||
| This is distinct from `m.server.participation` because this event is exclusively controlled | ||
| by the participating server, and other servers cannot modify this event[^spec-discussion]. | ||
| This allows the server to have exclusive control over whether it is to be sent events (where | ||
| its participation is still `permitted`). We specifically do not want to merge this with | ||
| `participation` to avoid having to specialise state resolution for write conflicts, | ||
| or "force joining" servers back into rooms. This allows a server to remain permitted to participate, | ||
| but opt out of receiving further events from this room, and can then optionally stop replicating the | ||
| room and delete all persistent data relating to it (should all clients have also forgotten the room). | ||
|  | ||
| ### Considerations for event authorization | ||
|  | ||
| All events that a server can send need to be authorized by an `m.server.participation` event | ||
| with the field `participation` with a value of `permitted`. | ||
|  | ||
| ## Potential issues | ||
|  | ||
| ### Permitting, then denying a malicious server. | ||
|  | ||
| The principle that a malicious server can never send a PDU into the room can be worked | ||
| around if the server manages to have its `participation` `permitted` at some point in the room's | ||
| history. It can then create PDUs that reference this stale state, and all the other | ||
| participating servers have no option but to soft fail these events | ||
| (ignoring that we don't block them at the network level). | ||
| While this is still a huge improvement over the existing situation, we need suggestions for how | ||
| to stop this at the event authorization level. I'm begging for advice. | ||
|  | ||
| ### Unclear if a joining server can receive a PDU from a room that it is not joined to | ||
|  | ||
| The amendments to the join handshake described in this MSC mean that a server has to wait | ||
| for a PDU, `m.server.participation`, before it has attempted to join the room beyond sending an EDU. | ||
| It's not clear to me whether this is currently possible or changes are required to federation send. | ||
|  | ||
| ### Surely the joining server needs to send the EDU via resident servers, so `make_join` has to be modified | ||
|  | ||
| The EDU `m.server.knock` surely has to be sent via a resident server so that it can be received | ||
| by all servers within the room. | ||
|  | ||
| ## Alternatives | ||
|  | ||
| ## Security considerations | ||
|  | ||
| ## Unstable prefix | ||
|  | ||
| ## Dependencies | ||
|  | ||
| None. | ||
|  | ||
| [^spec-discussion]: This was derived from the following spec discussion: https://matrix.to/#/!NasysSDfxKxZBzJJoE:matrix.org/$0pv9JVVKzuRE6mVBUGQMq44vNTZ1-l19yFcKgqt8Zl8?via=matrix.org&via=envs.net&via=element.io | ||
|  | ||
| [^knocking]: Although knocking is implemented with the auth event `m.room.member`, we don't want joining | ||
| servers to be able to send any event to the room at all (other than the `m.server.knock` EDU). | ||
|  | ||
| [^tooling-for-accepting]: Though now I say this, we probably need to be able to demonstrate that | ||
| this will be the case. A lot of this is now looking obvious, why weren't we thinking about this | ||
| years ago? Well, there's a lot of context. There always is buddy, you've got the easy view of hindsight. | ||
| Someone had to both conceive and write this and get us out of the dark ages. This MSC looks poorer | ||
| in my eyes by the minute. | ||
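
A non-normative sketch may help when reading "Considerations for event authorization" together with "Permitting, then denying a malicious server". It assumes room state is reduced to a mapping from server name to the content of that server's `m.server.participation` event. The names `Participation`, `origin_is_permitted` and `check_event` are hypothetical and do not come from the MSC or from any homeserver implementation. The sketch only shows why an event built on stale `permitted` state ends up soft failed rather than rejected.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Participation:
    """Content of a hypothetical `m.server.participation` state event."""
    server_name: str               # the event's state_key
    participation: str             # "permitted" or "denied"
    reason: Optional[str] = None   # optional explanation shown to a denied server


def origin_is_permitted(origin: str, state: Mapping[str, Participation]) -> bool:
    """A server may only send PDUs if it is explicitly named and permitted."""
    entry = state.get(origin)
    return entry is not None and entry.participation == "permitted"


def check_event(
    origin: str,
    auth_state: Mapping[str, Participation],
    current_state: Mapping[str, Participation],
) -> str:
    """Classify an incoming PDU by its origin server.

    auth_state is the state the event itself references;
    current_state is the receiving server's view of the room now.
    """
    if not origin_is_permitted(origin, auth_state):
        # Never named, or denied, in the state the event cites: not authorized.
        return "reject"
    if not origin_is_permitted(origin, current_state):
        # Authorized against stale state only: the problematic case.
        return "soft_fail"
    return "accept"
```

With `a.example` permitted in the referenced state but denied in the current state, `check_event` returns `"soft_fail"`, which is the situation the potential issue describes. A server that was never named gets `"reject"`.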
I'm a bit unclear on what the actual problem being solved is here. I agree that because servers can spoof user events, you could model membership of DAGs via servers rather than users. However, shouldn't we be moving towards membership of rooms being tied more tightly to actual users & devices (i.e. cryptographically constrained group membership) rather than giving up and modelling participation just by servers?
In terms of the merits here:
Do we not get this via ACLs?
Hm, so this would let an AS on a server sniff traffic and somehow authorise it before it reaches other users in the room on that server?
I'm failing to follow how this works. Is the idea that joins can be delegated to a client on a target server so the client can decide whether the joiner server is allowed to join or not?
So is the idea here: "Provide a way for clients to subscribe to all events attempting to federate with the server, and authorise them before they enter the room DAG?". If so, I wonder if a less invasive mechanism could be used - effectively a standardised API to inspect events before federation rules kick in, rather than changing the entire concept of membership?
(This also reminds me a bit of pseudousers in #1777 and whatever travis is up to in #4049, in terms of letting servers act as a first-class citizen rather than requiring traffic to be linked to users. Historically this has not gone anywhere, though: the accountability of actually linking traffic to users rather than servers seems desirable).
I do not think this MSC is incompatible with cryptographically constraining user membership, since this MSC can still be used in a DAG that has user membership in addition to server participation. There is another conversation to be had over whether Matrix should be going in this direction, but it is not relevant to this MSC; that discussion is more relevant to #3953. So as a digression, while I'm not prepared to argue in the opposite direction right now, I would at least like to explore the alternative, even if it is opposed to the goal of eliminating metadata etc. For large public Matrix rooms it's not exactly clear to me yet how device-centric membership, pseudo-identity etc. is going to protect against data farming/mining by joining lots of public rooms, at least not in isolation and when most people are going to have the same profile information in all public rooms. For this to be effective, the way people are using Matrix would also have to change. From what I have seen so far, with these proposals you are designing a protocol that suits a use case that is entirely different to the way Matrix is being used by the community today. Again, I see this as a digression. I concede that I could be grossly misinformed about this, and I'm yet to develop my thoughts more concretely here. This is not the purpose of the MSC.
No, and you know this as well as I do. Server ACL has been accidentally bypassed by server implementations not implementing it properly before, and with wider adoption of Matrix this will happen again. Additionally, there is no protection against a server deliberately leaking events between normal and malicious servers (in both directions), and no tooling to detect these leaks. There is also a limit to the number of servers that can be added to server ACL (around 512?), it really needs replacing. It requires all participating servers to be using the room in good faith, that is not going to be a reality forever.
No, this was specific to sniffing the `knock` EDU during the revised server join handshake.
Yes, the `knock` EDU is used as a way to get the clients to see that a server is wanting to join the room, granted there could be another better way to do this.
No, again, the proposal is to make servers send an EDU to the room when they want to join, so that clients (and therefore admins, or at least their tooling) can vet the servers before allowing them to join and be authorized to send any events. Once they have been vetted, all authorization is done via auth rules. The idea is that there isn't a way in the auth rules for a joining server to create a valid PDU until a special event `m.server.participation` exists that references their server name. And this event has to be created by a room admin once they know the joining server exists. They can also pre-empt the existence of joining servers and set up `m.server.participation` events for them in advance. I guess now that I have explained this as such, it does look a lot like an allow list, but with a mechanism for it to be automatically updated quickly enough not to cause too much disruption.
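
To make the vetting flow in this reply more concrete, here is a rough sketch of what an admin's tooling might do when it sees a knock. It assumes the tool keeps its own trusted and blocked server lists, and it treats repeated knocks as idempotent until a receiver-chosen expiry, as the MSC text suggests. `KnockTracker`, `decide`, the list names and the `reason` string are all hypothetical. This is not how Draupnir or any existing bot works.

```python
import time
from typing import Callable, Dict, Optional, Set, Tuple


class KnockTracker:
    """Remembers recent knocks so that duplicates have no further effect."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds      # expiry is subjective to the receiver
        self._clock = clock
        self._expiry: Dict[Tuple[str, str], float] = {}

    def record(self, room_id: str, server: str) -> bool:
        """Return True if this knock is new, or the previous one has expired."""
        now = self._clock()
        key = (room_id, server)
        expires = self._expiry.get(key)
        if expires is not None and expires > now:
            return False             # duplicate knock inside the window
        self._expiry[key] = now + self._ttl
        return True


def decide(server: str, trusted: Set[str], blocked: Set[str]) -> Optional[dict]:
    """Return content for an `m.server.participation` event, or None
    when the knock should be left for a human admin to review."""
    if server in blocked:
        return {"participation": "denied", "reason": "listed in local policy"}
    if server in trusted:
        return {"participation": "permitted"}
    return None
```

A tool built this way would send the returned content as a state event whose `state_key` is the knocking server's name. Servers already on the trusted list could be permitted in advance, before they ever knock.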
I've updated the introduction to the proposal since I think you read that and it misled you. It was probably too abstract, my bad.
(Those proposals are interesting though, I'll give them a look)
The specification does not recommend that servers must (or even should) soft fail events from servers matching `m.room.server_acl`'s `deny`. So you can't even rely on leaks to be mitigated that way. The specification likely can't make that recommendation either, because you'd have to think about what that would imply for a new server joining the room that goes on to replicate the room's history afresh.