Commit 4ee990e

Merge pull request #2758 from matrix-org/rav/proposals/textual_identifier_grammar
MSC2758: Proposal for a common identifier grammar
2 parents deaa82c + 49ce93f commit 4ee990e

1 file changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# MSC2758: Common grammar for textual identifiers

The Matrix specification uses textual identifiers for a wide range of
concepts. Examples include "event types" and "room versions".

In the past, these identifiers have often lacked a formal grammar, leaving
servers and clients to make assumptions about questions such as which
characters are permitted, minimum and maximum lengths, etc.

This proposal suggests a common grammar which can be used as a basis for
*future* identifier types, to reduce the effort involved in future
specification work.

No attempt is made here to bring existing identifiers into line; however,
examples of identifiers which might have benefitted from such a grammar in the
past include:

* [`capabilities`](https://matrix.org/docs/spec/client_server/r0.6.0#get-matrix-client-r0-capabilities)
  identifiers.
* authentication types for the [User-Interactive Authentication mechanism](https://matrix.org/docs/spec/client_server/r0.6.0#user-interactive-authentication-api).
* login types for [`/_matrix/client/r0/login`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-login).
* event types
* [`m.room.message` `msgtypes`](https://matrix.org/docs/spec/client_server/r0.6.0#m-room-message-msgtypes)
* `app_id` for [`POST /_matrix/client/r0/pushers/set`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-pushers-set).
* `rule_ids`, `actions` and `tweaks` for [push rules](https://matrix.org/docs/spec/client_server/r0.6.0#push-rules).
* [E2E messaging algorithm names](https://matrix.org/docs/spec/client_server/r0.6.0#messaging-algorithm-names).

28+
## Proposal
29+
30+
We define a "common namespaced identifier grammar". This can then be referenced
31+
by other parts of the grammar, in much the same way as [Unpadded
32+
Base64](https://matrix.org/docs/spec/appendices#unpadded-base64) is defined
33+
today.
34+
35+
The grammar is defined as follows:
36+
37+
* An identifier may not be less than one character or more than 255 characters
38+
in length.
39+
* Identifiers must start with one of the characters `[a-z]`, and be entirely
40+
composed of the characters `[a-z]`, `[0-9]`, `-`, `_` and `.`.
41+
* Identifiers starting with the characters `m.` are reserved for use by the
42+
formal matrix specification.
43+
* Implementations wishing to implement unspecified identifiers should follow
44+
the Java Package Naming convention of starting with a reversed domain
45+
name (with a dot after the domain name part). For example, for the
46+
organisation `example.com`, a valid identifier would be
47+
`com.example.identifier`.
48+
49+
This grammar is intended for use entirely by internal identifiers, and *not*
50+
for user-visible strings.
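The rules above are small enough to check mechanically. The following is a minimal Python sketch of such a check, not part of the proposal: the names `MAX_LENGTH`, `is_valid_identifier` and `is_reserved_for_spec` are hypothetical, and the sketch assumes the length limit is counted in characters (which, for the ASCII-only alphabet above, equals the length in bytes). It does not attempt to verify the reversed-domain convention, which the proposal only recommends.

```python
import re

# Hypothetical names for illustration; the proposal defines the rules, not an API.
MAX_LENGTH = 255

# First character must be [a-z]; the rest may be [a-z], [0-9], "-", "_" or ".".
_IDENTIFIER_RE = re.compile(r"[a-z][a-z0-9._-]*")


def is_valid_identifier(identifier: str) -> bool:
    """Return True if the string satisfies the length and character rules."""
    if not 1 <= len(identifier) <= MAX_LENGTH:
        return False
    # fullmatch requires the whole string to match, so a trailing newline
    # or any other stray character is rejected.
    return _IDENTIFIER_RE.fullmatch(identifier) is not None


def is_reserved_for_spec(identifier: str) -> bool:
    """Return True if the identifier is in the namespace reserved by `m.`."""
    return identifier.startswith("m.")


print(is_valid_identifier("com.example.identifier"))  # True
print(is_valid_identifier("Com.Example.Identifier"))  # False (upper-case)
print(is_valid_identifier("1abc"))                    # False (starts with a digit)
print(is_reserved_for_spec("m.room.message"))         # True
```

Keeping the reserved-prefix check separate from the validity check reflects that an `m.` identifier is syntactically valid; it is merely off-limits to implementations outside the specification.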
### Rationale

* Avoiding non-ASCII characters sidesteps any issues with homoglyphs or
  alternative encodings of the same characters.
* Avoiding upper-case characters sidesteps any concerns over case-sensitivity.
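
To make these concerns concrete, here is a short Python sketch (an illustration only, with example strings and a hypothetical helper `is_lowercase_ascii` chosen for this purpose) showing how strings that look the same, or differ only in case, compare as unequal.

```python
import unicodedata

# Two strings that render almost identically but use different code points.
latin = "a.example"          # starts with U+0061 LATIN SMALL LETTER A
cyrillic = "\u0430.example"  # starts with U+0430 CYRILLIC SMALL LETTER A

# Two encodings of the same visible character "é".
composed = "\u00e9"                                  # one code point
decomposed = unicodedata.normalize("NFD", composed)  # "e" followed by U+0301


def is_lowercase_ascii(identifier: str) -> bool:
    """Hypothetical helper: True if the string is ASCII with no upper-case letters."""
    return identifier.isascii() and identifier == identifier.lower()


print(latin == cyrillic)                     # False: homoglyphs are distinct strings
print(composed == decomposed)                # False: same glyph, different encodings
print("M.Room.Message" == "m.room.message")  # False: comparison is case-sensitive
print(is_lowercase_ascii(latin))             # True
print(is_lowercase_ascii(cyrillic))          # False
```

Restricting identifiers to lower-case ASCII means none of these ambiguous pairs can both be valid, so a plain byte-for-byte comparison is sufficient.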
