MSC4421: Standardize the spec on US English #4421
Johennes wants to merge 1 commit into matrix-org:main from
Signed-off-by: Johannes Marbach <n0-0ne+github@mailbox.org>
b4e32a2 to cbbe0ed
There was a problem hiding this comment.
Implementation requirements:
- Foundation (broader than SCT) review.
| @@ -0,0 +1,93 @@
| # MSC4421: Standardize the spec on US English 🇺🇸
There was a problem hiding this comment.
I'm weakly against this change.
From what I can see, the rationale is:
- searchability
- how we relate Matrix to other technical standards
- consistency
Consistency could go either way so isn't a convincing argument. Searchability is a weird one: I personally do not CTRL+F "authorisation server" then get sad that it's actually "authorization server". How we relate Matrix to other technical standards would be a compelling argument were it not for the later comparison that ISO is British English and RFCs are either. ANSI and IEEE would obviously be American English because they were founded in the States. W3C is a weird one given Berners-Lee and CERN, but it was established... in the States.
Matrix was notably not established in the States, so it's not unreasonable for it not to follow American English. The spec's recommendation of British English pretty much settles it for me; if anything, we should be enforcing it more strongly. Enforcement itself isn't an argument since, after all, even if we did follow American English, we would still need to strongly enforce it for consistency.
All considered, there isn't enough here to overcome inertia imo.
There was a problem hiding this comment.
ANSI and IEEE would obviously be American English because they were founded in the States. W3C is a weird one given Berners-Lee and CERN, but it was established... in the States.
Matrix was notably not established in the States, so it's not unreasonable for it not to follow American English.
I hadn't looked at it this way before but if you think of the choice as an extension of heritage, that actually makes for a decent argument. This thing was invented in the UK. Therefore, it uses British English. Period.
The spec's recommendation of British English pretty much settles it for me, if anything we should be more strongly enforcing it.
The haziness is that the spec recommends it but we're actually doing the opposite in practice. There was a longer discussion in the Matrix Spec & Docs Authoring room when the OAuth APIs were introduced which ended with an en-US momentum which, at least in my experience, has then semi-officially been applied in spec PRs.
There was a problem hiding this comment.
The haziness is that the spec recommends it but we're actually doing the opposite in practice.
...which is why we need to enforce it. :)
A slight tangent: enforcement is unfortunately bureaucratic. I would strongly oppose having MSC authors or reviewers manually check for en-GB compliance because quite frankly, it's not a good use of human time imo. I'd much rather we enforced other things which can have a material impact on the protocol. When MSCs get converted into Spec prose, that seems like a good time to get out the en-GB spell checker and have a checklist item for conformity: this is still bureaucratic but it affects the spec writer rather than everyone.
There was a problem hiding this comment.
...which is why we need to enforce it. :)
I'd be totally fine with that option, too. My goal here is to force a decision to settle the current confusion and Americanization just seemed like the most likely outcome based on the previous chats.
Assuming we FCP-close this proposal and (actually) stick to British English, we should reiterate the house rules to explicitly exclude non-localizable terms and identifiers inherited from other standards. I think that would qualify as a clarification and shouldn't require an MSC itself.
A slight tangent: enforcement is unfortunately bureaucratic. I would strongly oppose having MSC authors or reviewers manually check for en-GB compliance because quite frankly, it's not a good use of human time imo. I'd much rather we enforced other things which can have a material impact on the protocol. When MSCs get converted into Spec prose, that seems like a good time to get out the en-GB spell checker and have a checklist item for conformity: this is still bureaucratic but it affects the spec writer rather than everyone.
Yes, agreed. I'm only concerned with the spec text here. Not proposals.
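As a rough illustration of excluding non-localizable terms from an en-GB check of spec text, the sketch below flags US spellings in prose while skipping inline code and an allowlist of inherited identifiers. The word list, the allowlist, and the function name `find_us_spellings` are hypothetical, and it assumes identifiers are written in backticks.

```python
import re

# Hypothetical word list: US spelling -> preferred British spelling.
US_TO_GB = {
    "authorization": "authorisation",
    "authorize": "authorise",
    "standardize": "standardise",
}

# Hypothetical allowlist of terms inherited from other standards
# (case-sensitive), e.g. the HTTP "Authorization" header name.
INHERITED_TERMS = {"Authorization"}

INLINE_CODE = re.compile(r"`[^`]*`")
WORD = re.compile(r"[A-Za-z_]+")


def find_us_spellings(line):
    """Return (word, suggestion) pairs for US spellings found in prose."""
    # Blank out inline code so identifiers such as
    # `authorization_endpoint` are never flagged.
    prose = INLINE_CODE.sub(" ", line)
    findings = []
    for match in WORD.finditer(prose):
        word = match.group(0)
        if word in INHERITED_TERMS:
            continue
        suggestion = US_TO_GB.get(word.lower())
        if suggestion is not None:
            findings.append((word, suggestion))
    return findings
```

A case-sensitive allowlist keeps the example short, but it would also skip a sentence that merely starts with "Authorization", so a real checker would likely need something more careful.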
There was a problem hiding this comment.
I'm aligned with kegan's position here. The use of UK English reflects the language of those who wrote the spec in the first place, and I don't see enough of a reason to change that; instead we should improve the consistency - at least in the spec itself.
There was a problem hiding this comment.
All considered, there isn't enough here to overcome inertia imo.
The main argument for US English is that the spec currently has 54 instances of authorise (including authorisation, and similar), vs 165 instances of authorize, so I think the inertia is in favour of this MSC.
Much of the problem here is that we are naturally constrained by other specifications, notably OAuth2. We have to talk about concepts like an "authorization server", which is a defined concept in OAuth2. If we were writing in, say, German, then (I gather from native German speakers) we'd probably still call it an "authorization server" rather than ein "Autorisierungsserver" or something, so by extension we should probably do the same even if the body of the doc is en_GB. And of course the authorization_endpoint identifier is cast in stone because it's defined by RFC8414.
So we get into this whole question of where exactly we draw the line, which makes authoring and reviewing tricky, and it just goes away if we settle on en_US across the board.
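The tally above can be approximated with a short script. This is a minimal sketch, not the method used to produce the 54 vs 165 figures: the function name `count_variants` and the choice to count every occurrence of `authoris`/`authoriz` (including inside identifiers such as `authorization_endpoint`) are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches the stem of authorise/authorize and derived forms
# (authorisation, unauthorized, authorization_endpoint, ...).
PATTERN = re.compile(r"authori([sz])", re.IGNORECASE)


def count_variants(text):
    """Count British ("s") and American ("z") spellings of authori[sz]e."""
    counts = Counter()
    for match in PATTERN.finditer(text):
        variant = "en_GB" if match.group(1).lower() == "s" else "en_US"
        counts[variant] += 1
    return counts
```

Because identifiers inherited from other standards are counted too, a tally like this says nothing about how many of the American spellings could actually be changed.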
There was a problem hiding this comment.
[...] and it just goes away if we settle on en_US across the board.
Maybe the question is whether we expect this to stay true in future. If we switch to en_US but then integrate with another standard that uses en_GB, we'll be back to the same problem. I cannot say how likely that is. It looks like the most probable cause would be an RFC that uses en_GB. There seem to be fairly few such RFCs around though. From a quick search I've only found RFC1484, RFC1781 and RFC2076 – all of which use "organisation". So maybe this is not very likely to happen after all.
There was a problem hiding this comment.
Yeah, for better or worse the majority of specs seem to be in en_US
There was a problem hiding this comment.
I've now come across https://auth0.com/fr/intro-to-iam/what-is-oauth-2. If they can talk about "Serveur d'autorisation" (for authorization server) and "attribution de code d'autorisation" (for authorization code grant), then I guess there's no reason we can't spell those terms with an s.
| Matrix has a huge center of mass in Europe. In a time of transatlantic tension, committing to the
| American spelling might feel uncomfortable to some. Language and politics should not be conflated,
| however.
There was a problem hiding this comment.
"should not" and yet it is conflated, you can't avoid that. There's been plenty of high profile cases in the tech world:
- master/slave => leader/follower
- blacklist => blocklist
- master branch => main branch
All of these changes had to overcome inertia in order to happen. I wouldn't dismiss the impact of politics on choice of language, especially when there isn't a compelling reason to fall into one or the other.
| We could enforce the British spelling in spec text and identifiers that are not inherited from other
| standards. To aid searchability, a legend of common words that differ in spelling could be included
| at the bottom of each page.
There was a problem hiding this comment.
What do RFCs do, as it seems like they would hit this the most due to allowing both?
There was a problem hiding this comment.
I think nothing. Searchability might not be as big of a problem for them given that RFCs get their own pages and search engines appear to be smart enough to even out the spelling differences.
Edit: apparently I was particularly cranky yesterday when I first wrote this comment. Now updated to say what I meant to say. As a side-bar: we tend to discuss this sort of thing in the #matrix-docs:matrix.org room. We'd welcome voices from people with opinions on things like the grammar in the spec to help us stay aligned!
There was a problem hiding this comment.
For interest/reference, I created a PR bringing the spec into line with the current documentation style (i.e. en_GB), as far as the word "authori[zs]ation" goes: matrix-org/matrix-spec#2351
There was a problem hiding this comment.
There was a discussion in the internal Spec Core Team room about this MSC.
@richvdh was initially concerned about referencing terms from other specs with slightly different spelling (OAuth spec defines "authorization server", "authorization grant", etc.). But that concern abated after reading https://auth0.com/fr/intro-to-iam/what-is-oauth-2, which just translates the terms to French (authorization code grant -> attribution de code d'autorisation). This appears to be fine in practice.
@anoadragon453 initially said that they were indifferent on British English vs. US English being used for prose, but then conceded that US English would be better from a technical standpoint, as most other internet-defining specs are written in, or default to, US English (IETF RFCs, WHATWG, W3C, Khronos, etc.). So to avoid time-wasting footguns in the future, that was likely the easiest to work with for the Matrix spec as well.