Commit fba345e

feat: add MiniMessage language specification
1 parent 9a08270 commit fba345e

5 files changed

+434
-21
lines changed

astro.config.ts

Lines changed: 1 addition & 0 deletions
@@ -396,6 +396,7 @@ export default defineConfig({
396396
"adventure/minimessage/api",
397397
"adventure/minimessage/dynamic-replacements",
398398
"adventure/minimessage/translator",
399+
"adventure/minimessage/specification",
399400
],
400401
},
401402
"adventure/serializer/ansi",

ec.config.mjs

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
11
import { pluginCollapsibleSections } from "@expressive-code/plugin-collapsible-sections";
22
import { pluginLineNumbers } from "@expressive-code/plugin-line-numbers";
3+
import backusNaurHighlight from "./src/utils/shiki/bnf.tmLanguage.json" with { type: "json" };
34
import miniMessageHighlight from "./src/utils/shiki/mm.tmLanguage.json" with { type: "json" };
45

56
/** @type {import('@astrojs/starlight/expressive-code').StarlightExpressiveCodeOptions} */
@@ -14,6 +15,6 @@ export default {
1415
},
1516
emitExternalStylesheet: false,
1617
shiki: {
17-
langs: [miniMessageHighlight],
18+
langs: [miniMessageHighlight, backusNaurHighlight],
1819
},
1920
};
Lines changed: 289 additions & 0 deletions
@@ -0,0 +1,289 @@
1+
---
2+
title: Language Specification
3+
slug: adventure/minimessage/specification
4+
description: A developer-facing specification of the MiniMessage format.
5+
tableOfContents:
6+
minHeadingLevel: 2
7+
maxHeadingLevel: 5
8+
---
9+
10+
This document outlines the MiniMessage format in detail to aid developers who wish to implement their own MiniMessage
11+
parser from scratch or understand the internal processes happening during the parsing of MiniMessage-formatted strings.
12+
13+
The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”,
14+
“MAY”, and “OPTIONAL” in this document are to be interpreted as described in
15+
[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.html).
16+
17+
## The MiniMessage language
18+
19+
The MiniMessage language is a markup format used for representing Minecraft's component-based text
20+
system in a human-readable and modifiable way. Broadly speaking, the language consists of two types
21+
of tokens: **plain text** and **tags**.
22+
23+
Plain text is any string. This string is UTF-16 compatible. The following is an example of a valid
24+
plain text part of a MiniMessage-formatted string:
25+
26+
```mm
27+
The MiniMessage format was made to be as simple as possible.
28+
Emojis are allowed 😅. So are Japanese characters, like 紙.
29+
```
30+
31+
MiniMessage tags are primarily used for adding markup information to plain text parts. They can, however,
32+
also add entirely new content into the serialized component. How a tag is resolved makes no
33+
difference to the MiniMessage lexer. A tag has the following structure:
34+
35+
```mm
36+
<tagname flags named_argument=value :a sequenced argument:another one>
37+
```
38+
39+
A tag consists of the following parts:
40+
41+
- `< >`: All tags are surrounded by less than and more than symbols.
42+
- `tagname`: Every tag starts with the name. The name follows the list of characters mentioned as allowed
43+
in the [misc/identifiers](#identifiers) section.
44+
- Tags can have arguments. There are two types of arguments: named and sequenced. Named arguments
45+
are, as the name implies, named in some way. Sequenced arguments do not have a name; instead, they are a simple list
46+
of string values. [Tag argument documentation can be found later in the page](#tag-arguments).
47+
48+
### Tag syntax
49+
50+
MiniMessage tags can surround text.
51+
52+
```mm
53+
<tagname>Inner text</tagname> and outer text.
54+
```
55+
56+
Tags can be closed by repeating the tag, with a slash in front of the name. Tags are closed implicitly
57+
when the end of the string is reached. Furthermore, tags can be nested:
58+
59+
```mm
60+
<first_tag>Some text <second_tag>even more text</second_tag>, and that's really it!
61+
```
62+
63+
Nested tags are closed implicitly when the outer tag is closed.
64+
65+
```mm
66+
This text is unmarked <outer_tag>marked, <inner_tag>inner</outer_tag>, and again no longer marked.
67+
```
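
The implicit closing rule above can be pictured as a stack of open tags. The following Python sketch is only an illustration: `close_tag` is a hypothetical helper, and ignoring a closing tag that matches no open tag is an assumption that this page does not specify.

```python
def close_tag(open_tags: list[str], name: str) -> list[str]:
    """Close `name` and every tag nested inside it.

    `open_tags` is the stack of currently open tag names (outermost first).
    Returns the names that were closed, innermost first.
    """
    if name not in open_tags:
        # Assumption: a closing tag without a matching open tag changes nothing.
        return []
    closed = []
    while open_tags:
        top = open_tags.pop()
        closed.append(top)
        if top == name:
            break
    return closed
```

Closing `outer_tag` while `inner_tag` is still open pops both entries, which mirrors the example above where the text after `</outer_tag>` is no longer marked.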
68+
69+
If a tag has arguments, these must not be repeated on the closing tag.
70+
71+
```mm
72+
<tagname:with an argument>Some text</tagname>
73+
```
74+
75+
Lastly, normal tags can be closed instantly by prepending a `/` to the more-than symbol of an opening tag.
76+
77+
```mm
78+
This tag is auto-closed: <tagname/>
79+
```
80+
81+
## Tag arguments
82+
83+
Arguments are placed between the tag name and the closing more-than symbol.
84+
85+
```mm
86+
<tagname [arguments here]>
87+
```
88+
89+
### Named argument types
90+
91+
Before each named argument, a piece of [whitespace](#whitespace) must be present.
92+
93+
There are two types of named arguments: value-based arguments and flag arguments.
94+
95+
#### Flag argument type
96+
97+
Flags may be preceded by a single exclamation mark `!` and must follow the rules set by [identifiers](#identifiers).
98+
99+
```mm
100+
<tagname some_flag another_flag_4_you !inverted_flag>
101+
```
102+
103+
The following shows a tag with invalid flags:
104+
105+
```mm
106+
<tagname SomeCoolFlag !!double_inverted what-even-is-happening-here?>
107+
```
108+
109+
#### Valued argument type
110+
111+
Named arguments with a value consist of an identifier, an equal symbol `=`, and a value.
112+
113+
The identifier follows the rules as explained in the [misc/identifiers](#identifiers) section of this page.
114+
The value may consist of any UTF-16 characters, but must not contain any whitespace, unless explicitly quoted.
115+
Please refer to [misc/quoting](#quoting) for any specifics.
116+
117+
Here is an example of valid valued named arguments:
118+
119+
```mm
120+
<birdtag bird=parrot color='red and blue'>
121+
```
122+
123+
And an example of invalid valued named arguments:
124+
125+
```mm
126+
<birbtag vöglein=papagei Colour=red and blue>
127+
```
128+
129+
:::note
130+
131+
The above tag, assuming the identifiers were valid, would actually parse both `and` and `blue` as flags.
132+
133+
:::
134+
135+
#### Combining flags and values
136+
137+
These two named types can be combined in any way.
138+
139+
```mm
140+
<combined_tag aflag some=value !inverted_flag really=yeah!>
141+
```
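
As a rough illustration of how the two named argument types could be told apart, here is a minimal Python sketch. It assumes the tag has already been split into whitespace-separated tokens, which glosses over quoted values that contain whitespace. The helper name `classify_named_argument` and its return shape are hypothetical and not part of any MiniMessage API.

```python
import re

# Identifier rule from the Identifiers section: lowercase ASCII letters, digits and "_".
IDENTIFIER = re.compile(r"[a-z0-9_]+")


def classify_named_argument(token: str):
    """Classify a single named argument token.

    Returns ("value", name, raw_value) for valued arguments,
    ("flag", name, inverted) for flags, or None if the token is invalid.
    """
    if "=" in token:
        # Only the first "=" separates the name from the value.
        name, raw_value = token.split("=", 1)
        if IDENTIFIER.fullmatch(name):
            return ("value", name, raw_value)  # raw_value keeps any surrounding quotes
        return None
    # A flag may carry exactly one leading "!".
    inverted = token.startswith("!")
    name = token[1:] if inverted else token
    if IDENTIFIER.fullmatch(name):
        return ("flag", name, inverted)
    return None
```

Applied to the tokens of the tag above, `aflag` and `!inverted_flag` come out as flags, while `some=value` and `really=yeah!` come out as valued arguments.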
142+
143+
### Sequential arguments
144+
145+
Sequential arguments are declared at the end of the tag. Each sequential argument starts with a colon `:`.
146+
Unless named arguments are present, a whitespace before the first colon `:` is not necessary.
147+
148+
Sequential arguments may contain any UTF-16 characters. Any instances of `<`, `>`, or `:` characters
149+
must either be escaped (see [misc/escaping](#escaping)) or the argument must be wrapped in quotes
150+
(see [misc/quoting](#quoting)).
151+
152+
The following are valid MiniMessage tags with sequential arguments:
153+
154+
```mm
155+
<simpletag:hey there>
156+
157+
<another:first argument:second argument>
158+
159+
<with_whitespace :this is perfectly fine>
160+
161+
<nested_mm:\<some_cool_tag\> and a \: colon!>
162+
163+
<nested_mm:"<some_cool_tag> and a : colon, but it's quoted!">
164+
```
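
One possible way to split the sequential part of a tag into its raw values is sketched below in Python. It is a sketch under assumptions: the input is the text after the first colon and before the closing `>`, escape sequences are kept as written, a quote closes only when it is the last character of its argument, and unterminated quotes are not reported. Both helper names are hypothetical.

```python
def is_escaped(text: str, index: int) -> bool:
    """True if the character at `index` is preceded by an odd number of backslashes."""
    backslashes = 0
    i = index - 1
    while i >= 0 and text[i] == "\\":
        backslashes += 1
        i -= 1
    return backslashes % 2 == 1


def split_sequential(arguments: str) -> list[str]:
    """Split on colons that are neither escaped nor inside a quoted value."""
    values: list[str] = []
    current: list[str] = []
    quote = None  # the opening quote character while inside a quoted value
    for i, ch in enumerate(arguments):
        if quote is None and not current and ch in "'\"":
            # A value counts as quoted if its first character is a quote.
            quote = ch
            current.append(ch)
        elif quote is not None:
            current.append(ch)
            following = arguments[i + 1] if i + 1 < len(arguments) else None
            # The quote closes only at the end of the argument.
            if ch == quote and following in (None, ":"):
                quote = None
        elif ch == ":" and not is_escaped(arguments, i):
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
    values.append("".join(current))
    return values
```

With this sketch, `first argument:second argument` yields two values, while the escaped and the quoted `nested_mm` examples above each yield a single value.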
165+
166+
### Combining argument types
167+
168+
Named and sequential arguments can be used together. The general syntax looks as follows:
169+
170+
```mm
171+
<tagname[ named_arguments ][:sequenced:arguments]>
172+
```
173+
174+
All named arguments must be located between the tag name and the first non-value colon.
175+
176+
A few examples for valid tags making use of both named and sequenced arguments:
177+
178+
```mm
179+
<combined coolness=true flags :and sequenced args>
180+
181+
<combined flags !over more_flags and even !more flagss :I'd call this cool:Would you?:Yeah for sure>
182+
183+
<combined tic=tac :time's up!>
184+
```
185+
186+
## Misc
187+
188+
This section defines miscellaneous behavior of common parts.
189+
190+
### Identifiers
191+
192+
All identifiers must be lowercased and contain only alphanumerical characters or `_`. All identifiers
193+
used as named argument names should be unique.
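
The identifier rules can be checked with a small amount of code. The Python sketch below assumes that "alphanumerical" means the ASCII characters `a` to `z` and `0` to `9`, as listed in the formal grammar on this page. The function names are hypothetical.

```python
import re

# Assumption: identifiers are one or more of a-z, 0-9 and "_", matching the formal grammar.
IDENTIFIER = re.compile(r"[a-z0-9_]+")


def is_identifier(text: str) -> bool:
    """True if `text` is a non-empty, lowercase identifier."""
    return IDENTIFIER.fullmatch(text) is not None


def duplicate_names(names: list[str]) -> set[str]:
    """Return the named argument names that occur more than once."""
    seen: set[str] = set()
    duplicates: set[str] = set()
    for name in names:
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return duplicates
```

Since uniqueness is only a SHOULD-level rule here, an implementation might use `duplicate_names` to warn rather than to reject a tag.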
194+
195+
### Quoting
196+
Argument values can be quoted. A value counts as quoted if the first character is a `'` or `"`. The quoted
197+
value ends as soon as another unquoted quote of the same character as the starting quote is found at the
198+
end of an argument.
199+
200+
Between the opening and the closing quote, any UTF-16 characters may be present. This also includes the same
201+
quote as used for the string. The following would be a valid tag:
202+
203+
```mm
204+
<tag:"double quoted", yet contains a double quote?">
205+
```
206+
207+
This is because the `"` in the middle is **not the last character of the value**. Therefore, it is read
208+
literally, since the tag would otherwise be invalid.
209+
210+
:::tip
211+
212+
As long as the quote is not closed, the lexer must continue reading characters. If the end of the
213+
input is reached before a closing quote is found, the tag and any following characters should be
214+
read as plain text, as the tag is never closed. This is to aid users in finding the error in their syntax.
215+
216+
:::
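
To illustrate the quoting rule, the following Python sketch strips the quotes from a single, complete argument value. It assumes the lexer has already isolated the argument and leaves escape handling aside. `unquote` is a hypothetical name.

```python
def unquote(argument: str) -> str:
    """Return the content of a quoted argument, or the argument unchanged.

    The argument counts as quoted only if it starts with ' or " and its
    last character is the same quote character.
    """
    if len(argument) >= 2 and argument[0] in ("'", '"') and argument[-1] == argument[0]:
        return argument[1:-1]
    return argument
```

Because only the first and last characters are inspected, a quote in the middle of the value is kept literally, as in the example tag above.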
217+
218+
### Whitespace
219+
220+
A whitespace character may be a classical space `\s`, a tab character `\t`,
221+
a newline `\n`, or a carriage return `\r`.
222+
223+
### Escaping
224+
225+
In MiniMessage, certain symbols that a lexer would otherwise interpret specially may be preceded by a backslash `\`
226+
to instead be included literally. This includes backslash `\` characters, if they would have any effect on the next
227+
symbol. If a backslash character has no effect, it is included literally.
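
A lexer can decide whether a symbol is escaped by counting the backslashes directly in front of it: an odd count means the symbol is escaped, an even count means the backslashes pair up and the symbol keeps its special meaning. The Python sketch below shows that check. `is_escaped` is a hypothetical helper name.

```python
def is_escaped(text: str, index: int) -> bool:
    """True if the character at `index` is preceded by an odd number of backslashes."""
    backslashes = 0
    i = index - 1
    # Walk backwards over the run of backslashes directly before the character.
    while i >= 0 and text[i] == "\\":
        backslashes += 1
        i -= 1
    return backslashes % 2 == 1
```

For instance, the `<` in `\<tag` is escaped, whereas in `\\<tag` the first backslash escapes the second and the `<` opens a tag as usual.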
228+
229+
## Formal grammar
230+
231+
This segment declares the formal grammar (in a flavor of the Backus-Naur form) which specifies the MiniMessage language.
232+
233+
The specific flavor used here changes that non-terminal symbols are no longer enclosed in angle brackets `<>`
234+
and the `::=` meta symbol is replaced by `→`. Curly brackets `{}` declare optional parts. Lastly, a `+` suffix
235+
declares that a symbol should appear at least once, but may appear more often, whilst a `*` suffix declares that
236+
a symbol may appear any number of times, including not at all.
237+
238+
```bnf
239+
; Important notes regarding this specific grammar: due to the massive number of characters included
240+
; in the UTF-16 character set, some special non-terminal symbols have been added:
241+
;
242+
; utf-16-char → includes all UTF-16 characters.
243+
;
244+
; utf-16-char-no-whitespace → includes all UTF-16 characters except for spaces (\s), tabs (\t), newlines (\n)
245+
; and carriage returns (\r).
246+
;
247+
; utf-16-char-no-angle-or-colon → includes all UTF-16 characters except for the
248+
; angle-bracket characters (<>) and colon (:). However
249+
; those characters are valid if an uneven number of backslash
250+
; characters is located in front of them.
251+
252+
minimessage → string {tag string}
253+
254+
string → utf-16-char*
255+
256+
tag → "<" tag-name tag-arguments "/>"
257+
tag → "<" tag-name tag-arguments ">" minimessage {"</" tag-name ">"}
258+
259+
tag-name → identifier
260+
261+
tag-arguments → "" | named-argument " "+ sequential-argument | named-argument | " "* sequential-argument
262+
263+
named-argument → "" | " "+ {"!"} identifier {named-argument} | " "+ identifier "=" named-value {named-argument}
264+
265+
named-value → "" | quoted | no-whitespace-string
266+
267+
no-whitespace-string → utf-16-char-no-whitespace*
268+
269+
sequential-argument → ":" sequential-value {sequential-argument}
270+
271+
sequential-value → "" | quoted | sequential-string
272+
273+
sequential-string → utf-16-char-no-angle-or-colon*
274+
275+
quoted → "'" string "'" | """ string """
276+
277+
identifier → alphanumeric+
278+
279+
alphanumeric → "a" | "b" | "c" | "d"
280+
| "e" | "f" | "g" | "h"
281+
| "i" | "j" | "k" | "l"
282+
| "m" | "n" | "o" | "p"
283+
| "q" | "r" | "s" | "t"
284+
| "u" | "v" | "w" | "x"
285+
| "y" | "z" | "_" | "0"
286+
| "1" | "2" | "3" | "4"
287+
| "5" | "6" | "7" | "8"
288+
| "9"
289+
```
Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
1+
{
2+
"$schema": "https://raw.githubusercontent.com/martinring/tmlanguage/master/tmlanguage.json",
3+
"name": "bnf",
4+
"scopeName": "source.bnf",
5+
"patterns": [{ "include": "#comment" }, { "include": "#rule" }, { "include": "#meta" }, { "include": "#strings" }],
6+
"repository": {
7+
"comment": {
8+
"name": "comment.line.semicolon.bnf",
9+
"match": ";.*$"
10+
},
11+
12+
"rule": {
13+
"name": "meta.rule.bnf",
14+
"begin": "^(\\s*)([A-Za-z0-9_-]+)(\\s*→)",
15+
"beginCaptures": {
16+
"2": { "name": "entity.name.function.nonterminal.bnf" },
17+
"3": { "name": "keyword.reserved.arrow.bnf" }
18+
},
19+
"end": "(?=^\\s*[A-Za-z0-9_-]+\\s*→|\\Z)",
20+
"patterns": [
21+
{ "include": "#tripleQuotedNonterminal" },
22+
{ "include": "#strings" },
23+
{ "include": "#meta" },
24+
{
25+
"match": "\\b[A-Za-z0-9_-]+\\b",
26+
"name": "variable.language.nonterminal.bnf"
27+
}
28+
]
29+
},
30+
31+
"tripleQuotedNonterminal": {
32+
"name": "meta.triplequoted.nonterminal.bnf",
33+
"begin": "\"{3}",
34+
"beginCaptures": {
35+
"0": { "name": "string.quoted.double.bnf" }
36+
},
37+
"end": "\"{3}",
38+
"endCaptures": {
39+
"0": { "name": "string.quoted.double.bnf" }
40+
},
41+
"patterns": [
42+
{
43+
"match": "\\b[A-Za-z0-9_-]+\\b",
44+
"name": "variable.language.nonterminal.bnf"
45+
}
46+
]
47+
},
48+
49+
"strings": {
50+
"patterns": [
51+
{
52+
"name": "string.quoted.double.bnf",
53+
"begin": "\"",
54+
"end": "\"",
55+
"patterns": [{ "match": "\"\"", "name": "constant.character.escape.doublequote.bnf" }]
56+
},
57+
{
58+
"name": "string.quoted.single.bnf",
59+
"begin": "'",
60+
"end": "'"
61+
}
62+
]
63+
},
64+
65+
"meta": {
66+
"patterns": [
67+
{
68+
"match": "→",
69+
"name": "keyword.reserved.arrow.bnf"
70+
},
71+
{
72+
"match": "\\|",
73+
"name": "keyword.reserved.choice.bnf"
74+
},
75+
{
76+
"match": "[{}()]",
77+
"name": "punctuation.section.group.bnf"
78+
}
79+
]
80+
}
81+
}
82+
}
