diff --git a/hugo/content/docs/reference/grammar-language.md b/hugo/content/docs/reference/grammar-language/_index.md
similarity index 97%
rename from hugo/content/docs/reference/grammar-language.md
rename to hugo/content/docs/reference/grammar-language/_index.md
index 03f2c3c3..453688b0 100644
--- a/hugo/content/docs/reference/grammar-language.md
+++ b/hugo/content/docs/reference/grammar-language/_index.md
@@ -370,6 +370,21 @@ Element:

 The parser will always exclude alternatives whose guard conditions evaluate to `false`. All other alternatives remain possible options for the parser to choose from.

+### Infix operators
+
+[Infix operators](/docs/reference/grammar-language/infix-operators) can be defined using the `infix` keyword. This syntax lets you define operators with different precedence levels and associativity in a more readable way. For more information on infix operators, refer to the [dedicated documentation page](/docs/reference/grammar-language/infix-operators/syntactical-implementation). Note that the `infix` notation is syntactic sugar: you can also define the same operators [manually using parser rules](/docs/reference/grammar-language/infix-operators/manual-implementation).
+
+```langium
+Expression: BinaryExpr;
+
+infix BinaryExpr on PrimaryExpression:
+    right assoc '^'
+    > '*' | '/'
+    > '+' | '-'
+    ;
+
+PrimaryExpression: '(' expr=Expression ')' | value=Number;
+```

 ### More Examples

diff --git a/hugo/content/docs/reference/grammar-language/infix-operators/_index.md b/hugo/content/docs/reference/grammar-language/infix-operators/_index.md
new file mode 100644
index 00000000..cf3b0130
--- /dev/null
+++ b/hugo/content/docs/reference/grammar-language/infix-operators/_index.md
@@ -0,0 +1,49 @@
+---
+title: "Infix Operators"
+weight: 125
+---
+
+Infix operators are binary operators: they take two operands, with the operator written between them. For example, in the expression `a + b`, `+` is an infix operator with operands `a` and `b`.
+
+## Grammar Ambiguities
+
+Infix operators introduce grammar ambiguities.
+
+Ambiguities are a problem for the parsing process, because the parser needs to be able to determine the structure of an expression unambiguously.
+
+For example, consider these expressions:
+
+```plain
+a + b * c
+a - b - c
+```
+
+### Ambiguity between different operators
+
+Without additional rules, the first expression can be interpreted in two different ways:
+
+1. As `(a + b) * c`, where addition is performed first, followed by multiplication.
+2. As `a + (b * c)`, where multiplication is performed first, followed by addition.
+
+Solution: The parser needs to know the **precedence** of the operators. In this case, multiplication has higher precedence than addition, so the correct interpretation is `a + (b * c)`. An operator's precedence is higher than, lower than, or equal to that of another operator.
+
+### Ambiguity between identical operators
+
+The second expression can also be interpreted in two different ways:
+
+1. As `(a - b) - c`, where the first subtraction is performed first.
+2. As `a - (b - c)`, where the second subtraction is performed first.
+
+Solution: The parser needs to know the **associativity** of the operator. In this case, subtraction is left-associative, so the correct interpretation is `(a - b) - c`. An operator can be left-associative, right-associative or non-associative. A way to remember which is which:
+Imagine an operand between two occurrences of the same operator (like `... - b - ...`): which operator is executed first? If the left one is executed first, the operator is left-associative. If the right one is executed first, the operator is right-associative. If neither can be executed first (as in `a < b < c`), the operator is non-associative.
+
+Typical candidates for right-associativity are assignment operators (e.g., `=`) and exponentiation operators (e.g., `^`).
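The effect of precedence and associativity described above can be checked with a small sketch. The TypeScript below is not part of Langium and is not how Langium parses expressions; it is a minimal, hypothetical precedence-climbing routine (the names `OPERATORS`, `tokenize` and `group` are invented for illustration). It assumes an operator table with `+` and `-` at level 1, `*` and `/` at level 2, and a right-associative `^` at level 3, and returns the input with explicit parentheses so that the chosen grouping becomes visible.

```typescript
// Assumed operator table for this sketch: precedence level and associativity.
type Assoc = "left" | "right";
const OPERATORS = new Map<string, { prec: number; assoc: Assoc }>([
    ["+", { prec: 1, assoc: "left" }],
    ["-", { prec: 1, assoc: "left" }],
    ["*", { prec: 2, assoc: "left" }],
    ["/", { prec: 2, assoc: "left" }],
    ["^", { prec: 3, assoc: "right" }],
]);

// Split the input into identifiers/numbers, operators and parentheses.
function tokenize(input: string): string[] {
    return input.match(/[A-Za-z0-9_]+|[-+*\/^()]/g) ?? [];
}

// Return the expression with explicit parentheses around every binary operation.
function group(input: string): string {
    const tokens = tokenize(input);
    let pos = 0;

    // A primary is either a parenthesized expression or a single operand.
    function parsePrimary(): string {
        const token = tokens[pos++];
        if (token === undefined) throw new Error("unexpected end of input");
        if (token === "(") {
            const inner = parseExpression(1);
            if (tokens[pos++] !== ")") throw new Error("expected ')'");
            return inner;
        }
        return token;
    }

    // Consume operators whose precedence is at least minPrec.
    function parseExpression(minPrec: number): string {
        let left = parsePrimary();
        while (pos < tokens.length) {
            const op = OPERATORS.get(tokens[pos]);
            if (op === undefined || op.prec < minPrec) break;
            const symbol = tokens[pos++];
            // Left-associative: the right operand must bind tighter than this operator.
            // Right-associative: the right operand may contain the same operator again.
            const nextMin = op.assoc === "left" ? op.prec + 1 : op.prec;
            const right = parseExpression(nextMin);
            left = `(${left} ${symbol} ${right})`;
        }
        return left;
    }

    const result = parseExpression(1);
    if (pos !== tokens.length) throw new Error(`unexpected token '${tokens[pos]}'`);
    return result;
}
```

Under these assumptions, `group("a + b * c")` yields `(a + (b * c))` and `group("a - b - c")` yields `((a - b) - c)`, matching the interpretations chosen above. Non-associative operators are not covered by this sketch.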
+
+## Implementation
+
+To embed precedence and associativity rules into the parsing process, we can use a technique called **precedence climbing** (a form of operator-precedence parsing). Precedence climbing is a recursive parsing technique that allows us to handle infix operators with different precedences and associativities in a straightforward manner.
+
+Regardless of the parser generator or language framework in use, this technique can often be implemented using grammar rules. In Langium, you have two options to implement precedence climbing:
+
+* using [non-terminal rules](/docs/reference/grammar-language/infix-operators/manual-implementation)
+* using the [infix syntax](/docs/reference/grammar-language/infix-operators/syntactical-implementation)
diff --git a/hugo/content/docs/reference/grammar-language/infix-operators/manual-implementation.md b/hugo/content/docs/reference/grammar-language/infix-operators/manual-implementation.md
new file mode 100644
index 00000000..a08e7dfe
--- /dev/null
+++ b/hugo/content/docs/reference/grammar-language/infix-operators/manual-implementation.md
@@ -0,0 +1,70 @@
+---
+title: "Using non-terminal rules"
+weight: 100
+---
+
+Infix operators can also be implemented using non-terminal rules. This approach provides more flexibility, because precedence and associativity are encoded explicitly in the structure of the rules.
+
+## Operator table
+
+Let's implement a simple expression grammar with the following operators:
+
+| Operator | Precedence | Associativity |
+|----------|------------|---------------|
+| `+`      | 1          | Left          |
+| `-`      | 1          | Left          |
+| `*`      | 2          | Left          |
+| `/`      | 2          | Left          |
+| `^`      | 3          | Right         |
+
+## Grammar rules
+
+```langium
+Expression:
+    Addition;
+
+Addition infers Expression:
+    Multiplication ({infer BinaryExpression.left=current} operator=('+' | '-') right=Multiplication)*;
+
+Multiplication infers Expression:
+    Exponentiation ({infer BinaryExpression.left=current} operator=('*' | '/') right=Exponentiation)*;
+
+Exponentiation infers Expression:
+    Primary ({infer BinaryExpression.left=current} operator='^' right=Exponentiation)?;
+
+Primary infers Expression:
+    '(' Expression ')'
+    | ... // prefix, literals, identifiers
+    ;
+```
+
+## Explanation
+
+### Precedence
+
+Precedence is handled by creating a hierarchy of non-terminal rules. Each rule corresponds to one level of precedence, and higher-precedence operators are defined in rules that are called by lower-precedence rules. For example, `Addition` calls `Multiplication`, which in turn calls `Exponentiation`. This ensures that multiplication and division are evaluated before addition and subtraction, and exponentiation is evaluated before both.
+
+### Associativity
+
+Associativity is handled by three patterns that make sure operators are grouped correctly: repetition (`*`) over `Next` for left-associative operators, recursion into `Current` on the right-hand side for right-associative operators, and a single optional (`?`) right operand for non-associative operators.
+
+#### Pattern 1 (Left Associative):
+
+```langium
+Current infers Expression:
+    Next ({infer BinaryExpression.left=current} operator=('op1' | 'op2' | ...) right=Next)*;
+```
+
+#### Pattern 2 (Right Associative):
+
+```langium
+Current infers Expression:
+    Next ({infer BinaryExpression.left=current} operator=('op1' | 'op2' | ...) right=Current)?;
+```
+
+#### Pattern 3 (Non-Associative):
+
+```langium
+Current infers Expression:
+    Next ({infer BinaryExpression.left=current} operator=('op1' | 'op2' | ...) right=Next)?;
+```
diff --git a/hugo/content/docs/reference/grammar-language/infix-operators/syntactical-implementation.md b/hugo/content/docs/reference/grammar-language/infix-operators/syntactical-implementation.md
new file mode 100644
index 00000000..e0c00558
--- /dev/null
+++ b/hugo/content/docs/reference/grammar-language/infix-operators/syntactical-implementation.md
@@ -0,0 +1,69 @@
+---
+title: "Using the infix syntax"
+weight: 200
+---
+
+## Operator table
+
+Let's implement a simple expression grammar with the following operators:
+
+| Operator | Precedence | Associativity |
+|----------|------------|---------------|
+| `+`      | 1          | Left          |
+| `-`      | 1          | Left          |
+| `*`      | 2          | Left          |
+| `/`      | 2          | Left          |
+| `^`      | 3          | Right         |
+
+## Grammar rules
+
+```langium
+Expression: BinaryExpr;
+
+infix BinaryExpr on PrimaryExpression:
+    right assoc '^'
+    > '*' | '/'
+    > '+' | '-'
+    ;
+
+PrimaryExpression: '(' expr=Expression ')' | value=Number;
+```
+
+In addition to better readability, the `infix` notation enables **performance optimizations** that speed up expression parsing by roughly 50% compared to writing expressions with non-terminal rules.
+
+### Primary expression
+
+The `PrimaryExpression` rule defines the basic building blocks of our expressions, for example a parenthesized expression, a unary expression, or a number literal.
+
+```langium
+Expression: BinaryExpr;
+
+infix BinaryExpr on PrimaryExpression:
+    ...
+    ;
+
+PrimaryExpression: '(' expr=Expression ')' | '-' value=PrimaryExpression | value=Number | ...;
+```
+
+### Precedence
+
+Use the `>` operator to define precedence levels. Operators listed after a `>` have lower precedence than those before it. In the example above, `^` has the highest precedence, followed by `*` and `/`, and finally `+` and `-` with the lowest precedence.
+
+```langium
+infix BinaryExpr on ...:
+    A > B > C
+    // A has higher precedence than B, and B higher than C
+    ;
+```
+
+If you have multiple operators with the same precedence, list them on the same line separated by `|`.
+
+### Associativity
+
+Infix operators are left-associative by default. To make an operator right-associative, prefix it with the keywords `right assoc`.
+
+```langium
+infix BinaryExpr on ...:
+    ...> right assoc '^' >...
+    ;
+```
diff --git a/hugo/static/prism/langium-prism.js b/hugo/static/prism/langium-prism.js
index f174e169..ea3ff729 100644
--- a/hugo/static/prism/langium-prism.js
+++ b/hugo/static/prism/langium-prism.js
@@ -18,7 +18,7 @@ Prism.languages.langium = {
         greedy: true
     },
     keyword: {
-        pattern: /\b(interface|fragment|terminal|boolean|current|extends|grammar|returns|bigint|hidden|import|infers|number|string|entry|false|infer|Date|true|type|with)\b/
+        pattern: /\b(interface|fragment|terminal|boolean|current|extends|grammar|returns|bigint|hidden|import|infers|number|string|entry|false|infer|Date|true|type|with|on|infix|left|right|assoc)\b/
     },
     property: {
         pattern: /\b[a-z][\w]*(?==|\?=|\+=|\??:|>)\b/