| 1 | +# Interpreter |
| 2 | + |
| 3 | +## Description |
| 4 | + |
| 5 | +If a problem occurs very often and requires long, repetitive steps to solve, |
| 6 | +then its instances can be expressed in a simple language, and an interpreter |
| 7 | +object can solve them by interpreting the sentences written in this simple |
| 8 | +language. |
| 9 | + |
| 10 | +Basically, for any kind of problem we define: |
| 11 | + |
| 12 | +- a [domain specific language](https://en.wikipedia.org/wiki/Domain-specific_language), |
| 13 | +- a grammar for this language, |
| 14 | +- an interpreter that solves the problem instances. |
| 15 | + |
| 16 | +## Motivation |
| 17 | + |
| 18 | +Our goal is to translate simple mathematical expressions into postfix expressions |
| 19 | +(or [Reverse Polish notation](https://en.wikipedia.org/wiki/Reverse_Polish_notation)). |
| 20 | +For simplicity, our expressions consist of the ten digits `0`, ..., `9` and the two |
| 21 | +operations `+` and `-`. For example, the expression `2 + 4` is translated into |
| 22 | +`2 4 +`. |
| 23 | + |
| 24 | +## Context Free Grammar for our problem |
| 25 | + |
| 26 | +Our task is to translate infix expressions into postfix ones. Let's define a |
| 27 | +context-free grammar for a set of infix expressions over `0`, ..., `9`, `+`, and `-`, |
| 28 | +where: |
| 29 | + |
| 30 | +- terminal symbols: `0`, ..., `9`, `+`, `-` |
| 31 | +- non-terminal symbols: `exp`, `term` |
| 32 | +- start symbol: `exp` |
| 33 | +- production rules: |
| 34 | + |
| 35 | +```ignore |
| 36 | +exp -> exp + term |
| 37 | +exp -> exp - term |
| 38 | +exp -> term |
| 39 | +term -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 40 | +``` |
| 41 | + |
| 42 | +__NOTE:__ This grammar should be further transformed depending on what we are going |
| 43 | +to do with it. For example, we might need to remove left recursion. For more |
| 44 | +details, please see [Compilers: Principles, Techniques, and |
| 45 | +Tools](https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools) |
| 46 | +(aka the Dragon Book). |
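
As an illustration of the transformation mentioned in the note, applying the standard left-recursion elimination to the `exp` rules would give something like the grammar below. This is only a sketch: `rest` is a helper non-terminal introduced here for illustration, and `ε` denotes the empty string.

```ignore
exp  -> term rest
rest -> + term rest
rest -> - term rest
rest -> ε
term -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
```

In a hand-written recursive descent parser, `rest` is typically implemented as a loop that keeps reading an operator followed by a `term` until the input is exhausted.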
| 47 | + |
| 48 | +## Solution |
| 49 | + |
| 50 | +We simply implement a recursive descent parser. For simplicity's sake, the code |
| 51 | +panics when an expression is syntactically wrong (for example, `2-34` and `2+5-` |
| 52 | +are invalid according to the grammar; spaces are not accepted either). |
| 53 | + |
| 54 | +```rust |
| 55 | +pub struct Interpreter<'a> { |
| 56 | +    it: std::str::Chars<'a>, |
| 57 | +} |
| 58 | + |
| 59 | +impl<'a> Interpreter<'a> { |
| 60 | + |
| 61 | +    pub fn new(infix: &'a str) -> Self { |
| 62 | +        Self { it: infix.chars() } |
| 63 | +    } |
| 64 | + |
| 65 | +    fn next_char(&mut self) -> Option<char> { |
| 66 | +        self.it.next() |
| 67 | +    } |
| 68 | + |
| 69 | +    pub fn interpret(&mut self, out: &mut String) { |
| 70 | +        self.term(out); |
| 71 | + |
| 72 | +        while let Some(op) = self.next_char() { |
| 73 | +            if op == '+' || op == '-' { |
| 74 | +                self.term(out); |
| 75 | +                out.push(op); |
| 76 | +            } else { |
| 77 | +                panic!("Unexpected symbol '{}'", op); |
| 78 | +            } |
| 79 | +        } |
| 80 | +    } |
| 81 | + |
| 82 | +    fn term(&mut self, out: &mut String) { |
| 83 | +        match self.next_char() { |
| 84 | +            Some(ch) if ch.is_digit(10) => out.push(ch), |
| 85 | +            Some(ch) => panic!("Unexpected symbol '{}'", ch), |
| 86 | +            None => panic!("Unexpected end of string"), |
| 87 | +        } |
| 88 | +    } |
| 89 | +} |
| 90 | + |
| 91 | +pub fn main() { |
| 92 | +    let mut intr = Interpreter::new("2+3"); |
| 93 | +    let mut postfix = String::new(); |
| 94 | +    intr.interpret(&mut postfix); |
| 95 | +    assert_eq!(postfix, "23+"); |
| 96 | + |
| 97 | +    intr = Interpreter::new("1-2+3-4"); |
| 98 | +    postfix.clear(); |
| 99 | +    intr.interpret(&mut postfix); |
| 100 | +    assert_eq!(postfix, "12-3+4-"); |
| 101 | +} |
| 102 | +``` |
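
The parser above panics on invalid input to keep the example short. As a hedged alternative, the same translation could report errors through a `Result` instead. The sketch below is not part of the original example: the `ParseError` type and the free functions `to_postfix` and `term` are hypothetical names chosen for illustration, and the grammar is unchanged.

```rust
/// Hypothetical error type for this sketch (not part of the original example).
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// A character that the grammar does not allow at this position.
    UnexpectedSymbol(char),
    /// The input ended where a digit was expected.
    UnexpectedEnd,
}

/// Translates an infix expression into postfix notation,
/// returning an error instead of panicking on invalid input.
pub fn to_postfix(infix: &str) -> Result<String, ParseError> {
    let mut it = infix.chars();
    let mut out = String::new();

    // exp starts with a term ...
    term(&mut it, &mut out)?;

    // ... followed by any number of (operator, term) pairs.
    while let Some(op) = it.next() {
        match op {
            '+' | '-' => {
                term(&mut it, &mut out)?;
                // In postfix notation the operator follows its operands.
                out.push(op);
            }
            other => return Err(ParseError::UnexpectedSymbol(other)),
        }
    }
    Ok(out)
}

/// Reads exactly one digit and appends it to the output.
fn term(it: &mut std::str::Chars<'_>, out: &mut String) -> Result<(), ParseError> {
    match it.next() {
        Some(ch) if ch.is_ascii_digit() => {
            out.push(ch);
            Ok(())
        }
        Some(ch) => Err(ParseError::UnexpectedSymbol(ch)),
        None => Err(ParseError::UnexpectedEnd),
    }
}
```

With this variant, a caller can decide how to react to input such as `2-34` or `2+5-` rather than having the program abort.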
| 103 | + |
| 104 | +## Discussion |
| 105 | + |
| 106 | +There may be a wrong perception that the Interpreter design pattern is about designing |
| 107 | +grammars for formal languages and implementing parsers for these grammars. |
| 108 | +In fact, this pattern is about expressing problem instances in a more specific |
| 109 | +way and implementing functions/classes/structs that solve these problem instances. |
| 110 | +Rust has `macro_rules!`, which allows us to define special syntax and rules |
| 111 | +for expanding this syntax into source code. |
| 112 | + |
| 113 | +In the following example we create a simple `macro_rules!` macro that computes the |
| 114 | +[Euclidean length](https://en.wikipedia.org/wiki/Euclidean_distance) of `n`-dimensional |
| 115 | +vectors. Writing `norm!(x,1,2)` might be easier to express and more |
| 116 | +efficient than packing `x,1,2` into a `Vec` and calling a function computing |
| 117 | +the length. |
| 118 | + |
| 119 | +```rust |
| 120 | +macro_rules! norm { |
| 121 | +    ($($element:expr),*) => { |
| 122 | +        { |
| 123 | +            let mut n = 0.0; |
| 124 | +            $( |
| 125 | +                n += ($element as f64)*($element as f64); |
| 126 | +            )* |
| 127 | +            n.sqrt() |
| 128 | +        } |
| 129 | +    }; |
| 130 | +} |
| 131 | + |
| 132 | +fn main() { |
| 133 | +    let x = -3f64; |
| 134 | +    let y = 4f64; |
| 135 | + |
| 136 | +    assert_eq!(3f64, norm!(x)); |
| 137 | +    assert_eq!(5f64, norm!(x, y)); |
| 138 | +    assert_eq!(0f64, norm!(0, 0, 0)); |
| 139 | +    assert_eq!(1f64, norm!(0.5, -0.5, 0.5, -0.5)); |
| 140 | +} |
| 141 | +``` |
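
One detail worth noting about `norm!`: each `$element` appears twice in the expansion, so an argument with side effects (a function call, for instance) is evaluated twice. A possible variant, sketched here under the hypothetical name `norm_once!`, binds each argument to a local first so that it is evaluated exactly once. The helper `length_2d` is also an assumption added only to show a call site.

```rust
/// Hypothetical variant of `norm!` that evaluates each argument exactly once.
macro_rules! norm_once {
    ($($element:expr),*) => {
        {
            let mut n = 0.0_f64;
            $(
                // Bind the argument to a local so side effects happen only once.
                let e = $element as f64;
                n += e * e;
            )*
            n.sqrt()
        }
    };
}

/// Small usage helper for this sketch: Euclidean length of a 2-D vector.
pub fn length_2d(x: f64, y: f64) -> f64 {
    norm_once!(x, y)
}
```

For plain variables and literals the two macros behave the same; the difference only shows up when an argument is an expression that is expensive or has side effects.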
| 142 | + |
| 143 | +## See also |
| 144 | + |
| 145 | +- [Interpreter pattern](https://en.wikipedia.org/wiki/Interpreter_pattern) |
| 146 | +- [Context-free grammar](https://en.wikipedia.org/wiki/Context-free_grammar) |
| 147 | +- [macro_rules!](https://doc.rust-lang.org/rust-by-example/macros.html) |