| 1 | +--- |
| 2 | +title: "Lisp Compiler Optimizations" |
| 3 | +date: "2024-05-30" |
| 4 | +tags: ["rust"] |
| 5 | +description: "Smaller programs that do less work." |
| 6 | +--- |
| 7 | + |
| 8 | +I recently added some optimizations to [my compiler](https://healeycodes.com/lisp-to-javascript-compiler) that turns Lisp into JavaScript. |
| 9 | + |
| 10 | +The features I added are constant folding and propagation, and dead-code elimination, which work together to produce smaller programs that do less work. |
| 11 | + |
| 12 | +I chose these features by reading the wiki page for [optimizing compiler](https://en.wikipedia.org/wiki/Optimizing_compiler) and picking a few that I thought I could implement in a night or two. Excluding tests, this required adding [~200 lines of additional Rust code](https://github.com/healeycodes/lisp-to-js) to my compiler. |
| 13 | + |
| 14 | +[Constant propagation](https://en.wikipedia.org/wiki/Constant_folding) involves removing variable bindings that have a known result at compile-time, and replacing the variable references with literal values: |
| 15 | + |
| 16 | +```lisp |
| 17 | +; before |
| 18 | +(let ((a 1)) a) |
| 19 | + |
| 20 | +; after |
| 21 | +(let () 1) |
| 22 | +``` |
| 23 | + |
| 24 | +Inside the _let_ expression body, the variable `a` has been replaced with the literal value `1`. |
| 25 | + |
| 26 | +[Constant folding](https://en.wikipedia.org/wiki/Constant_folding) simplifies expressions that have a known result at compile-time. Below, a group of arithmetic expressions is replaced by a literal value: |
| 27 | + |
| 28 | +```lisp |
| 29 | +; before |
| 30 | +(let ((b 2) (c 3)) |
| 31 | + (print |
| 32 | + (+ |
| 33 | + (+ b 4 c) |
| 34 | + (- b c 7) |
| 35 | + ))) |
| 36 | + |
| 37 | +; after |
| 38 | +(let () (print 1)) |
| 39 | +``` |
| 40 | + |
| 41 | +This simplification wouldn't be possible without performing constant propagation first (`b` and `c` need to be resolved). It's common for different types of compiler optimizations to stack and complement each other like this. |
| 42 | + |
| 43 | +Dead code elimination involves removing code that has no effect on the program's output. For example, when an if-expression's check is known at compile-time, the unused branch (and the check) can be removed entirely: |
| 44 | + |
| 45 | +```lisp |
| 46 | +; before |
| 47 | +(lambda () |
| 48 | + (if (< 1 2) 5 6) |
| 49 | +) |
| 50 | + |
| 51 | +; after |
| 52 | +(lambda () |
| 53 | + 5 |
| 54 | +) |
| 55 | +``` |
| 56 | + |
| 57 | +Why do all this? Well, simpler expressions require fewer run-time operations, which makes optimized code run faster. When dead code is removed, the generated JavaScript is smaller. For browsers, this means the script can start executing sooner (due to a smaller download). For servers, this allows a faster startup time because less code needs to be parsed and executed. |
| 58 | + |
| 59 | +## Transforming Code |
| 60 | + |
| 61 | +These optimizations are applied after parsing but before code generation. When adding each optimization, I didn't have to alter the existing code generation logic because optimization is a step that transforms an abstract syntax tree (AST) into a new AST. |
| 62 | + |
| 63 | +The example optimizations in the previous section showed the Lisp source code being altered, rather than the generated JavaScript, because that's how it seems from the compiler's point of view — it's like a more efficient program was passed to the code generation step. |
| 64 | + |
| 65 | +In the compiler: |
| 66 | + |
| 67 | +```rust |
| 68 | +// before |
| 69 | +let expressions = parse(input); // Lisp code -> AST |
| 70 | +println!("{}", compile(expressions)); // AST -> JavaScript |
| 71 | + |
| 72 | +// after |
| 73 | +let expressions = parse(input); |
| 74 | +let optimized = optimize(expressions); // AST -> AST |
| 75 | +println!("{}", compile(optimized)); |
| 76 | +``` |
| 77 | + |
| 78 | +Let's dig into the `optimize` function here. Since optimizations can stack (think of a deeply nested arithmetic expression having multiple “fold events” as it shrinks down to a single value), we need to start applying optimizations at the bottom of the AST and then work our way back up. |
| 79 | + |
| 80 | +Take, for example, the program `(+ (+ 1 2) (- 3 4))`. The inner expressions must be optimized before the outer expression can be optimized. The two inner expressions are the two bottom nodes of the AST. |
| 81 | + |
| 82 | +The `optimize` function performs a post-order traversal of the AST (a kind of depth-first traversal): child expressions are visited, and folded where possible, before their parent. |
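Here's a minimal sketch of that bottom-up order. It assumes a cut-down, hypothetical `Expr` type with only numbers, symbols, sums, and differences, not the compiler's real `Expression` enum, and the `fold` and `all_numbers` names are made up for illustration.

```rust
// Hypothetical, simplified AST (not the compiler's real `Expression` type)
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Number(f64),
    Symbol(String),
    Add(Vec<Expr>),
    Sub(Vec<Expr>),
}

// Post-order fold: optimize the children first, then try to fold the parent
fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(items) => {
            let items: Vec<Expr> = items.into_iter().map(fold).collect();
            match all_numbers(&items) {
                Some(nums) => Expr::Number(nums.iter().sum()),
                None => Expr::Add(items),
            }
        }
        Expr::Sub(items) => {
            let items: Vec<Expr> = items.into_iter().map(fold).collect();
            match all_numbers(&items) {
                // Only fold when there's a first operand to subtract from
                Some(nums) if !nums.is_empty() => {
                    Expr::Number(nums[1..].iter().fold(nums[0], |acc, n| acc - n))
                }
                _ => Expr::Sub(items),
            }
        }
        // Atoms are the bottom of the tree, so there's nothing to fold
        atom => atom,
    }
}

// Returns the numeric values if every item is a number literal
fn all_numbers(items: &[Expr]) -> Option<Vec<f64>> {
    items
        .iter()
        .map(|item| match item {
            Expr::Number(n) => Some(*n),
            _ => None,
        })
        .collect()
}
```

With this sketch, `(+ (+ 1 2) (- 3 4))` first becomes `(+ 3 -1)` as the two inner calls return, and then the outer sum folds to `2`. A sum containing a symbol keeps its shape but still benefits from its folded children.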
| 83 | + |
| 84 | + |
| 85 | + |
| 86 | +## For Each Expression |
| 87 | + |
| 88 | +A Lisp program is a list of expressions. The optimization step of my compiler iterates over each expression (and its sub-expressions) and optimizes from the bottom up. |
| 89 | + |
| 90 | +```rust |
| 91 | +// Given an AST, attempt to create a more optimized AST |
| 92 | +fn optimize(program: Vec<Expression>) -> Vec<Expression> { |
| 93 | + return program |
| 94 | + .into_iter() |
| 95 | + .map(|expr| optimize_expression(expr, &mut HashMap::new())) |
| 96 | + .collect(); |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +Similar to the `compile` function, described in [my previous post](https://healeycodes.com/lisp-to-javascript-compiler), the `optimize_expression` function here is a long match statement that recursively calls itself until it reaches the bottom of the AST. The calls then unroll upwards, allowing the root expression to take advantage of already-optimized expressions. |
| 101 | + |
| 102 | +One of the simpler branches of this long match statement is the optimization of if-expressions: |
| 103 | + |
| 104 | +```rust |
| 105 | +fn optimize_expression( |
| 106 | + expression: Expression, |
| 107 | + context: &mut HashMap<String, Option<Expression>>, |
| 108 | +) -> Expression { |
| 109 | + match expression { |
| 110 | + |
| 111 | + // The best case with if-expressions is to be able to |
| 112 | + // remove the check (and unused branch) entirely, and to place |
| 113 | + // the winning branch into the if-expression's position in the AST |
| 114 | + Expression::IfExpression(if_expr) => { |
| 115 | + |
| 116 | + // Ensure the check expression is optimized |
| 117 | + let check_expr = optimize_expression(if_expr.check, context); |
| 118 | + match check_expr { |
| 119 | + Expression::Atom(ref atom) => match atom { |
| 120 | + |
| 121 | + // We can only remove dead code when the check can be |
| 122 | + // folded into a boolean value at compile-time |
| 123 | + Atom::Boolean(b) => { |
| 124 | + if *b { |
| 125 | + return optimize_expression(if_expr.r#true, context); |
| 126 | + } else { |
| 127 | + return optimize_expression(if_expr.r#false, context); |
| 128 | + } |
| 129 | + } |
| 130 | + _ => {} |
| 131 | + }, |
| 132 | + _ => {} |
| 133 | + } |
| 134 | + |
| 135 | + // The check expression couldn't be folded into a boolean |
| 136 | + // but the parts of the if-expression may be able to be |
| 137 | + // folded into smaller expressions internally |
| 138 | + return Expression::IfExpression(Box::new(IfExpression { |
| 139 | + check: optimize_expression(check_expr, context), |
| 140 | + r#true: optimize_expression(if_expr.r#true, context), |
| 141 | + r#false: optimize_expression(if_expr.r#false, context), |
| 142 | + })); |
| 143 | + } |
| 144 | + |
| 145 | + // .. other branches |
| 146 | +``` |
| 147 | + |
| 148 | +Before we look at how the arithmetic expressions are optimized, I'll explain how the *context* argument of `optimize_expression` works. _Let_ expressions can optionally bind variables. These variables can be bound to literals as well as expression results. For example, we can define `a` to be `(+ 1 2)` and then double it and print it. |
| 149 | + |
| 150 | +```lisp |
| 151 | +(let ((a (+ 1 2))) |
| 152 | + (print (+ a a)) ; can be optimized to `(print 6)` |
| 153 | +) |
| 154 | +``` |
| 155 | + |
| 156 | +When we're in the middle of optimizing the sum expression, we need to know what `a` is. But in the AST, each reference is just the symbol atom `a`, which says nothing about its value. |
| 157 | + |
| 158 | +The solution to this problem is to keep a context object that records each variable binding after its binding expression has been optimized. In the above example, the context object contains `{a: 3}` during the optimization of the _let_ expression's body. |
| 159 | + |
| 160 | +Let's look at how this happens inside the `optimize_expression` match arm for _let_ expressions. |
| 161 | + |
| 162 | +```rust |
| 163 | +// Note: bindings can be reduced to an empty list |
| 164 | +// if they all optimize into literals, for example: |
| 165 | +// `(let ((a 1)) a)` -> `(let () 1)` |
| 166 | +Expression::LetExpression(let_expr) => { |
| 167 | + let mut optimized_bindings: Vec<Binding> = vec![]; |
| 168 | + let_expr.bindings.into_iter().for_each(|binding| { |
| 169 | + let binding_expr = optimize_expression(binding.expression, context); |
| 170 | + |
| 171 | + // When the expression we're about to bind is an atom, |
| 172 | + // we can get rid of the binding and replace instances |
| 173 | + // of this variable with the literal value |
| 174 | + match binding_expr { |
| 175 | + Expression::Atom(ref atom) => match atom { |
| 176 | + |
| 177 | + // Insert literals, overwriting variables from any higher scopes. |
| 178 | + // Return before pushing the binding so it's removed from the AST |
| 179 | + Atom::Number(n) => { |
| 180 | + context |
| 181 | + .insert( |
| 182 | + binding.symbol, |
| 183 | + Some(Expression::Atom(Atom::Number(*n))) |
| 184 | + ); |
| 185 | + return; |
| 186 | + } |
| 187 | + Atom::Boolean(b) => { |
| 188 | + context |
| 189 | + .insert( |
| 190 | + binding.symbol, |
| 191 | + Some(Expression::Atom(Atom::Boolean(*b))) |
| 192 | + ); |
| 193 | + return; |
| 194 | + } |
| 195 | + |
| 196 | + // No need to overwrite symbols that refer to already-tracked |
| 197 | + // and potentially already-optimized values |
| 198 | + Atom::Symbol(s) => match context.get(s) { |
| 199 | + Some(_) => return, |
| 200 | + None => {} |
| 201 | + }, |
| 202 | + _ => {} |
| 203 | + }, |
| 204 | + _ => {} |
| 205 | + } |
| 206 | + |
| 207 | + // This binding can't be removed but may have been optimized internally |
| 208 | + optimized_bindings.push(Binding { |
| 209 | + symbol: binding.symbol, |
| 210 | + expression: binding_expr, |
| 211 | + }) |
| 212 | + }); |
| 213 | + |
| 214 | + return Expression::LetExpression(LetExpression { |
| 215 | + bindings: optimized_bindings, |
| 216 | + |
| 217 | + // The let body will be optimized in this sub-call |
| 218 | + expressions: let_expr |
| 219 | + .expressions |
| 220 | + .into_iter() |
| 221 | + .map(|expr| optimize_expression(expr, context)) |
| 222 | + .collect(), |
| 223 | + }); |
| 224 | +} |
| 225 | +``` |
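The _let_ arm above only writes to the context. The reading side, where a symbol in the body is swapped for its literal, isn't shown, so here's a minimal sketch of how it could look. The `resolve_symbol` helper and the cut-down `Atom` and `Expression` types are hypothetical. I'm also assuming that a `None` entry marks a name whose value isn't known at compile-time.

```rust
use std::collections::HashMap;

// Hypothetical, cut-down stand-ins for the compiler's `Atom` and `Expression`
#[derive(Debug, Clone, PartialEq)]
enum Atom {
    Number(f64),
    Boolean(bool),
    Symbol(String),
}

#[derive(Debug, Clone, PartialEq)]
enum Expression {
    Atom(Atom),
}

// Assumed behavior: a symbol with a known literal in the context is replaced
// by a copy of that literal, and anything else is returned unchanged
fn resolve_symbol(
    expression: Expression,
    context: &HashMap<String, Option<Expression>>,
) -> Expression {
    if let Expression::Atom(Atom::Symbol(name)) = &expression {
        if let Some(Some(literal)) = context.get(name) {
            return literal.clone();
        }
    }
    expression
}
```

With `{a: 3}` in the context, the symbol `a` resolves to the number literal `3`, while an untracked symbol such as `b` is left in place for the code generation step.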
| 226 | + |
| 227 | +Sum expressions can be folded when all the items are number literals (or can be folded into number literals). Difference expressions can be folded under the same conditions. Even when these two types of expressions can't be shrunk into atoms, they can still be partially folded. For instance, `(+ 1 a 1)` is the same as `(+ a 2)` or `(+ 2 a)`. |
| 228 | + |
| 229 | +```rust |
| 230 | +// `nums` is a Vec<f64> that's built by optimizing sub-expressions |
| 231 | +// and collecting any number literals |
| 232 | + |
| 233 | +Op::Plus => { |
| 234 | + |
| 235 | + // Best case: no expressions after optimization, return atom! |
| 236 | + if optimized_exprs_without_numbers.len() == 0 { |
| 237 | + return Expression::Atom(Atom::Number(nums.iter().sum())); |
| 238 | + } |
| 239 | + |
| 240 | + // Sum any literals, may reduce add-operations produced at code generation |
| 241 | + optimized_exprs_without_numbers |
| 242 | + .push(Expression::Atom(Atom::Number(nums.iter().sum()))); |
| 243 | + return Expression::ArithmeticExpression(Box::new(ArithmeticExpression { |
| 244 | + op: arth_expr.op, |
| 245 | + expressions: optimized_exprs_without_numbers, |
| 246 | + })); |
| 247 | +} |
| 248 | +``` |
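The excerpt above starts after `nums` and `optimized_exprs_without_numbers` have been built. As a rough sketch of that step, the operands can be split into number literals and everything else before folding. The `Expr` type and the `partition_numbers` and `fold_sum` helpers below are hypothetical simplifications, and skipping the trailing literal when there are no numbers to sum is a choice made for this sketch only.

```rust
// Hypothetical, simplified AST (not the compiler's real `Expression` type)
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Number(f64),
    Symbol(String),
    Add(Vec<Expr>),
}

// Split already-optimized operands into number literals and everything else
fn partition_numbers(operands: Vec<Expr>) -> (Vec<f64>, Vec<Expr>) {
    let mut nums = Vec::new();
    let mut others = Vec::new();
    for operand in operands {
        match operand {
            Expr::Number(n) => nums.push(n),
            other => others.push(other),
        }
    }
    (nums, others)
}

// All-literal sums become one number. Mixed sums keep their non-literal
// operands and gain a single trailing literal.
fn fold_sum(operands: Vec<Expr>) -> Expr {
    let (nums, mut others) = partition_numbers(operands);
    let total: f64 = nums.iter().sum();
    if others.is_empty() {
        return Expr::Number(total);
    }
    // Sketch-only choice: don't append a literal when there was nothing to sum
    if !nums.is_empty() {
        others.push(Expr::Number(total));
    }
    Expr::Add(others)
}
```

Under these assumptions, `(+ 1 a 1)` turns into `(+ a 2)`: the two literals collapse into one, so the generated JavaScript performs one addition instead of two.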
| 249 | + |
| 250 | +Less-than and greater-than expressions only accept two arguments, so they can be folded into `true` or `false` values when both arguments are known at compile-time. |
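A minimal sketch of that comparison folding might look like the following. The `CompareOp` enum and the `fold_comparison` helper are hypothetical names, and `Atom` is a cut-down stand-in for the compiler's own type.

```rust
// Hypothetical, cut-down stand-in for the compiler's `Atom`
#[derive(Debug, Clone, PartialEq)]
enum Atom {
    Number(f64),
    Boolean(bool),
    Symbol(String),
}

// Hypothetical operator type for the two comparison expressions
#[derive(Debug, Clone, Copy, PartialEq)]
enum CompareOp {
    LessThan,
    GreaterThan,
}

// Returns a boolean atom when both sides are number literals,
// or None when the comparison has to stay in the AST
fn fold_comparison(op: CompareOp, left: &Atom, right: &Atom) -> Option<Atom> {
    match (left, right) {
        (Atom::Number(l), Atom::Number(r)) => Some(Atom::Boolean(match op {
            CompareOp::LessThan => l < r,
            CompareOp::GreaterThan => l > r,
        })),
        _ => None,
    }
}
```

Here `(< 1 2)` folds to `true`, which is exactly the kind of boolean atom the if-expression branch needs in order to drop the unused branch.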
| 251 | + |
| 252 | +## Tests |
| 253 | + |
| 254 | +For this project, I needed tests to ensure minor parser, optimization, or code generation tweaks didn't break something unknown. But the tests also needed to be very copy-and-pastable. |
| 255 | + |
| 256 | +One thing I found productive was to assert on the debug string of the AST result: |
| 257 | + |
| 258 | +```rust |
| 259 | +#[test] |
| 260 | +fn test_optimize_sub() { |
| 261 | + assert_eq!( |
| 262 | + format!("{:?}", optimize(program().parse(b"(- 1 2)").unwrap())), |
| 263 | + "[Atom(Number(-1.0))]" |
| 264 | + ); |
| 265 | +} |
| 266 | +``` |
| 267 | + |
| 268 | +I'm putting off writing an end-to-end test suite that compares the output of the generated JavaScript with that of handwritten JavaScript. I would write a Lisp program and then the matching JavaScript program, and the test would assert that both produce the same stdout. Maybe I can use the [v8 crate](https://crates.io/crates/v8)? It's probably quicker to use Bash and Node.js. |
| 269 | + |
| 270 | +If I add any more features to this compiler, I'll probably write a quick test framework first. |