Commit 8ae1b8c

committed
Update README
1 parent 335354a commit 8ae1b8c

File tree

1 file changed

+85
-22
lines changed

README.md

Lines changed: 85 additions & 22 deletions
@@ -43,38 +43,76 @@ make unit
4343
- [x] Lexer
4444
- [x] Parser
4545
- [x] Evaluator
46-
- [ ] Compiler
46+
- [x] Compiler
4747

4848
## Features
4949

5050
| Feature | Interpreter | Compiler |
5151
|------------------------|:-----------:|:--------:|
52-
| Bindings || |
53-
| Conditionals || |
54-
| Strings || |
55-
| Integers || |
56-
| Arithmetic +-/* || |
57-
| Arrays || |
58-
| Hashes || |
59-
| Functions || |
60-
| First class functions || |
61-
| Higher order functions || |
62-
| Closures || |
63-
| Recursion || |
64-
| Built-In Functions || |
52+
| Bindings || |
53+
| Conditionals || |
54+
| Strings || |
55+
| Integers || |
56+
| Arithmetic +-/* || |
57+
| Arrays || |
58+
| Hashes || |
59+
| Functions || |
60+
| First class functions || |
61+
| Higher order functions || |
62+
| Closures || |
63+
| Recursion || |
64+
| Built-In Functions || |
6565
| Macros |||
6666

6767
## Additional Features (not present in the original implementation)
6868
| Feature | Interpreter | Compiler |
6969
|------------------------|:-----------:|:--------:|
70-
| Floats |||
71-
| Float Arithmetic |||
72-
| String Indexing |||
73-
| String Concatenation |||
74-
| String Indexing |||
75-
| String Equality |||
76-
| Negative Indexing |||
77-
| Comments |||
70+
| Floats |||
71+
| Float Arithmetic |||
72+
| String Indexing |||
73+
| String Concatenation |||
74+
| String Equality |||
75+
| Negative Indexing |||
76+
| Comments |||
77+
78+
## Compiler & VM Implementation Notes
79+
80+
The compiler and VM follow "Writing a Compiler in Go" by Thorsten Ball, but with some OCaml-specific considerations.
81+
82+
### Why the code looks "imperative"
83+
84+
While OCaml excels at functional programming, the VM implementation uses mutation in a few places for performance:
85+
86+
- **Compiler**: Uses `Dynarray` (dynamic arrays) and `Buffer` for building bytecode, and mutable fields for tracking compilation state
87+
- **VM**: Uses mutable arrays for the stack, globals, and call frames
88+
89+
This is intentional. A purely functional VM with immutable data structures would allocate more and be slower. The mutation is localized and doesn't leak into the rest of the codebase.
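
As a rough illustration of what "localized mutation" can look like, here is a minimal sketch of a stack machine whose mutable state never escapes the `run` function. The names (`instr`, `vm`, `push`, `pop`, `run`) are hypothetical and are not taken from this repository; the real VM also tracks globals and call frames, which are omitted here.

```ocaml
(* Hypothetical sketch: a tiny stack machine with localized mutation.
   None of these names come from the actual codebase. *)
type instr =
  | Const of int
  | Add
  | Mul

type vm = {
  stack : int array;  (* preallocated once, mutated in place *)
  mutable sp : int;   (* index of the next free slot *)
}

let create size = { stack = Array.make size 0; sp = 0 }

let push vm v =
  vm.stack.(vm.sp) <- v;
  vm.sp <- vm.sp + 1

let pop vm =
  vm.sp <- vm.sp - 1;
  vm.stack.(vm.sp)

(* The mutable VM is created and discarded inside [run];
   callers only ever see the resulting int. *)
let run (program : instr list) : int =
  let vm = create 64 in
  List.iter
    (function
      | Const n -> push vm n
      | Add ->
        let b = pop vm in
        let a = pop vm in
        push vm (a + b)
      | Mul ->
        let b = pop vm in
        let a = pop vm in
        push vm (a * b))
    program;
  pop vm

(* (2 + 3) * 4 *)
let result = run [ Const 2; Const 3; Add; Const 4; Mul ]
```

From the outside `run` behaves like a pure function, which is one way to keep the imperative parts from leaking into the rest of a codebase.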
90+
91+
### Performance
92+
93+
Recursive Fibonacci benchmark (`fib(35)`):
94+
95+
| Implementation | Go | OCaml |
96+
|----------------|------|-------|
97+
| Tree-walking interpreter | ~8.8s | ~4.9s |
98+
| Bytecode VM | ~2.8s | ~2.0s |
99+
100+
OCaml's tree-walker is ~1.8x faster than Go's thanks to efficient pattern matching and algebraic data types. After optimization, the bytecode VM takes ~29% less time than Go's (~2.0s vs ~2.8s, about 1.4x faster):
101+
102+
1. **`[@inline]` hints on hot functions** - The biggest win. OCaml's compiler is conservative about inlining; Go is more aggressive. Adding `[@inline]` to `push`, `pop`, `current_frame`, `execute_binary_op`, etc. gave a ~16% speedup.
103+
104+
2. **`Bytes.unsafe_get` / `Array.unsafe_get`** - Skip bounds checking in the VM loop. This is safe only as long as the bytecode comes from our own compiler, which we trust to emit valid offsets. ~3% speedup.
105+
106+
3. **`Obj.magic` for opcode dispatch** - Convert an int to an opcode variant directly instead of pattern matching in `of_int`. OCaml represents constant (argument-less) constructors as integers internally, so this is safe as long as the byte is a valid opcode. The main dispatch still matches on the variant, so exhaustiveness checking is preserved.
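
The three optimizations above can be sketched together in a few lines. This is an illustrative sketch only: the `opcode` type, the `vm` record, and the three-instruction bytecode are assumptions made up for the example, not the project's real instruction set. `Obj.magic` is only sound here because every constructor is constant and the input bytes are known to be in range.

```ocaml
(* Hypothetical opcode set: constant constructors only, so each one is
   represented as the immediate integer 0, 1, 2 in declaration order. *)
type opcode = Op_push1 | Op_add | Op_halt

(* Assumes the byte is a valid opcode; an out-of-range byte would be
   undefined behaviour, which is why this relies on trusted bytecode. *)
let[@inline] opcode_of_byte (c : char) : opcode = Obj.magic (Char.code c)

type vm = { stack : int array; mutable sp : int }

(* Hot-path helpers: inlined, and without bounds checks. *)
let[@inline] push vm v =
  Array.unsafe_set vm.stack vm.sp v;
  vm.sp <- vm.sp + 1

let[@inline] pop vm =
  vm.sp <- vm.sp - 1;
  Array.unsafe_get vm.stack vm.sp

let run (code : bytes) : int =
  let vm = { stack = Array.make 16 0; sp = 0 } in
  let rec loop ip =
    (* The match is still exhaustive over [opcode]. *)
    match opcode_of_byte (Bytes.unsafe_get code ip) with
    | Op_push1 ->
      push vm 1;
      loop (ip + 1)
    | Op_add ->
      let b = pop vm in
      let a = pop vm in
      push vm (a + b);
      loop (ip + 1)
    | Op_halt -> pop vm
  in
  loop 0

(* push1; push1; add; halt *)
let answer = run (Bytes.of_string "\000\000\001\002")
```

Adding a new constructor to `opcode` would make the compiler warn about the non-exhaustive `match`, which is the benefit of dispatching on the variant rather than on raw integers.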
107+
108+
### Why OCaml's tree-walker is so fast
109+
110+
OCaml's interpreter is ~1.8x faster than Go's because:
111+
- Pattern matching compiles to efficient jump tables
112+
- Algebraic data types have no runtime type assertions
113+
- The GC is optimized for functional allocation patterns
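
To make the first two points concrete, here is a minimal sketch of a tree-walking evaluator over an algebraic data type. The `expr` type is a made-up toy AST, far smaller than Monkey's, and is only meant to show the shape of the code: a single `match` on the constructor tag, with no runtime type assertions.

```ocaml
(* Hypothetical toy AST, not the project's actual one. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr
  | If of expr * expr * expr  (* condition is "true" when non-zero *)

(* One match on the constructor dispatches to the right case. *)
let rec eval = function
  | Int n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | If (c, t, e) -> if eval c <> 0 then eval t else eval e

(* if 1 then 2 + 3 * 4 else 0 *)
let value = eval (If (Int 1, Add (Int 2, Mul (Int 3, Int 4)), Int 0))
```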
114+
115+
This means the VM's relative speedup over the interpreter is smaller in OCaml (~2.5x) than in Go (~3.1x), but in absolute terms both OCaml implementations are fast.
78116

79117
## TODO
80118

@@ -99,3 +137,28 @@ Monkey supports closures and first class functions. It would be interesting to a
99137
- [ ] pattern matching
100138

101139
With those in place, it would be a JS-looking language with an ML core 🤔
140+
141+
### Type Safety
142+
143+
Currently, out-of-bounds array access and missing hash keys return `null` (matching canonical Monkey). This is a footgun in dynamic languages.
144+
145+
When adding a static type system, consider:
146+
147+
1. **Option types**: `arr[i]` returns `Option<T>`, forces explicit `match`/`unwrap`
148+
2. **Dependent types**: Prove bounds at compile time (e.g., `Vec<T, N>` where index must be `< N`)
149+
3. **Refinement types**: `arr[i]` where `i : { n : Int | 0 <= n < len(arr) }`
150+
4. **Gradual typing**: Allow both safe (`arr.get(i) -> Option`) and unsafe (`arr[i] -> T`) with different syntax
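
For a feel of the first option, here is how the Option-returning style looks in OCaml itself. This is a sketch of the idea in the host language, not a design for Monkey's type system; `get_opt` and `describe` are hypothetical helpers.

```ocaml
(* Safe indexing: the caller cannot ignore the out-of-bounds case,
   because the result is an option rather than a bare value. *)
let get_opt (arr : 'a array) (i : int) : 'a option =
  if i >= 0 && i < Array.length arr then Some arr.(i) else None

(* The match forces both cases to be handled explicitly. *)
let describe (arr : int array) (i : int) : string =
  match get_opt arr i with
  | Some v -> Printf.sprintf "value %d" v
  | None -> "out of bounds"

let inside = describe [| 10; 20; 30 |] 1
let outside = describe [| 10; 20; 30 |] 5
```

Compared with returning `null`, the missing-element case shows up in the type, so forgetting to handle it becomes a compile-time warning instead of a runtime surprise.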
151+
152+
Interesting type system concepts to explore:
153+
- **Hindley-Milner** type inference (ML, Haskell)
154+
- **Bidirectional type checking** (modern approach, easier to implement)
155+
- **Algebraic data types** with exhaustiveness checking
156+
- **Row polymorphism** for extensible records/hashes
157+
- **Effect systems** for tracking errors, IO, etc.
158+
- **Linear/affine types** for resource management (Rust-style ownership)
159+
- **Dependent types** (Idris, Agda) - types that depend on values
160+
161+
Resources:
162+
- "Types and Programming Languages" (Pierce) - the bible
163+
- "Practical Foundations for Programming Languages" (Harper)
164+
- Bidirectional typing: https://arxiv.org/abs/1908.05839