- Introduction
- Quick Start (Ubuntu)
- Benchmark against
ocamlopt - User Guide
- Dependencies
- Feature Highlights
- Building from source
- Why is it called Oonta?
Oonta is a compiler front-end for a subset of the OCaml programming language: it generates LLVM intermediate representation (IR) from OCaml source code.
Oonta uses the JJIK parser generator and JLEK lexer generator to perform the parsing and lexing stages. For building the IR, Oonta does not depend on the LLVM API.
This project is still a work in progress, many OCaml features are not yet supported. For example, modules, classes and objects are not yet supported. Additionally, the runtime does not yet provide a garbage collection service. Checkout this section for a complete list of currently supported language features.
Note
One could argue that with the many OCaml key features currently missing, it is not appropriate to call it "a compiler for a subset of the OCaml language", since it is more akin to simpler dialects of the ML programming language family, such as SML or Caml. However, the implementation and syntax reference has always been OCaml. Moreover, the final goal is to support as much OCaml features possible.
Note
This project is part of the "Compiler Toys" project, originally meant as a learning exercise on Compilers.
Install Rust and LLVM (sudo apt install llvm), then:
cargo install oonta # installs the oonta command
wget https://github.com/fuad1502/oonta/releases/download/v0.2.0/liboonta_runtime-0.2.0-Linux.deb
sudo dpkg -i liboonta_runtime-0.2.0-Linux.deb # installs oonta runtime library
cat << EOF > main.ml
let rec factorial x = if x <= 1 then 1 else x * factorial (x - 1)
let () = print_int (factorial 5)
EOF
oonta --opt --exec main.ml
./main.out # prints `120`Two benchmarks are provided:
Merge sort (benchmark/merge_sort.ml):
type list =
| Empty
| Cat of (int * list)
let rec map f lst =
match lst with
| Empty -> ()
| Cat (x, l) ->
let () = f x in
map f l
let rec create_lst_aux n acc =
match n with
| 0 -> acc
| _ -> create_lst_aux (n - 1) (Cat (read_int (), acc))
let create_lst n = create_lst_aux n Empty
let rec split_aux lst acc_left acc_right =
match lst with
| Empty -> (acc_left, acc_right)
| Cat (head, rest) -> split_aux rest acc_right (Cat (head, acc_left))
let split lst = split_aux lst Empty Empty
let rec merge left right =
match (left, right) with
| Empty, right -> right
| lest, Empty -> lest
| Cat (left_head, left_rest), Cat (right_head, right_rest) ->
if left_head <= right_head then
Cat (left_head, merge left_rest right)
else
Cat (right_head, merge left right_rest)
let rec merge_sort lst =
match lst with
| Empty -> lst
| Cat (_, Empty) -> lst
| _ ->
let splitted_lst = split lst in
let left =
match splitted_lst with
| lst, _ -> lst
in
let right =
match splitted_lst with
| _, lst -> lst
in
merge (merge_sort left) (merge_sort right)
let n = read_int ()
let lst = create_lst n
let sorted_lst = merge_sort lst
let () = map print_int sorted_lstInsertion sort (benchmark/insertion_sort.ml):
type list =
| Empty
| Cat of (int * list)
let rec map f lst =
match lst with
| Empty -> ()
| Cat (x, l) ->
let () = f x in
map f l
let rec create_lst_aux n acc =
match n with
| 0 -> acc
| _ -> create_lst_aux (n - 1) (Cat (read_int (), acc))
let create_lst n = create_lst_aux n Empty
let rec insert elem lst =
match lst with
| Empty -> Cat (elem, lst)
| Cat (head, tail) ->
if elem <= head then
Cat (elem, lst)
else
Cat (head, insert elem tail)
let rec insertion_sort lst =
match lst with
| Empty -> lst
| Cat (head, tail) -> insert head (insertion_sort tail)
let n = read_int ()
let lst = create_lst n
let sorted_lst = insertion_sort lst
let () = map print_int sorted_lstAnd polymorphic list comparison (benchmark/polymorphic_compare.ml):
type 'a list =
| Empty
| Cat of ('a * 'a list)
let rec create_lst_aux n acc =
match n with
| 0 -> acc
| _ -> create_lst_aux (n - 1) (Cat (read_int (), acc))
let create_lst n = create_lst_aux n Empty
let compare lst_a lst_b =
if lst_a > lst_b then
print_int 1
else if lst_a = lst_b then
print_int 0
else
print_int 2
let n = read_int ()
let lst = create_lst n
let () = compare lst lstExecute the following command inside the repository root folder to run the benchmarks.
python3 benchmark/benchmark.py 1000000 0 # run the merge_sort.ml benchmark on a list with 1 million elements
python3 benchmark/benchmark.py 10000 1 # run the insertion_sort.ml benchmark on a list with 10000 elements
python3 benchmark/benchmark.py 1000000 2 # run the polymorphic_compare.ml benchmark on a list with 10000 elementsWarning
Currently, for running the merge sort benchmark on large inputs, increase the
stack size limit with ulimit -s unlimited
On my Ubuntu machine (AMD Ryzenβ’ 7 7700X Γ 16), with ocamlopt version 5.4.0
and LLVM version 20.1.8, the result of running the above benchmarks are as
follows:
Benchmarking: merge_sort.ml
Elapsed time (./benchmark/ocamlopt.out): 0.8747 seconds
Elapsed time (./benchmark/oonta.out): 0.7445 seconds
> 14.88% faster
Benchmarking: insertion_sort.ml
Elapsed time (./benchmark/ocamlopt.out): 0.1325 seconds
Elapsed time (./benchmark/oonta.out): 0.3481 seconds
> 262.68% slower
Benchmarking: polymorphic_compare.ml
Elapsed time (./benchmark/ocamlopt.out): 0.1026 seconds
Elapsed time (./benchmark/oonta.out): 0.0635 seconds
> 38.14% faster
oonta --helpWith no command line options given, the oonta command will simply generates
LLVM IR without requiring any runtime dependencies. However, it provides
--opt, -O, --compile / -c and --exec / -e options to optimize the
generated IR, compile the generated IR to an object code and executable,
respectively, which requires certain dependencies to function. Internally,
oonta will invoke the following commands when given those options:
# with --opt
opt -S -O3 -o <.ll file> <.ll file>
# with --compile
llc -O3 -relocation-model=pic --filetype=obj -o <output> <.ll file>
# with --exec
clang -o <output> <.o file> -loonta_runtimeOn Ubuntu, install the llvm package to make opt, llc, and clang
available.
sudo apt install llvmWarning
The generated IR uses opaque
pointers. If the LLVM version is
older than version 15, the --opt / --compile / --exec options might not
work. If you're on Debian/Ubuntu, see
https://apt.llvm.org/ for instructions on installing
other versions.
Additionally, with --exec, you need the Oonta runtime library:
liboonta_runtime.a. If you're running an x64 linux machine, you can obtain
the static library from the release
page. If not, you would need to
build the runtime from source by following the guide
below.
Note
Currently the runtime only provides memory allocation requests using bump allocation. Garbage collection service is not yet available.
Note
This list is by no means complete. Unsupported features were intentionally not fully expanded for brevity.
| Feature | Supported? |
|---|---|
| Basic data types | Integers and booleans |
| Type inference | Yes |
| Global definitions | Yes |
| Local definitions | Yes |
| First class functions | Yes |
| Closures | Yes |
| Partial applications | Yes |
| Recursive definitions | Yes |
| Parametric variants | Yes |
| Polymorphic functions | Yes |
| Anonymous functions | Yes |
| Conditionals | Yes |
| Pattern matching | Yes. However, no exhaustiveness check. |
| Mutually recursive definitions | No |
| Records | No |
| Imperative features | No |
| Exceptions | No |
| Lazy expressions | No |
| Module system | No |
| Classes and objects | No |
| Labeled arguments | No |
| Polymorphic variants | No |
Oonta handles (parameteric) polymorphic function through compile-time monomorphization. It therefore generates more optimized code, at the cost of code bloat. This also allows removing the need for pointer tag.
For example, for the following OCaml code:
let apply_on_greater f x y =
if x > y then
f x
else
f y
let () = apply_on_greater print_int 3 2It will monomorphize apply_on_greater with type (('a -> 'b) -> 'a -> 'a -> 'b) into ((int -> unit) -> int -> int -> unit) and generate the following
LLVM IR:
define ccc void @oonta.apply_on_greater.int.unit.fun(ptr %f, i64 %x, i64 %y, ptr %env) {
entry:
%r = icmp sgt i64 %x, %y
br i1 %r, label %then, label %else
then:
%r0 = load ptr, ptr %f
call ccc void %r0(i64 %x, ptr %f)
br label %follow
else:
%r1 = load ptr, ptr %f
call ccc void %r1(i64 %y, ptr %f)
br label %follow
follow:
ret void
}Use the --verbose / -v option to debug each compile phase.
cat << EOF > main.ml
let apply_on_greater f x y =
if x > y then
f x
else
f y
let () = apply_on_greater print_int 3 2
EOF
oonta --exec --opt -v main.ml=> Lexing & Parsing Start
=> Lexing & Parsing End (0 ms)
=> Build AST Start
apply_on_greater =
FunExpr
βββΈ parameters: [f, x, y]
βββΈ captures: []
βββΈ recursive: no
βββΈ body:
CondExpr
βββΈ condition:
β BinOpExpr
β βββΈ operator: >
β βββΈ lhs:
β β VarExpr ("x")
β βββΈ rhs:
β VarExpr ("y")
βββΈ then expr:
β ApplicationExpr
β βββΈ function:
β β VarExpr ("f")
β βββΈ binds:
β βββΈ (0)
β VarExpr ("x")
βββΈ else expr:
ApplicationExpr
βββΈ function:
β VarExpr ("f")
βββΈ binds:
βββΈ (0)
VarExpr ("y")
() =
ApplicationExpr
βββΈ function:
β VarExpr ("apply_on_greater")
βββΈ binds:
βββΈ (0)
β VarExpr ("print_int")
βββΈ (1)
β LiteralExpr (3)
βββΈ (2)
LiteralExpr (2)
=> Build AST End (0 ms)
=> Infer types Start
Top level bindings:
apply_on_greater: (('a -> 'b) -> 'a -> 'a -> 'b)
=> Infer types End (0 ms)
=> Currying expressions Start
=> Currying expressions End (0 ms)
=> Monomorphization Start
Monomorphing 'apply_on_greater' (('a -> 'b) -> 'a -> 'a -> 'b) into ((int -> unit) -> int -> int -> unit):
'a -> int
'b -> unit
=> Monomorphization End (0 ms)
=> Build LLVM module Start
=> Build LLVM module End (0 ms)
=> Write LLVM module Start
=> Write LLVM module End (0 ms)
=> Optimize LLVM IR Start
=> Optimize LLVM IR End (12 ms)
=> LLVM backend Start
=> LLVM backend End (37 ms)
Line 1|let x = foo 3
^--
Error: cannot infer expression type: Unbound value foo
Line 1|let rec f x = f
^
Error: cannot infer expression type: Cannot unify 'b with ('a -> 'b)
Line 1|let () = 1 + 2
^----
Error: cannot bind expression of type int to ()
- Install C++ build dependencies
sudo apt install build-essential cmake
- Install
cargotool
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh- Clone repository
git clone https://github.com/fuad1502/oonta.git- Build
liboonta_runtime.a
cmake -S runtime -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build
- Install
liboonta_runtime.a
sudo cmake --install build
- Build
oonta
cargo build
cargo testOonta, is based on the Indonesian word unta, which translates to "camel".
