
Runtime

Lennart Augustsson edited this page Apr 11, 2026 · 5 revisions

The runtime system is written in reasonably portable C and can be found in src/runtime/eval.c. It uses combinators for handling variables, and has primitive operations for built-in types and for executing IO operations. There is also a simple mark-scan garbage collector.

Runtime flags

Runtime flags are given between the flags +RTS and -RTS. The runtime decodes everything between those two markers; all other arguments are available to the running program.

  • -HSIZE set heap size to SIZE cells, can be suffixed by k, M, or G, default is 50M
  • -KSIZE set stack size to SIZE entries, can be suffixed by k, M, or G, default is 100k
  • -rFILE read combinators from FILE, instead of out.comb
  • -v be more verbose, flag can be repeated
  • -T generate profiling stats (if compiled with -T as well)
  • -B ring the bell on every GC
  • -oFILE just read the input, run garbage collection, and write out the resulting graph to FILE
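The SIZE suffixes accepted by -H and -K could be decoded roughly as follows. This is a minimal sketch, not the code from eval.c: the helper name parse_size is hypothetical, and it assumes decimal multipliers (k = 1000, M = 1000000, G = 1000000000), which this page does not specify; the real runtime may use different multipliers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: decode a SIZE argument such as "50M" or "100k".
 * Assumption: suffixes are decimal multipliers; the real runtime may differ.
 * Returns 0 on malformed input. */
uint64_t parse_size(const char *s)
{
    char *end;
    unsigned long long n = strtoull(s, &end, 10);

    if (end == s)
        return 0;                       /* no digits at all */
    switch (*end) {
    case 'k': n *= 1000ULL;       end++; break;
    case 'M': n *= 1000000ULL;    end++; break;
    case 'G': n *= 1000000000ULL; end++; break;
    default:  break;                    /* no suffix */
    }
    return *end == '\0' ? (uint64_t)n : 0;  /* reject trailing junk */
}
```

Under these assumptions, the default heap size 50M would decode to 50000000 cells and the default stack size 100k to 100000 entries.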

For example, bin/mhseval +RTS -H1M -v -RTS hello runs out.comb with the heap set to 1M cells and verbose output, and the program itself receives the argument hello.
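The split between runtime flags and program arguments can be pictured with the sketch below. It is an illustration under assumptions, not the actual logic in eval.c: the function name split_rts_args is hypothetical, argv[0] is assumed to be passed through to the program, and a stray -RTS outside a runtime section is simply dropped.

```c
#include <string.h>

/* Hypothetical sketch: split argv into runtime flags and program arguments.
 * Arguments between "+RTS" and "-RTS" go to rts[]; all others, including
 * argv[0], go to prog[]. Both output arrays need room for argc entries. */
void split_rts_args(int argc, char **argv,
                    char **rts, int *nrts,
                    char **prog, int *nprog)
{
    int in_rts = 0;

    *nrts = 0;
    *nprog = 0;
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "+RTS") == 0) {
            in_rts = 1;                     /* start of runtime flags */
        } else if (strcmp(argv[i], "-RTS") == 0) {
            in_rts = 0;                     /* end of runtime flags */
        } else if (in_rts) {
            rts[(*nrts)++] = argv[i];       /* decoded by the runtime */
        } else {
            prog[(*nprog)++] = argv[i];     /* visible to the program */
        }
    }
}
```

Applied to the command line above, rts would hold -H1M and -v, while prog would hold bin/mhseval and hello.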

Types

There are some primitive data types, e.g., Int, IO, Ptr, and Double. These are known by the runtime system and various primitive operations work on them. The function type, ->, is (of course) also built in.

All other types are defined within the language. They are converted to lambda terms using an encoding. Types with few constructors (< 5) use Scott encoding; all other types are represented as a pair of an integer tag and a (Scott encoded) tuple holding all the arguments. The runtime system knows how lists and booleans are encoded.

Using GMP for Integer

The default implementation of the Integer type is written in Haskell and is quite slow. It is possible to use the GMP library instead. To use GMP you need to uncomment the first few lines in the Makefile, and also modify the definition that directs the C compiler where to find GMP.

NOTE: To switch between using and not using GMP you need to run make clean. You might also need to run make USECPPHS=cpphs bootstrapcpphs if there are complaints.

FFI

MicroHs supports calling C functions. When running the program directly (using -r) or when generating a .comb file, only the functions in the table built into src/runtime/eval.c can be used. When generating a .c file or an executable, any C function can be called.

Compared to GHC, a lot of FFI functionality is missing.

Serialization

The runtime system can serialize and deserialize any expression and keep its graph structure (sharing and cycles). The only exceptions to this are C pointers (e.g., file handles), which cannot be serialized (except for stdin, stdout, and stderr).

Memory layout

Memory allocation is based on cells. Each cell has room for two pointers (i.e., two words, typically 16 bytes), so it can represent an application node. One bit is used to indicate if the cell is an application or something else. If it is something else, one word is a tag indicating what it is, e.g., a combinator or an integer. The second word is then used to store any payload, e.g., the number itself for an integer node.
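The cell layout described above might look like the following sketch. The names (cell, make_app, make_int, and so on), the choice of the low bit of the first word as the application/non-application flag, and the tag values are all assumptions made for illustration; the definitions in eval.c may differ.

```c
#include <stdint.h>

/* Hypothetical cell layout: two machine words per cell. */
typedef struct cell {
    uintptr_t w0;   /* function pointer for an application, else a tag word */
    uintptr_t w1;   /* argument pointer for an application, else the payload */
} cell;

/* Assumption: the low bit of the first word is the flag. Cells are word
 * aligned, so a real pointer always has bit 0 clear; tags are odd. */
enum { TAG_INT = 1, TAG_COMB = 3 };     /* illustrative tag values */

static int is_app(const cell *c) { return (c->w0 & 1) == 0; }

static void make_app(cell *c, cell *fun, cell *arg)
{
    c->w0 = (uintptr_t)fun;
    c->w1 = (uintptr_t)arg;
}

static void make_int(cell *c, intptr_t value)
{
    c->w0 = TAG_INT;                    /* tag says: this is an integer */
    c->w1 = (uintptr_t)value;           /* payload is the number itself */
}

static intptr_t int_value(const cell *c) { return (intptr_t)c->w1; }
static cell *app_fun(const cell *c) { return (cell *)c->w0; }
static cell *app_arg(const cell *c) { return (cell *)c->w1; }
```

With this layout, sizeof(cell) is two words, which is 16 bytes on a typical 64-bit platform.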

Memory allocation has a bitmap with one bit per cell. Allocating a cell consists of finding the next free cell using the bitmap, and then marking it as used. The garbage collector first clears the bitmap and then (recursively) marks every used cell in the bitmap. There is no explicit scan phase since that is baked into the allocation. Allocation is fast, assuming the CPU has some kind of FindFirstSet instruction.
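The interplay between the bitmap, allocation, and marking can be sketched as below. This is a simplified model, not the allocator from eval.c: the heap is tiny, every cell is treated as having two child pointers, all names are hypothetical, and a portable loop stands in for the find-first-set instruction.

```c
#include <stdint.h>
#include <string.h>

#define HEAP_CELLS 256                  /* tiny heap, for illustration only */
#define BITS 64
#define NWORDS (HEAP_CELLS / BITS)

typedef struct cell { struct cell *fun, *arg; } cell;  /* NULL = no child */

static cell heap[HEAP_CELLS];
static uint64_t used[NWORDS];           /* one bit per cell, 1 = in use */
static size_t next_word;                /* where the allocator resumes */

/* Portable stand-in for a find-first-set instruction:
 * index of the lowest set bit of a nonzero word. */
static int lowest_set_bit(uint64_t w)
{
    int i = 0;
    while ((w & 1) == 0) { w >>= 1; i++; }
    return i;
}

/* Allocate one cell, or return NULL when the heap is full (time to GC).
 * Skipping over used cells here plays the role of the scan phase. */
cell *alloc_cell(void)
{
    for (; next_word < NWORDS; next_word++) {
        uint64_t free_bits = ~used[next_word];
        if (free_bits != 0) {
            int b = lowest_set_bit(free_bits);
            cell *c = &heap[next_word * BITS + b];
            used[next_word] |= (uint64_t)1 << b;   /* mark as used */
            c->fun = c->arg = NULL;
            return c;
        }
    }
    return NULL;
}

/* Mark phase: set the bit of every cell reachable from c. */
static void mark(cell *c)
{
    while (c != NULL) {
        size_t i = (size_t)(c - heap);
        uint64_t bit = (uint64_t)1 << (i % BITS);
        if (used[i / BITS] & bit)
            return;                     /* already marked; also stops cycles */
        used[i / BITS] |= bit;
        mark(c->fun);                   /* recurse on one child ... */
        c = c->arg;                     /* ... and loop on the other */
    }
}

/* Collect: clear the bitmap, re-mark from the roots, restart the scan. */
void collect(cell **roots, size_t nroots)
{
    memset(used, 0, sizeof used);
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    next_word = 0;
}
```

After collect, any cell whose bit is still clear is implicitly free, so the next calls to alloc_cell hand those cells out again without a separate pass over the heap.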

It is possible to use smaller cells by using 32-bit "pointers" instead of 64-bit pointers. This has a performance penalty, though.
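One way to picture 32-bit "pointers" is as indices into the cell array, as in the sketch below. The names are hypothetical and the scheme is an assumption about how such references could work, not a description of the build option in the runtime. The extra index-to-address conversion on every dereference is one plausible source of the performance penalty.

```c
#include <stdint.h>

/* Hypothetical compact cell: two 32-bit heap indices instead of two
 * 64-bit pointers, halving the cell size to 8 bytes. */
typedef uint32_t cellref;
typedef struct { cellref fun, arg; } small_cell;

#define SMALL_HEAP_CELLS 1024           /* illustrative heap size */
static small_cell small_heap[SMALL_HEAP_CELLS];

/* Every dereference converts an index to an address (base + index * 8). */
static small_cell *deref(cellref r) { return &small_heap[r]; }

/* The reverse conversion, needed when storing a reference to a cell. */
static cellref ref_of(const small_cell *c)
{
    return (cellref)(c - small_heap);
}
```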

Portability

The C code for the evaluator does not use any special features, and should be portable to many platforms. It has mostly been tested on macOS and Linux, and somewhat on Windows.

The code has been tested on 64- and 32-bit little-endian platforms, and on 64-bit big-endian.

The src/runtime/ directory contains configuration files for different platforms. Use the appropriate src/runtime/platform directory.
