SVML Specification

angelsl edited this page Feb 26, 2020 · 21 revisions

Source Virtual Machine Language

This page holds the preliminary specification of a virtual machine code (bytecode) format for virtual machine implementations of Source.

(click on the link)

Program representations

There are two standard representations of an SVML program. VM implementations are free to accept whichever representation works best for them.

Assembly code format

TODO

The assembly code consists of an array of arrays. Each element array represents one instruction. Each instruction has the opcode in position 0, followed by the arguments, which might include numbers, boolean values or strings, depending on the instruction.
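
As a rough illustration of the array-of-arrays shape, the sketch below writes a two-instruction program as a Python list of lists. This page does not say whether the opcode in position 0 is a mnemonic string or a numeric code, so the mnemonic strings used here are an assumption, and `split_instruction` is a hypothetical helper.

```python
# Hypothetical assembly-form program: one inner list per instruction.
# Position 0 holds the opcode (written as a mnemonic string, which is an
# assumption); the remaining positions hold the arguments.
program = [
    ["ldc.i", 123],  # opcode followed by one numeric argument
    ["pop"],         # opcode with no arguments
]


def split_instruction(instruction):
    """Return (opcode, arguments) for one element array."""
    return instruction[0], instruction[1:]


for instruction in program:
    opcode, args = split_instruction(instruction)
    print(opcode, args)
```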

Bytecode format

TODO

Header:

  • Magic word: 0x5005ACAD
  • Version number: 2 bytes for minor, 2 bytes for major
  • Constant pool count
  • Constant pool
  • Code
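
The fixed part of the header could be packed and checked roughly as follows. This is a sketch under assumptions, not the defined layout: the list above does not give the width of the constant pool count or the byte order of the header fields, so a `u16` count and little-endian order are assumed, and `pack_header` and `unpack_header` are hypothetical names.

```python
import struct

MAGIC = 0x5005ACAD  # magic word from the header description

# "<" = little-endian (assumed). I = u32 magic, H = u16 minor version,
# H = u16 major version, H = u16 constant pool count (width assumed).
HEADER = struct.Struct("<IHHH")


def pack_header(major, minor, constant_count):
    """Serialize the fixed header fields that precede the constant pool."""
    return HEADER.pack(MAGIC, minor, major, constant_count)


def unpack_header(data):
    """Parse the fixed header fields and reject data with the wrong magic word."""
    magic, minor, major, constant_count = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not an SVML program: bad magic word")
    return major, minor, constant_count
```

The constant pool entries and the code bytes would follow these fields in the order listed above.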

The code is a sequence of bytes in which each instruction occupies a contiguous segment. The first byte of a segment is the opcode, and the following bytes, if any, are the arguments.

  • Instructions are byte-aligned.
  • All instruction opcodes are one byte long.
  • All operands are in target device endianness.
  • We use the integer and float type names from Rust to denote operand types in instruction entries.
    • E.g. u8 refers to an 8-bit unsigned integer; i32 refers to a 32-bit signed integer; f32 refers to a 32-bit (single-precision) floating point.
  • An address is a 32-bit unsigned integer u32 that refers to an offset from the start of the program.
  • An offset is a 32-bit signed integer i32 that refers to an offset from the start of the next instruction.
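
The difference between an address and an offset can be sketched as a small calculation: an address is used as-is, while an offset is added to the position of the instruction that follows the current one. The function name and parameters below are hypothetical, chosen only for illustration.

```python
def resolve_offset(instruction_address, instruction_length, offset):
    """Turn a relative offset into an absolute address.

    instruction_address: byte position of the current instruction's opcode.
    instruction_length: total size of the current instruction in bytes.
    offset: signed value counted from the start of the next instruction.
    """
    next_instruction = instruction_address + instruction_length
    target = next_instruction + offset
    # An address must fit in an unsigned 32-bit integer.
    if not 0 <= target <= 0xFFFFFFFF:
        raise ValueError("target outside the u32 address range")
    return target
```

For instance, a 5-byte instruction at address 10 with an offset of 0 resolves to address 15, the instruction immediately after it.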

Instructions should be concatenated with no padding, either between instructions or between an opcode and its operands. Operands should be encoded in target device endianness. For example, the following instructions

ldc.i 123
pop

should result in the following (hex) bytes when targeting a little-endian device:

01 7B 00 00 00 08
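
The encoding of this example can be reproduced with a minimal assembler sketch. The two opcode values (0x01 for ldc.i, 0x08 for pop) and the i32 operand of ldc.i are read off the bytes above; the `OPCODES` table and the `assemble` function are hypothetical and cover only these two instructions.

```python
import struct

# Opcode values and operand layouts inferred from the example bytes;
# this table is illustrative and not the full instruction set.
OPCODES = {
    "ldc.i": (0x01, "i"),  # one i32 operand
    "pop": (0x08, ""),     # no operands
}


def assemble(instructions, byte_order="<"):
    """Concatenate opcodes and operands with no padding.

    byte_order is a struct prefix: "<" for little-endian, ">" for big-endian.
    """
    out = bytearray()
    for mnemonic, *operands in instructions:
        opcode, operand_format = OPCODES[mnemonic]
        out.append(opcode)
        out += struct.pack(byte_order + operand_format, *operands)
    return bytes(out)


code = assemble([("ldc.i", 123), ("pop",)])
print(code.hex(" ").upper())  # prints "01 7B 00 00 00 08"
```

Passing `byte_order=">"` would instead emit the operand as 00 00 00 7B, which is what a big-endian target would expect.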

For comparison, see the Java class file format: https://en.wikipedia.org/wiki/Java_class_file

Constant pool

Each constant pool entry has:

  • 1 byte: Type of constant pool entry
  • 2 bytes: Length of constant pool entry in bytes (including the 1-byte type and the 2-byte length fields)
  • remaining bytes: data of constant pool entry (for example, the string, in Unicode)
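
The entry layout above could be read and written roughly as in the following sketch. Several details are assumptions because this page does not fix them: the length field is treated as a little-endian u16, the type tag value 1 for strings is made up for the example, and the string is encoded as UTF-8. `pack_entry` and `unpack_entry` are hypothetical names.

```python
import struct

# u8 type followed by u16 length; little-endian is an assumption.
ENTRY_HEADER = struct.Struct("<BH")

STRING_TYPE = 1  # hypothetical type tag, not taken from the specification


def pack_entry(entry_type, data):
    """Serialize one entry; the length counts the 3 header bytes plus the data."""
    length = ENTRY_HEADER.size + len(data)
    return ENTRY_HEADER.pack(entry_type, length) + data


def unpack_entry(buf, pos=0):
    """Return (type, data, position of the next entry)."""
    entry_type, length = ENTRY_HEADER.unpack_from(buf, pos)
    data = buf[pos + ENTRY_HEADER.size : pos + length]
    return entry_type, data, pos + length


entry = pack_entry(STRING_TYPE, "hi".encode("utf-8"))
```

Because the length includes the header bytes, a reader can skip an entry of unknown type by jumping `length` bytes from the start of the entry.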