
Add macros #9

@NewDefectus


Macros should be supported in both AT&T and Intel (NASM-like) syntax.

Syntax

There are two types of macros: multiline and inline. Inline macros are supported only in Intel syntax, through the %define directive. A multiline macro is opened with the .macro directive in AT&T or %macro in Intel, and closed with .endm or %endmacro respectively.

All macros may take a list of parameters (each argument being an array of tokens). How the parameters are declared and referenced differs across the three cases:

  • Multiline in AT&T: Parameters are assigned a single-token name, and within the macro definition are referenced with the prefix \. For example:
.macro genByte nibble1, nibble2
.byte \nibble1 | \nibble2 << 4
.endm

genByte 4, 5    # 0x54
  • Multiline in Intel: The macro declares how many parameters it takes, and each parameter is referenced by its 1-based index with the prefix % (note that the next function may need some awareness of context to tell the modulo expression 5%3 apart from 5 followed by the parameter reference %3, i.e. 5 (%3)). For example:
%macro genByte 2
db %1 | %2 << 4
%endmacro

genByte 4, 5    # 0x54
  • Inline in Intel: Parameters are assigned a single-token name, and are accessed simply by referencing their name, no prefix needed. For example:
%define genByte(nibble1, nibble2) db nibble1 | nibble2 << 4

genByte(4, 5)    # 0x54
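To make the three referencing schemes concrete, here is a rough Python sketch of the substitution step for each case. It works on plain strings rather than token arrays to keep it short, and all function names are made up for illustration. The lookbehind used for %N is only a naive heuristic for the 5%3 problem; a real tokenizer would more likely need to know whether it is expecting an operand or an operator.

```python
import re

def substitute_att(body: str, params: list[str], args: list[str]) -> str:
    """AT&T multiline: parameters are referenced as \\name."""
    table = dict(zip(params, args))
    # Unknown names are left untouched.
    return re.sub(r"\\(\w+)", lambda m: table.get(m.group(1), m.group(0)), body)

def substitute_intel_multiline(body: str, args: list[str]) -> str:
    """Intel multiline: parameters are referenced as %1, %2, ... (1-based)."""
    def replace(m: re.Match) -> str:
        index = int(m.group(1))
        return args[index - 1] if 1 <= index <= len(args) else m.group(0)
    # Naive heuristic (an assumption, not part of the proposal): %N directly
    # after an operand character is treated as modulo, so "5%3" is left alone.
    return re.sub(r"(?<![\w)])%(\d+)", replace, body)

def substitute_inline(body: str, params: list[str], args: list[str]) -> str:
    """Intel inline (%define): parameters are referenced by bare name."""
    table = dict(zip(params, args))
    return re.sub(r"\b\w+\b", lambda m: table.get(m.group(0), m.group(0)), body)

print(substitute_att(r".byte \nibble1 | \nibble2 << 4", ["nibble1", "nibble2"], ["4", "5"]))
print(substitute_intel_multiline("db %1 | %2 << 4", ["4", "5"]))
print(substitute_inline("db nibble1 | nibble2 << 4", ["nibble1", "nibble2"], ["4", "5"]))
```

All three calls expand to the same expression, 4 | 5 << 4. Since the shift binds tighter than the bitwise OR, this evaluates to 0x54: nibble1 ends up in the low nibble and nibble2 in the high nibble.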

Behavior

Definition

When a macro is defined, the assembler should record all the tokens up to the closing directive (.endm or %endmacro, or a newline for inline macros) and store them in a map. This map should behave similarly to the symbols map: whenever the parser reads a token, that token should be recorded as a reference to the macro of the same name, whether or not such a macro exists yet. If the macro is later defined or redefined, all lines containing references to it should be recompiled entirely (i.e. from the source). All newlines stored in the macro should be replaced with semicolons so that the byte output of a macro corresponds to a single line.
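A minimal sketch of what such a map could look like, in Python for brevity. The names Macro, MacroTable, record_reference and define are hypothetical, and lines are identified by plain integers here, which is an assumption about how the assembler tracks them.

```python
from dataclasses import dataclass, field

@dataclass
class Macro:
    params: list[str]
    tokens: list[str]

@dataclass
class MacroTable:
    # name -> definition
    macros: dict[str, Macro] = field(default_factory=dict)
    # name -> lines that mention the name (recorded even before it is defined)
    references: dict[str, set[int]] = field(default_factory=dict)

    def record_reference(self, name: str, line: int) -> None:
        """Called for every token the parser reads."""
        self.references.setdefault(name, set()).add(line)

    def define(self, name: str, params: list[str], body: list[str]) -> set[int]:
        """Store (or overwrite) a macro; return the lines that need recompiling."""
        # Newlines become semicolons so an expansion maps to a single line.
        tokens = [";" if token == "\n" else token for token in body]
        self.macros[name] = Macro(params, tokens)
        return set(self.references.get(name, set()))

table = MacroTable()
table.record_reference("genByte", 7)   # line 7 uses genByte before it exists
stale = table.define("genByte", ["n"], [".byte", "\\n", "\n", "nop"])
print(stale)                           # lines to recompile from source
print(table.macros["genByte"].tokens)
```

Returning the stale lines from define leaves it up to the caller to decide when to recompile them, which seems to match how symbol redefinitions would be handled.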

Macro frames

When a macro is used, the parser should create a "macro frame" state; in this state, the parser should read the tokens from the appropriate macro, replacing them with the arguments given as necessary (each macro frame has its own dictionary of parameters). During this state, the parser will not update the currRange variable; it will remain fixed on the macro reference itself. Any parser error thrown within a macro should cause all macro frames to immediately be destroyed, and the parser should continue from the non-macro source (this is to prevent multiple errors piling up on the same macro).
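The frame mechanism might be sketched roughly as follows (Python, simplified). TokenReader, MacroFrame, enter_macro and on_error are made-up names, curr_range is a stand-in for currRange that just holds the index of the last source token, and keying the parameter dictionary by the prefixed token (e.g. \n1) is an assumption.

```python
from dataclasses import dataclass

@dataclass
class MacroFrame:
    tokens: list[str]            # body of the macro being expanded
    args: dict[str, list[str]]   # this frame's own parameter dictionary
    pos: int = 0

class TokenReader:
    def __init__(self, source_tokens: list[str]):
        self.source = source_tokens
        self.source_pos = 0
        self.frames: list[MacroFrame] = []
        self.curr_range = None   # only updated when reading from the source

    def enter_macro(self, body: list[str], args: dict[str, list[str]]) -> None:
        self.frames.append(MacroFrame(body, args))

    def next(self):
        # Tokens come from the innermost frame first.
        while self.frames:
            frame = self.frames[-1]
            if frame.pos >= len(frame.tokens):
                self.frames.pop()
                continue
            token = frame.tokens[frame.pos]
            frame.pos += 1
            if token in frame.args:
                # A parameter reference: read the argument's tokens next.
                self.frames.append(MacroFrame(frame.args[token], {}))
                continue
            return token
        if self.source_pos >= len(self.source):
            return None
        token = self.source[self.source_pos]
        self.curr_range = self.source_pos
        self.source_pos += 1
        return token

    def on_error(self) -> None:
        # Drop every frame so parsing resumes from the non-macro source.
        self.frames.clear()

reader = TokenReader(["genByte", "nop"])
reader.next()  # reads "genByte" from the source
reader.enter_macro([".byte", "\\n1", "|", "\\n2"], {"\\n1": ["4"], "\\n2": ["5"]})
print([reader.next() for _ in range(4)], reader.curr_range)
```

While the four macro tokens are being read, curr_range stays on the macro reference; it only moves again once the reader falls back to the source.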

Recursion

Recursion should also be supported: if a macro references another macro (or itself), a new macro frame should be created for that macro, with its own parameter dictionary. .if/%if and .endif/%endif directives would be helpful here, since a self-referencing macro needs some condition in order to terminate (note that they should take in a constant, non-IP-relative expression in order to prevent paradoxes). There should be a limit on how many macro frames are allowed at any given moment (perhaps 100?); if this limit is reached, an error is thrown and all macro frames are destroyed.
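As a rough illustration of the frame limit, here is a simplified Python sketch that expands parameterless macros recursively. MAX_FRAMES, MacroDepthError and expand are hypothetical names, and 100 is just the tentative value floated above. Raising the exception unwinds every nested call, which plays the role of destroying all macro frames at once.

```python
MAX_FRAMES = 100  # tentative limit; the right value is still open

class MacroDepthError(Exception):
    """Raised when too many macro frames would be active at once."""

def expand(tokens: list[str], macros: dict[str, list[str]], depth: int = 0) -> list[str]:
    # depth counts the macro frames currently active (0 = plain source).
    if depth > MAX_FRAMES:
        raise MacroDepthError(f"more than {MAX_FRAMES} nested macro frames")
    output = []
    for token in tokens:
        if token in macros:
            # Referencing a macro (possibly itself) opens a new frame.
            output.extend(expand(macros[token], macros, depth + 1))
        else:
            output.append(token)
    return output

print(expand(["outer"], {"outer": ["nop", "inner"], "inner": ["ret"]}))

try:
    expand(["forever"], {"forever": ["nop", "forever"]})
except MacroDepthError as error:
    print("error:", error)
```

Without a conditional directive, a macro like forever can never terminate, so the limit is the only thing that stops it.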
