k816 is a fresh Rust implementation of a high-level assembler for the WDC 65816 microprocessor. It is not a port, rewrite, or continuation of any previous compiler. There is no legacy code in this Rust codebase and no need for compatibility layers with prior implementations.
Follow standard Rust conventions with rustfmt for formatting and cargo clippy for style guidance. Apply clippy suggestions to new and modified code. Example: crates/core/src/lib.rs uses clean module declarations and public exports.
- Default behavior: search the existing codebase for matching patterns, utilities, and functions before introducing new ones.
- Prefer extending or reusing existing implementations when behavior aligns, instead of creating parallel code paths.
- When touching related code, refactor opportunistically to reduce duplication if it can be done safely and clearly.
Multi-crate workspace for separation of concerns: core handles lexing/parsing/AST/HIR/semantics/lowering/encoding/emit (depends on eval and assets); eval for compile-time expression evaluation; fmt for pretty-printing; assets for data converters; isa65816 for instruction definitions; o65 for object format; link for RON-based linking. Design emphasizes high-level assembly with C-like syntax, compile-time evaluation, structured blocks, deterministic output, explicit far functions, and 24-bit addressing. No optimizer or linker-based far-calls. Reference: Cargo.toml.
Repository Layout:
vendor/contains the source code of a reference K65 compiler written in C. It is kept for reference purposes only and has no bearing on the Rust implementation.crates/contains the Rust workspace crates listed above.docs/contains documentation for k816's syntax and linker configuration.examples/contains example K65 programs.tests/golden/contains golden test fixtures for regression testing.
The compiler accepts two source styles that can be mixed in a single file:
- the native 65816 mnemonic syntax (
lda,inx,pha, ...) - the high-level K65-style syntax (
a = ...,x++,a!!, ...)
Equivalent native/HLA constructs produce equivalent machine output, but they stay distinct in AST:
- native mnemonics are represented as
Stmt::Instruction - HLA sugar is represented as
Stmt::Hla(...)variants (register assignments/transfers, stack shorthand, flow shorthand, one-letter operators, nop shorthand, chains, etc.)
Convergence happens in lowering (AST -> HIR), not in parsing.
Current frontend/backend flow:
Core compile pipeline in k816_core (see crates/core/src/driver.rs):
- Parse with warnings (
parse_with_warnings) into AST. - Expand evaluator-backed expression fragments (
eval_expand::expand_file). - Normalize HLA constructs into canonical HLA AST forms (
normalize_hla::normalize_file). - Run semantic analysis (
sema::analyze) to build validated symbol/function/const/var metadata. - Lower AST + semantic model into operation-level HIR (
lower::lower), translating HLA nodes and native instructions through the same instruction/mode validation path. - Optimize mode ops with
eliminate_dead_mode_opsandfold_mode_ops. - Emit relocatable object/chunk metadata via
emit_object.
Link/render orchestration (CLI, golden harness, and project build flows using k816_link):
- Link object(s) into canonical placed layout (symbols/placements/listing), format-agnostic.
- Render final container format (
raw/xex) only at output write time.
- AST is syntax-oriented: spans, comments, numeric literal formats, and source-level constructs for diagnostics/formatting are preserved.
- Semantic analysis resolves symbols, const values, variable layouts, and function mode contracts before lowering.
- HIR (
hir::Program) is operation-oriented: segments, labels, instructions, emitted bytes, mode ops, and relocatable byte fixups. - Width/mode-sensitive diagnostics (typed loads/stores,
:fardirect access rejection, etc.) are enforced during lowering for both native and HLA forms. - Core compile backend is single-path object/chunk emission (
emit_object-style). - Linker output is layout-first;
raw/xexare render targets selected at final write time, not compile backends. - Emission handles instruction selection/encoding and fixup materialization; linker placement and symbol resolution are handled in the
linkcrate from RON config. - Structured top-level blocks (
func,data, named data blocks) remain the units used by lowering and downstream placement/link flows.
- Build:
cargo build - Test:
cargo test - Golden tests:
cargo test -p k816-golden-testsfor reproducible output validation against fixtures. Uses Rust edition 2024 with dependencies likechumskyfor parsing,clapfor CLI. Reference: Cargo.toml. - Golden fixture binaries: Use tests/golden/link.stub.ld.ron as the linker config (
-Tflag) to generateexpected.binfiles in raw binary format for golden test fixtures. - Golden bless through
cargo test(feature-gated):- Regenerate all golden outputs:
cargo test -p k816-golden-tests --features golden-bless - Regenerate selected cases via test filter:
cargo test -p k816-golden-tests --features golden-bless syntax_doc_example_dli - Validate after blessing:
cargo test -p k816-golden-tests
- Regenerate all golden outputs:
- Golden bless CLI (secondary, script-friendly):
- Regenerate all golden outputs:
cargo run -p k816-golden-tests --bin bless -- - Regenerate only error fixtures (
expected.err):cargo run -p k816-golden-tests --bin bless -- --err-only - Regenerate a single fixture:
cargo run -p k816-golden-tests --bin bless -- --case fixture:<name> - List discoverable cases:
cargo run -p k816-golden-tests --bin bless -- --list
- Regenerate all golden outputs:
- Follow Test-Driven Development (TDD): write or update tests first, then implement behavior to satisfy them.
- Ensure implementations are thoroughly testable through
tests/golden/fixtures. - Cover positive desired paths, expected and handled errors, and edge cases in golden fixtures.
- No "compat", "legacy", or "deprecated" shims; features are first-class or removed.
segmentdirective only for output segments; nobankalias.- Error on unknown directives; no silent ignores.
- Syntax: C-style comments, variables with
var, constants in[...], labels (globallabel:, local.label:), numeric literals (decimal, hex$or0x, binary%or0b). Example: examples/hello_uart.k65 and docs/k65-syntax-reference.md.
Linker uses RON format for configuration (.ld.ron files). vendor/ contains reference C compiler for syntax validation only; no integration with Rust code. Reference: docs/linker-ron-format.md.
- Keep primary error messages concise and focused on what is wrong.
- Put remediation guidance in Ariadne help text using
Diagnostic::with_help(...). - Avoid embedding "how to fix" details directly in the main error message when help text can carry that context.
-
Direct-page auto-shrink with
:absoverrides was implemented for address operands. -
Implementation path:
- Syntax/reference docs:
docs/syntax-reference.md - AST additions for declaration/use-site address hints:
crates/core/src/ast/mod.rs - Parser support for
var name:absandexpr:...:abs:crates/core/src/parser/data.rs,crates/core/src/parser/data/vars.rs,crates/core/src/parser/expr.rs - Semantic propagation of var-level defaults:
crates/core/src/sema/model.rs,crates/core/src/sema/vars.rs - Lowering/immediate classification updates so
:abskeeps operands address-based:crates/core/src/lower.rs - HIR/object/encoder address-size hint flow:
crates/core/src/hir/mod.rs,crates/core/src/emit_object.rs,crates/isa65816/src/lib.rs - Pretty-printing/tests:
crates/fmt/src/print_ir.rs,tests/golden/fixtures/emit-addr-direct/
- Syntax/reference docs:
-
Semantics:
- Eligible page-0 direct operands now default to direct-page encodings when an opcode exists.
- Use-site
expr:absforces 16-bit absolute-family encoding and has highest precedence. - Declaration-level
var name:abs = ...makes plain references inherit that 16-bit absolute preference. :absis an address-encoding preference only; it does not change variable data width.
-
Constraints:
- Declaration-level
:absis currently supported only on top-levelvardeclarations. - Symbolic-subscript field declarations do not yet support
.field:...:abs. :farcontinues to mean 24-bit data width / long-address preference where already supported;:wordstill means data width only.
- Declaration-level
-
VS Code LM tools were added for Copilot agent mode:
k816_lookup_instructionk816_query_memory_map
-
Extension implementation path:
- Manifest/tool contributions and activation events:
editors/vscode-k816/package.json - Tool runtime registration and formatting:
editors/vscode-k816/src/extension.ts - Bundled instruction reference data:
editors/vscode-k816/resources/instructions-description.json - Extension docs:
editors/vscode-k816/README.md
- Manifest/tool contributions and activation events:
-
LSP implementation path:
- New custom request
k816/queryMemoryMap:crates/lsp/src/lib.rs - LSP request/response types:
QueryMemoryMapParams { memory_name?: string, detail?: "summary" | "runs" }QueryMemoryMapResult { status, reason?, memories, runs }
- Link state retention (
last_link_layout,last_link_error) and pending-change flush before query are incrates/lsp/src/lib.rs.
- New custom request
-
Constraints:
- Tools are read-only (no build/run side effects).
- Requires VS Code
^1.109.0and Copilot chat/agent availability. - Instruction lookup uses bundled static JSON (no external fetch).
- Memory map output depends on current in-memory LSP link state; returns
unavailablewith reason when link data is absent/failed.
-
LSP feature batch for extension features 7-17 was implemented (references/rename, completion/docs, semantic tokens, diagnostics freshness updates, inlay hints, hover metadata, code lens, folding ranges, signature help).
-
Implementation path:
- Core LSP capability wiring, handlers, and feature logic:
crates/lsp/src/lib.rs - Bundled instruction hover metadata (flags/cycles):
crates/lsp/resources/instruction-metadata.json - LSP integration coverage for new requests/capabilities:
tests/lsp_cli.rs - VS Code code-lens command wiring:
editors/vscode-k816/src/extension.ts,editors/vscode-k816/package.json
- Core LSP capability wiring, handlers, and feature logic:
-
Constraints:
- Rename is safe symbol-only (resolved canonical symbols only); no fuzzy text rename.
- References include declaration sites when requested by LSP context.
- Folding ranges are brace-structure based;
#if ... #endifpreprocessor folding is not implemented yet. - Signature help currently covers evaluator built-ins only (e.g.
sin,cos,clamp), not macro signatures. - Opcode hover cycle/flag details come from bundled metadata and are best-effort for covered instructions.
-
DAP debug enhancements were added for features 18/19/20/23 (disassembly, memory read/write, register view, inline debug values).
-
Implementation path:
- Extension DAP proxy + inline values provider:
editors/vscode-k816/src/extension.ts - Extension settings/docs updates:
editors/vscode-k816/package.json,editors/vscode-k816/README.md - LSP custom request for inline symbol resolution:
k816/resolveInlineSymbolsincrates/lsp/src/lib.rs - Emulator DAP/backend wiring (external repo):
/home/smoku/devel/X65/devel/emu/src/dap.cc,/home/smoku/devel/X65/devel/emu/src/common/webapi.h,/home/smoku/devel/X65/devel/emu/src/x65.c
- Extension DAP proxy + inline values provider:
-
Constraints:
- Scope intentionally excludes feature 21 (conditional breakpoints) and 22 (data breakpoints/watchpoints).
- Inline symbol memory reads are limited to
read_size_hint1 or 2; larger symbols currently render address-only inline text. - Inline values depend on a stopped
k816debug session and current LSP symbol/address state.