This document outlines the implementation plan for the Leanur project, which aims to compile Lean4 programs to Nock, a minimalist combinator calculus.
We already have a solid implementation of Nock in Leanur.Nock.lean, which defines:
namespace Leanur.Nock
abbrev Atom := Nat -- Base atomic values as natural numbers
inductive Noun -- The core Nock data type
| atom : Atom -> Noun
| cell : Noun -> Noun -> NounThese are sufficient for representing all Nock data.
For compilation, we need to handle Lean4's LCNF (Low-level Continuation-passing Normal Form), which is an intermediate representation used by the Lean4 compiler. The key types from Lean's LCNF include:
- Decl: Top-level declarations (functions, constants)
- FunDecl: Function declarations with parameters and body
- Code: Represents code blocks (let bindings, function calls, pattern matching)
- LetValue: Right-hand sides of let bindings (constants, function calls, projections)
- LetDecl: A binding (variable and its value)
- Alt: Branches in pattern matching
We'll define a monadic structure similar to Yatima's CodeGenM for our compilation process:
structure CompilerCtx where
env : Lean.Environment -- Lean environment containing declarations
overrides : Lean.NameMap Override -- Special handling for built-in ops
functionInfos : HashMap FunctionId FunctionInfo -- Function metadata
constructorInfos : HashMap Tag ConstructorInfo -- Constructor metadata
tempVarMap : HashMap Int TempRef -- Manages temporary variable stack
tempVarsNum : Int -- Counter for temporary variables
stackHeight : Int -- Current stack height for Nock compilation
structure CompilerState where
appendedBindings : Array (Lean.Name × Noun) -- Compiled bindings
visited : Lean.NameSet -- Visited declarations
replaced : Lean.NameMap Lean.Name -- Renamed declarations
abbrev CompilerM := ReaderT CompilerCtx (StateRefT CompilerState IO)-
mkDecl: Main entry point for compiling a declaration
def mkDecl (declName : Lean.Name) : CompilerM Noun
-
compile: Recursive function to convert LCNF code to Nock
def compile : Code → CompilerM (Noun)
-
mkLetValue: Handle let bindings (constants, function calls, projections)
def mkLetValue : LetValue → CompilerM Noun
-
mkFunDecl: Compile function declarations
def mkFunDecl : FunDecl → CompilerM Noun
-
mkCase: Compile pattern matching
def mkCase : Cases → CompilerM Noun
The heart of the Nock compilation strategy is stack-based since Nock has no built-in variable binding:
-
withTemp: Push a value onto the stack for temporary use
def withTemp : Noun → (TempRef → CompilerM Noun) → CompilerM Noun
-
withTempVar: Push and track a temporary variable
def withTempVar : Noun → (TempRef → CompilerM Noun) → CompilerM Noun
-
tempRefPath: Calculate the path to a temporary reference
def tempRefPath : TempRef → CompilerM Path
-
Function Representation: Represent closures as structured data with fields:
- FunCode: The function's code
- ArgsTuple: The arguments tuple
- ClosureRemainingArgsNum: Remaining argument count for partial application
- FunctionsLibrary: Available functions
-
callFunWithArgs: Function call mechanism
def callFunWithArgs : FunctionId → List Noun → CompilerM Noun
-
curryClosure: Support for partial application
def curryClosure : Noun → List Noun → Noun → CompilerM Noun
-
Constructor Representation: Encode constructors as Nock cells with tags
def mkConstructor : Tag → List Noun → CompilerM Noun
-
Memory Representation Selection: Choose appropriate representation styles:
- Standard constructors
- Tuples (single constructor types)
- Specialized representations for common types (List, Option, etc.)
Similar to Yatima's approach, we'll need overrides for basic operations:
def addNockPrelude (env : CompilerEnv) : CompilerEnv :=
-- Add standard overrides for primitive operations
let env := { env with
overrides := env.overrides.insert `Nat.add (.expr <special implementation>) }- Define Nock interpreter (already done)
- Implement core monad and data structures
- Create basic compiler infrastructure
- Implement simple expressions (constants, literals)
- Support for basic function calls
- Add let binding support
- Implement pattern matching (algebraic data types)
- Add support for closures and currying
- Handle recursive functions
- Add specialized representations for common types
- Implement standard library functions
- Add optimizations for better code generation
Low-level Continuation-passing Normal Form (LCNF) is Lean4's intermediate representation where:
- Functions are in continuation-passing style
- All intermediate values are named
- Control flow is explicit
- Pattern matching is broken down into simpler constructs
-
Stack-Based Model: Nock uses a subject-based model where all operations act on a "subject" (like a stack or environment)
-
Environment Building: The compiler constructs a subject containing:
- Function definitions
- Argument values
- Temporary values
-
Function Calls: Function calls in Nock require:
- Locating the function code
- Building an environment with arguments
- Applying the Nock call operator
-
Variable Access: Variables are accessed by path (location in the subject)
-
Pattern Matching: Implemented as a series of Nock equality tests and conditionals
This function is crucial as it manages the stack-based nature of Nock. It:
- Pushes a value onto the stack
- Increases the stack height
- Compiles the body with the updated context
- Returns a Nock expression that uses Nock's Push operator (8)
The recursive compilation function translates Lean's LCNF to Nock by:
- Pattern matching on the LCNF code structure
- Dispatching to specialized handlers for each construct
- Composing the results into valid Nock expressions
Implements the function calling convention by:
- Collecting and packaging arguments
- Locating the function in the library
- Building a new environment with the arguments
- Using Nock's call operator (9) to invoke the function
For a function like:
def add (x y : Nat) : Nat := x + yThe compilation process will:
- Translate to LCNF (handled by Lean)
- Compile the function body (addition)
- Create a closure representation
- Generate Nock code for the function call mechanism
- Produce a complete Nock program that can be interpreted
This implementation plan provides the foundation for building a complete Lean4 to Nock compiler, drawing on the techniques used in both Juvix and Yatima.