This document summarizes important conventions for contributing code or documentation to the Levanter repository. Follow these notes when implementing new features or fixing bugs.
- Get better. Whenever you discover something missing from these guidelines, or the requester suggests a better way to do something, please update this document. The goal is to make it easier for everyone to contribute and maintain the codebase. Generally speaking, you should add bullets or new sections. Be sure to do this when directed to. For example, if directed that you should never relax tolerances in floating point tests, add that to the list.
- Playbooks. Sometimes, there are repeatable tasks (e.g. porting models) for which we follow a standard set of steps. Please reference `.playbooks/` to see what playbooks are available, or see the list below. If you want to add a playbook, write a markdown doc named e.g. `.playbooks/port-models.md` and add a pointer to it in the list below.
- Make a plan before giving up. When a request seems too large to complete in one go, create a markdown file in `.agents/projects/` that outlines a step-by-step plan. Break the work into agent-sized checklist items. If later asked to continue the task, read the file, complete an item, and mark it off. If you can't finish a step, split it into smaller tasks and update the checklist accordingly.
- Porting a Model to Levanter: A guide for porting model architectures to the Levanter ecosystem using Haliax and Equinox.
- Python version: the project targets Python >=3.11.
- Formatting and Linting: We use `ruff` via `pre-commit`.
- Typing: the code base uses `mypy` for static type checking. `mypy` is run by pre-commit and the configuration is found in `pyproject.toml`.
- Run `pre-commit run --all-files` before committing. The CI workflows run the same checks.
- Doc Strings: All public functions, classes, and modules should have docstrings, unless their purpose is painfully obvious. Use Google style for consistency.
- Commenting: Use comments to explain why something is done a certain way, especially if it is not immediately obvious. Avoid commenting on every line of code; focus on the intent and purpose of complex logic. Demarcating logical groups of code with comments is encouraged, unless it is better to refactor the code into smaller functions or classes.
- MkDocs: We use MkDocs for documentation. The main documentation is in the `docs` directory. Use Markdown for writing docs, and follow the existing structure. When linking to symbols, prefer MkDocs-style links (e.g. `[full.path.object2][]`, or with a custom title: `[Object 1][full.path.object1]`).
- Documentation: When adding new features, ensure that the documentation is updated accordingly. This includes updating the MkDocs files and any relevant docstrings. If you add a new module or significant functionality, consider adding a dedicated section in the documentation.
- Tests are executed with `pytest`. The default workflow runs `uv run pytest tests -m "not entry and not slow and not ray"`.
- In general, never relax tolerances in floating point tests unless specifically discussed with the team. Use `assert_allclose` with appropriate tolerances for numerical comparisons. We typically use 1e-4 for more complex modules, and 1e-5 for simpler ones.
- Tests should be reasonably fast. Mark long-running tests with `@pytest.mark.slow` so they are excluded from the default suite.
- Always mark tests that depend on PyTorch with `@skip_if_no_torch` to ensure they are skipped when PyTorch is not available. This is particularly important for tests that require PyTorch-specific functionality.
- CI Best Practice: Use `astral-sh/setup-uv` to install `uv` in workflows and run `uv python install` before installing dependencies with `uv sync` or `uv pip`. This ensures the expected Python version is available during testing.
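The tolerance and marker conventions above might look like this in a test module. This is a minimal sketch: `softmax` and both test names are hypothetical, not code from the repository, and `numpy.testing.assert_allclose` stands in for whichever `assert_allclose` helper the test suite actually imports.

```python
import numpy as np
import pytest
from numpy.testing import assert_allclose


def softmax(x: np.ndarray) -> np.ndarray:
    """Hypothetical function under test: numerically stable softmax over the last axis."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def test_softmax_matches_reference():
    x = np.array([[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]])
    # Reference computed directly from the definition.
    expected = np.exp(x) / np.exp(x).sum(axis=-1, keepdims=True)
    # Simple module: keep the tighter 1e-5 tolerance instead of loosening it.
    assert_allclose(softmax(x), expected, rtol=1e-5, atol=1e-5)


@pytest.mark.slow  # deselected by the default `-m "not entry and not slow and not ray"` filter
def test_softmax_rows_sum_to_one_large():
    rng = np.random.default_rng(0)
    x = rng.normal(size=(512, 512))
    assert_allclose(softmax(x).sum(axis=-1), np.ones(512), rtol=1e-5, atol=1e-5)
```

If a test like this starts failing, the guideline is to investigate the numerical difference rather than widen `rtol` or `atol`.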
- Named tensors: Levanter relies heavily on the Haliax library. Arrays are usually represented by `NamedArray` with explicit `Axis` objects. Prefer writing operations over named axes rather than positional dimensions.
- Generic code: many utilities are written with Python generics and dataclasses. Where possible, write reusable functions or classes that operate over TypeVars instead of hard coding concrete types.
- Configurations: configuration files are dataclasses loaded via `draccus`. Keep configs declarative and typed.
- Datasets: datasets are represented as `AsyncDataset` or `SyncDataset` in `levanter.data.dataset`. When creating new data pipelines, prefer asynchronous versions and support slicing, shuffling and mapping operations. In general, Async is preferred over Sync.
- Logging and tracking: metrics and performance stats are logged via tracker hooks (e.g. WandB or TensorBoard). Use the existing callback/hook framework instead of ad-hoc logging.
- Reproducibility: Levanter aims for deterministic training where possible. Avoid sources of nondeterminism unless explicitly required.
- Prefer `Stacked` with `fold` or `scan` over writing custom loops, for better compile times and gradient checkpointing support.
- Avoid data-dependent Python control flow inside jitted code.
- Do not rely on dynamic shapes.
- Do not use dynamic lengths when indexing.
- Use `debug.print` if you need to inspect values.
- Use jit-safe versions of `jnp.where`, `hax.where`, or similar operations where the number of returns depends on data.
Any method inside an `equinox.Module`, any function decorated with `jax.jit` or one of its variants (e.g. `eqx.filter_jit` or `hax.named_jit`), and any helpers they call must follow these jit-safety rules.
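As an illustration of these rules, here is a minimal sketch in plain JAX. It uses `jax.lax.scan` and `jax.debug.print` rather than Haliax's `Stacked`, and `clipped_cumsum` is a hypothetical function invented for this example, not part of Levanter.

```python
import jax
import jax.numpy as jnp


@jax.jit
def clipped_cumsum(x: jax.Array, limit: float) -> jax.Array:
    """Running sum over x in which each element's contribution is capped at `limit`."""
    # Three-argument `jnp.where` keeps the output shape static. The one-argument
    # form returns a data-dependent number of indices and fails under jit.
    capped = jnp.where(x > limit, limit, x)

    def step(carry, value):
        new_carry = carry + value
        # `jax.debug.print` reports runtime values from traced code; a plain
        # `print` would only show tracers at trace time.
        jax.debug.print("running total: {t}", t=new_carry)
        return new_carry, new_carry

    # `jax.lax.scan` replaces a Python `for` loop over the array elements.
    _, totals = jax.lax.scan(step, jnp.zeros((), dtype=x.dtype), capped)
    return totals


totals = clipped_cumsum(jnp.array([1.0, 5.0, 2.0]), 3.0)
```

Writing `if x[0] > limit:` inside the function would instead raise a tracer error, because the branch depends on a value that is unknown at trace time.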
- Use `NamedArray` and `Axis` for model parameters and activations.
- We use Equinox and Haliax, not Flax and not Haiku, for neural network layers and models.
- Prefer functional style JAX code with explicit PRNG keys.
- Avoid hard coding dataset paths; accept them via configuration.
- When extending the library, maintain compatibility with both GPU and TPU backends.
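The guidance on declarative, typed configs and on passing dataset paths through configuration could be sketched as follows. All names here (`DataConfig`, `TrainConfig`, `describe`) and the bucket path are hypothetical placeholders. In the repository such dataclasses are loaded via `draccus`; this sketch constructs them directly and does not call `draccus`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataConfig:
    """Hypothetical data settings; the path is supplied by the user, never hard-coded."""

    train_path: str
    seq_len: int = 1024
    shuffle: bool = True


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical top-level config composed of smaller typed configs."""

    data: DataConfig
    seed: int = 0
    num_steps: int = 1000


def describe(config: TrainConfig) -> str:
    """Library code reads the path from the config instead of embedding a literal."""
    return f"{config.num_steps} steps on {config.data.train_path} (seed={config.seed})"


# Placeholder path for illustration only.
config = TrainConfig(data=DataConfig(train_path="gs://example-bucket/train"), num_steps=10)
print(describe(config))
```

Keeping the config frozen and fully typed means the same object can be logged, compared, and reused across runs without hidden state.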