tt-lang


A Python-based Domain-Specific Language (DSL) for authoring high-performance custom kernels on Tenstorrent hardware. This project is under active development — see the functionality matrix for current simulator and compiler support.

1. Vision

TT-Lang joins the Tenstorrent software ecosystem as an expressive yet ergonomic middle ground between TT-NN and TT-Metalium, aiming to provide a unified entrypoint with integrated simulation, performance analysis, and AI-assisted development tooling.

(Figure: ecosystem graph)

The language is designed to support generative AI workflows and a robust tooling ecosystem. Using Python as the host language lets AI tools translate GPU DSL kernels (Triton, CUDA, cuTile, TileLang) to Tenstorrent hardware more reliably than direct translation to TT-Metalium, while tight integration with functional simulation will allow AI agents to propose kernel implementations, validate correctness, and iterate on configurations autonomously. Developers should be able to catch errors and performance issues in their IDE rather than on hardware, with a functional simulator surfacing bugs early. Line-by-line performance metrics and data-flow graphs can guide both programmers and AI agents to spot bottlenecks and optimization opportunities.

Tenstorrent developers today face a choice between TT-NN, which provides high-level operations that are straightforward to use but lack the expressivity needed for custom kernels, and TT-Metalium, which provides full hardware control through explicit low-level management of memory and compute. This is not a shortcoming of TT-Metalium: it is designed to be low-level and expressive, providing direct access to hardware primitives without abstraction overhead, and it serves its purpose well for developers who need that level of control. The problem is that there is no middle ground where the compiler handles what it does best (resource management, validation, optimization) while maintaining high expressivity for application-level concerns.

TT-Lang bridges this gap through progressive disclosure: simple kernels require minimal specification, with the compiler inferring compute API operations, NOC addressing, DST register allocation, and more from high-level abstractions, while complex kernels let developers open the hood and craft pipelining and synchronization details directly. The primary use case is kernel fusion for model deployment. Engineers porting models through TT-NN quickly encounter operations that need to be fused for performance, or patterns that TT-NN cannot express; today this requires rewriting in TT-Metalium, which takes weeks and demands undivided attention and hardware-debugging expertise. TT-Lang makes this transition fast and correct: a developer can take a sequence of TT-NN operations, express the fused equivalent with explicit control over intermediate results and memory layout, validate correctness through simulation, and integrate the result as a drop-in replacement in their TT-NN graph.
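To make the fusion payoff concrete, the sketch below uses plain Python (not tt-lang syntax) to show why fusing two element-wise operations matters: the unfused version materializes an intermediate buffer and makes an extra pass over the data, while the fused version never allocates it.

```python
def mul_then_add_unfused(a, b, c):
    # Two separate ops: the intermediate result t is materialized,
    # costing an extra buffer and an extra pass over the data.
    t = [x * y for x, y in zip(a, b)]
    return [ti + z for ti, z in zip(t, c)]


def mul_then_add_fused(a, b, c):
    # Fused equivalent: a single pass, no intermediate buffer.
    return [x * y + z for x, y, z in zip(a, b, c)]


# Both forms compute the same result; only the dataflow differs.
assert mul_then_add_unfused([1, 2], [3, 4], [5, 6]) == mul_then_add_fused([1, 2], [3, 4], [5, 6])
```

On real hardware the saving is in on-chip memory traffic rather than Python list overhead, but the structural difference is the same.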

2. Quick Start

The fastest way to try tt-lang is with the functional simulator, which runs kernels as pure Python — no hardware, no compiler build required:

git clone https://github.com/tenstorrent/tt-lang.git
cd tt-lang
cmake -G Ninja -B build -DTTLANG_SIM_ONLY=ON
source build/env/activate
ttlang-sim examples/eltwise_add.py

To compile and run kernels on Tenstorrent hardware, use a pre-built Docker image. Two images are available:

Image   Purpose                                  Can run tt-lang programs?   Can clone/build tt-lang?
dist    Run tt-lang programs                     Yes                         No
ird     Develop and build tt-lang from source    Yes                         Yes

Both images can be used with ird reserve (see container build docs for details).

2.1 dist Pre-built tt-lang (for users)

Image: ghcr.io/tenstorrent/tt-lang/tt-lang-dist-ubuntu-22-04:latest (all versions)

The dist image contains a single, fully built tt-lang installation in /opt/ttlang-toolchain. Use it to compile and run any tt-lang program without building any of the prerequisites.

⚠️ Important: Do not attempt to build tt-lang inside a dist container — it has no build toolchain. To clone and build tt-lang yourself, use the ird image instead.

Create the container (one-time):

docker run -d --name $USER-dist \
  --device=/dev/tenstorrent/0:/dev/tenstorrent/0 \
  -v /dev/hugepages:/dev/hugepages \
  -v /dev/hugepages-1G:/dev/hugepages-1G \
  -v $HOME:$HOME \
  ghcr.io/tenstorrent/tt-lang/tt-lang-dist-ubuntu-22-04:latest \
  sleep infinity

Open a shell:

docker exec -it $USER-dist /bin/bash

The environment activates automatically on login. Run an example immediately:

python /opt/ttlang-toolchain/examples/tutorial/multicore_grid_auto.py

To learn more, work through the tutorial, explore the programming guide for compiler options, debugging, and performance tools, or use Claude Code with the built-in slash commands to translate kernels, profile, and optimize.

2.2 ird Development image (for building tt-lang)

Image: ghcr.io/tenstorrent/tt-lang/tt-lang-ird-ubuntu-22-04:latest (all versions)

The ird image has the pre-built toolchain (LLVM, tt-metal, Python venv) but does not include tt-lang itself. Clone the repository and build against the toolchain. You can maintain multiple clones or branches side by side, each with its own build directory.

To use the image directly with Docker on your local Linux machine, first create a container (one-time):

docker run -d --name $USER-ird \
  --device=/dev/tenstorrent/0:/dev/tenstorrent/0 \
  -v /dev/hugepages:/dev/hugepages \
  -v /dev/hugepages-1G:/dev/hugepages-1G \
  -v $HOME:$HOME \
  -v $SSH_AUTH_SOCK:/ssh-agent -e SSH_AUTH_SOCK=/ssh-agent \
  ghcr.io/tenstorrent/tt-lang/tt-lang-ird-ubuntu-22-04:latest \
  sleep infinity

Open a shell:

docker exec -it $USER-ird /bin/bash

Inside the container, clone and build:

git clone https://github.com/tenstorrent/tt-lang.git
cd tt-lang
cmake -G Ninja -B build -DTTLANG_USE_TOOLCHAIN=ON
source build/env/activate
cmake --build build

Verify the build:

ninja -C build check-ttlang-all

Run an example:

python examples/tutorial/multicore_grid_auto.py

The -DTTLANG_USE_TOOLCHAIN=ON flag tells CMake to use the pre-built LLVM and tt-metal from /opt/ttlang-toolchain instead of building them from source, which saves significant build time.

Performance tracing (Tracy) is enabled by default. To disable it, add -DTTLANG_ENABLE_PERF_TRACE=OFF to the cmake configure command. See the programming guide for profiling usage.

2.3 Building without Docker

To build tt-lang directly on a host machine without Docker, see the build system documentation. It covers prerequisites, all supported build modes (from submodules, reusable toolchain, pre-built toolchain), and version compatibility.

2.4 Container Tips

To map a different TT device, change the --device argument (e.g., --device=/dev/tenstorrent/1:/dev/tenstorrent/0).

2.5 Functional Simulator

tt-lang includes a functional simulator that runs kernels as pure Python, without requiring Tenstorrent hardware or the full compiler stack. Use it to validate kernel logic and debug with any Python debugger:

ttlang-sim examples/eltwise_add.py

The simulator typically supports more language features than the compiler at any given point — see the functionality matrix for current coverage. See the programming guide for debugger setup and details.
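Because simulated kernels run as ordinary Python, their output can be checked against a plain reference implementation. The sketch below is such a reference for element-wise addition (pure Python, not the tt-lang API; that examples/eltwise_add.py performs this operation is an assumption based on its name, and the comparison against simulator output is left implicit):

```python
def eltwise_add_reference(a, b):
    """Golden reference for element-wise addition, the operation
    examples/eltwise_add.py is assumed to implement."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [x + y for x, y in zip(a, b)]


# Check the reference on known inputs; simulator output for the same
# inputs should match it element for element.
assert eltwise_add_reference([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]) == [11.0, 22.0, 33.0]
```

Any Python debugger (pdb, or an IDE debugger) can then step through both the reference and the simulated kernel side by side.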

3. Documentation

Full documentation is built with Sphinx; the source lives in docs/sphinx/.

To build and view the Sphinx docs locally:

cmake -G Ninja -B build -DTTLANG_ENABLE_DOCS=ON
cmake --build build --target ttlang-docs
python -m http.server 8000 -d build/docs/sphinx/_build/html

4. Contributing

We welcome contributions. Please see CONTRIBUTING.md for guidelines.

4.1 Developer Guidelines

See the Sphinx contributor guide and code style guidelines for coding standards, dialect design patterns, and testing practices.

4.2 Updating Submodule Versions

tt-mlir defines the compatible versions of LLVM and tt-metal. When updating tt-mlir, the other submodules should be updated to match.

Update tt-mlir (and read the versions it expects):

cd third-party/tt-mlir && git fetch && git checkout <commit> && cd ../..

# Read the LLVM and tt-metal commits that this tt-mlir version expects:
grep LLVM_PROJECT_VERSION third-party/tt-mlir/env/CMakeLists.txt
grep TT_METAL_VERSION third-party/tt-mlir/third_party/CMakeLists.txt

Update LLVM to the compatible version:

cd third-party/llvm-project && git fetch && git checkout <llvm-sha> && cd ../..

Update tt-metal to the compatible version:

cd third-party/tt-metal && git fetch && git checkout <tt-metal-sha> && cd ../..

Commit all submodule updates together:

git add third-party/tt-mlir third-party/llvm-project third-party/tt-metal
git commit -m "Update submodules to tt-mlir <commit>"

The build system verifies SHA compatibility during configure. If submodule versions are intentionally mismatched, pass -DTTLANG_ACCEPT_LLVM_MISMATCH=ON or -DTTLANG_ACCEPT_TTMETAL_MISMATCH=ON to suppress the check.

4.3 Code Formatting with Pre-commit

tt-lang uses pre-commit to format code and enforce style guidelines before commits.

Install and activate:

pip install pre-commit
cd /path/to/tt-lang
pre-commit install

Pre-commit runs automatically on git commit. It formats Python code with Black, C++ code with clang-format (LLVM style), removes trailing whitespace, and checks YAML/TOML syntax.

If pre-commit modifies files, the commit is stopped. Stage the changes and commit again:

git add -u
git commit -m "Your commit message"

To run manually on all files: pre-commit run --all-files

4.4 Code of Conduct

This project adheres to a Code of Conduct. By participating, you are expected to uphold this code and treat all community members with respect.

5. Support

6. License

This project is licensed under the Apache License 2.0 — see the LICENSE file for details.

Third-party dependencies and their licenses are listed in the NOTICE file.
