This file provides guidance to GitHub Copilot when working with code in this repository.
SlangPy is a native Python extension that provides a high-level interface for working with low-level graphics APIs (Vulkan, Direct3D 12, CUDA). The native side wraps the slang-rhi project (external/slang-rhi) using nanobind bindings. The project also contains a "functional API" that allows users to call Slang functions on the GPU with Python function call syntax.
| Directory | Description |
|---|---|
| `src/sgl/` | Native C++ code (core GPU abstraction layer) |
| `src/slangpy_ext/` | Python bindings (nanobind) |
| `src/slangpy_torch/` | Native torch integration extension |
| `slangpy/` | Python package implementation |
| `slangpy/tests/` | Python tests (pytest) |
| `tests/` | C++ tests (doctest) |
| `tools/` | General utility scripts |
| `.github/workflows/` | CI workflows |
| `examples/`, `samples/examples/`, `samples/experiments/` | Example code |
| `docs/` | Documentation |
| `external/` | External C++ dependencies |
- If I tell you that you are wrong, think about whether or not you think that's true and respond with facts.
- Avoid apologizing or making conciliatory statements.
- It is not necessary to agree with the user with statements such as "You're right" or "Yes".
- Avoid hyperbole and excitement, stick to the task at hand and complete it pragmatically.
- New Python APIs must have tests in `slangpy/tests/`
- Always build before running tests
- Run pre-commit after completing tasks (`pre-commit run --all-files`; re-run if it modifies files)
- Use type annotations for all Python function arguments
- Minimize new dependencies — the project has minimal external deps
The project has three main layers:
- Python Layer (`slangpy/`) — High-level API with Module, Function, Device classes
- C++ Binding Layer (`src/slangpy_ext/`) — Nanobind-based Python-C++ interface
- Core SGL Layer (`src/sgl/`) — Low-level GPU device management and shader compilation

C++ types typically map to slang-rhi counterparts (e.g., `Device` wraps `rhi::IDevice`).
- Module: Container for Slang shader code, loaded from `.slang` files
- Function: Callable GPU function with automatic Python↔GPU marshalling
- Device: GPU context managing resources and compute dispatch
- CallData: Cached execution plans for optimized repeated calls
- Buffer/Texture: GPU memory resources with Python array interface
- Python-layer errors raise standard Python exceptions (typically `ValueError`, `TypeError`, or `SlangPyError`).
- C++ errors are translated to Python exceptions via nanobind. Shader compile errors surface as exceptions containing the Slang compiler diagnostic text.
- GPU errors (device lost, out of memory) propagate as exceptions from the RHI layer.
- When debugging, set `SLANGPY_PRINT_GENERATED_SHADERS=1` to see the generated kernel code that gets compiled.
Replace `<PLATFORM>` with `windows-msvc`, `linux-gcc`, or `macos-arm64-clang` as appropriate:

```shell
cmake --preset <PLATFORM>                 # Configure
cmake --build --preset <PLATFORM>-debug   # Build (debug)
cmake --preset <PLATFORM> --fresh         # Reconfigure from scratch
```

Available presets: `windows-msvc`, `windows-arm64-msvc`, `linux-gcc`, `macos-arm64-clang`.
Always build before running tests.
```shell
pytest slangpy/tests -v                                    # All Python tests
pytest samples/tests -vra                                  # Example tests
python tools/ci.py unit-test-cpp                           # C++ unit tests
pytest slangpy/tests/slangpy_tests/test_X.py -v            # Specific file
pytest slangpy/tests/slangpy_tests/test_X.py::test_fn -v   # Specific function
```

Debug generated shaders (PowerShell):

```shell
$env:SLANGPY_PRINT_GENERATED_SHADERS="1"; pytest slangpy/tests/slangpy_tests/test_X.py -v
```

- C++: Classes: PascalCase | Functions/variables: snake_case | Members: `m_` prefix
- Python: Classes: PascalCase | Functions/variables: snake_case | Public members: no prefix | Private members: `_` prefix
- All Python function arguments must have type annotations
```cpp
/// Description.
void do_something();

/// Pack two float values to 8-bit snorm.
/// @param v Float values in [-1,1].
/// @param options Packing options.
/// @return 8-bit snorm values in low bits, high bits all zero.
uint32_t pack_snorm2x8(float2 v, const PackOptions options = PackOptions::safe);
```

```python
def myfunc(x: int, y: int) -> int:
    """
    Description.

    :param x: Some parameter.
    :param y: Some parameter.
    :return: Some return value.
    """
```

Slang is a shader language based on HLSL. Key patterns used in this project:
- `[shader("compute")]` attribute marks GPU entry points
- `StructuredBuffer<T>`/`RWStructuredBuffer<T>` for typed GPU arrays
- `uint3 tid : SV_DispatchThreadID` for thread indexing
- Generics via `<T>`, interfaces via `interface IFoo`, conformance via `struct Foo : IFoo`
- Differentiable functions: `[Differentiable] float foo(float x)` with `bwd_diff(foo)` for backprop
- See `.slang` files in `slangpy/tests/` for project-specific patterns
The CI uses GitHub Actions (`.github/workflows/ci.yml`) and calls `tools/ci.py`:

```shell
python tools/ci.py configure   # CMake configure
python tools/ci.py --help      # All available commands
```

- Python runtime: `requirements.txt` | Dev/tests: `requirements-dev.txt`
- C++ dependencies: `external/`
- Testing: pytest (Python), doctest (C++)
- Shading language: Slang
- Formatting: `pre-commit` hooks (Black for Python, clang-format for C++)
Many issues in SlangPy originate from the Slang compiler itself (shader-slang/slang repo).
```shell
# 1. Query for path to built local slang (eg c:/sw/slang)
# 2. Reconfigure fresh with local slang
cmake --preset windows-msvc --fresh -DSGL_LOCAL_SLANG=ON -DSGL_LOCAL_SLANG_DIR=<slang dir> -DSGL_LOCAL_SLANG_BUILD_DIR=build/Debug
# 3. Build as normal
cmake --build --preset windows-msvc-debug
```

| CMake Option | Default | Description |
|---|---|---|
| `SGL_LOCAL_SLANG` | `OFF` | Enable to use a local Slang build |
| `SGL_LOCAL_SLANG_DIR` | `../slang` | Path to the local Slang repository |
| `SGL_LOCAL_SLANG_BUILD_DIR` | `build/Debug` | Build directory within the Slang repo |
- Reproduce the issue in SlangPy
- Clone and build Slang locally (above)
- Read `external/slang/CLAUDE.md`
- Edit Slang source → rebuild Slang → rebuild SlangPy → test

- Use `python tools/ci.py` for most build/test tasks — handles platform-specific config
- PyTorch integration is automatic when PyTorch is installed
- Hot-reload is supported for shader development
The functional API allows calling Slang GPU functions from Python with automatic type marshalling, vectorization, and kernel generation.
```slang
// myshader.slang
float add(float a, float b) { return a + b; }
```

```python
import slangpy as spy
import numpy as np

device = spy.Device()
module = spy.Module.load_from_file(device, "myshader.slang")

a = spy.Tensor.from_numpy(device, np.array([1, 2, 3], dtype=np.float32))
b = spy.Tensor.from_numpy(device, np.array([4, 5, 6], dtype=np.float32))
result = module.add(a, b)  # Returns Tensor([5, 7, 9])
```

```
Python call → Phase 1: Signature Lookup (C++) → Cache hit? → Phase 3: Dispatch (C++)
                                                    ↓ Cache miss
                                        Phase 2: Kernel Generation (Python)
```
| File | Purpose |
|---|---|
| `slangpy/core/function.py` | `FunctionNode` — Python entry point for function calls |
| `slangpy/core/calldata.py` | `CallData` — kernel generation and caching |
| `slangpy/core/callsignature.py` | Type resolution and binding helpers |
| `slangpy/bindings/boundvariable.py` | `BoundCall`/`BoundVariable` — tracks Python↔Slang bindings |
| `slangpy/bindings/marshall.py` | `Marshall` base class for type marshalling |
| `slangpy/bindings/typeregistry.py` | Maps Python types to their `Marshall` implementations |
| `src/slangpy_ext/utils/slangpyfunction.cpp` | `NativeFunctionNode::call` — native call entry |
| `src/slangpy_ext/utils/slangpy.cpp` | `NativeCallData::exec` — native dispatch |
| Class | Layer | Purpose |
|---|---|---|
| `FunctionNode` | Python | Represents a callable Slang function with modifiers |
| `CallData` | Python | Generated kernel data (bindings, compiled shader) |
| `BoundCall` | Python | Collection of `BoundVariable` for a single call |
| `BoundVariable` | Python | Pairs Python value with Slang parameter |
| `NativeMarshall` | Both | Type-specific marshalling (shape, data binding) |
| `NativeCallData` | C++ | Native call data with cached dispatch info |
| `NativeCallDataCache` | C++ | Signature → CallData cache |
Location: `src/slangpy_ext/utils/slangpyfunction.cpp`

Runs on every call — must be fast. Builds a unique signature string from the function node chain and argument types/properties, then looks it up in `NativeCallDataCache`. A cache hit skips to Phase 3; a cache miss triggers Phase 2.
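The lookup can be sketched in pure Python (illustrative only; `build_signature` and `CallDataCache` are hypothetical stand-ins for the native C++ implementation, which hashes more properties than just the argument types):

```python
# Illustrative sketch of signature-based call caching (hypothetical names;
# the real NativeCallDataCache is C++ and includes more argument properties).

def build_signature(func_name: str, args: tuple) -> str:
    """Build a cache key from the function name and argument types."""
    arg_sig = ",".join(type(a).__name__ for a in args)
    return f"{func_name}({arg_sig})"

class CallDataCache:
    def __init__(self) -> None:
        self._cache: dict[str, str] = {}

    def lookup(self, func_name: str, args: tuple) -> str:
        sig = build_signature(func_name, args)
        if sig in self._cache:
            return self._cache[sig]        # Cache hit → straight to Phase 3
        plan = f"kernel for {sig}"         # Cache miss → Phase 2 generates it
        self._cache[sig] = plan
        return plan

cache = CallDataCache()
cache.lookup("add", (1.0, 2.0))            # miss: generates a plan
plan = cache.lookup("add", (3.0, 4.0))     # hit: same types reuse the plan
```

The key point is that the cache key depends on argument *types*, not values, so repeated calls with different data reuse the same compiled kernel.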
Location: `slangpy/core/calldata.py` → `CallData.__init__()`

Runs once per unique call signature. The pipeline:

- Unpack arguments — recursively resolve `IThis` wrappers via `get_this()`
- Build BoundCall — create a `BoundVariable` per argument, each assigned a `NativeMarshall` based on Python type (`int`/`float` → `ScalarMarshall`, `Tensor` → `TensorMarshall`, `dict` → recursive children)
- Apply explicit vectorization — user-specified `function.map()` dimension/type mappings
- Type resolution (`slangpy/reflection/typeresolution.py`) — each marshall's `resolve_types()` determines compatible Slang types; overloaded functions are resolved by best match
- Bind parameters — pair each `BoundVariable` with its resolved Slang parameter
- Apply implicit vectorization — calculate per-argument dimensionality
- Calculate call dimensionality — max across all arguments
- Create return value binding — auto-creates `ValueRef` (dim 0) or `Tensor` (dim > 0) for `_result`
- Finalize mappings — resolve Python→kernel dimension mappings (default: right-aligned)
- Calculate differentiability — determine gradient support per argument
- Generate code — produce Slang compute kernel source
- Compile shader — compile via Slang; cache the `CallData`
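The default right-aligned dimension mapping in the "Finalize mappings" step can be sketched in plain Python (`default_mapping` is a hypothetical helper for illustration, not a slangpy API):

```python
# Illustrative sketch of right-aligned dimension mapping (hypothetical helper,
# not a slangpy API; the actual mapping logic lives in CallData).

def default_mapping(arg_dims: int, call_dims: int) -> tuple[int, ...]:
    """Map an argument's dims onto the last `arg_dims` kernel dimensions."""
    return tuple(range(call_dims - arg_dims, call_dims))

# For a 2D kernel (call_dims=2):
assert default_mapping(2, 2) == (0, 1)   # 2D arg covers both kernel dims
assert default_mapping(1, 2) == (1,)     # 1D arg maps to the innermost dim
assert default_mapping(0, 2) == ()       # scalar broadcasts over all dims
```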
| Python Value | Slang Parameter | Resolved Binding |
|---|---|---|
| `Tensor[float, 2D]` | `float` | `float` (elementwise) |
| `Tensor[float, 2D]` | `Tensor<float,2>` | `Tensor<float,2>` (whole) |
| `Tensor[float, 2D]` | `float2` | `float2` (row as vector) |
| `Tensor[float, 2D]` | `vector<T,2>` | `vector<float,2>` (generic) |
| Python Value | Slang Parameter | Dimensionality |
|---|---|---|
| `Tensor[float, 2D shape=(H,W)]` | `float` | 2 (one thread per element) |
| `float` | `float` | 0 (single thread) |
| `Tensor[float, 2D]` | `Tensor<float,2>` | 0 (whole tensor per thread) |
| `Tensor[float, 2D shape=(H,W)]` | `float2` | 1 (one thread per row) |
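The rule behind the table can be sketched in plain Python (an illustrative simplification; `arg_dimensionality` and `call_dimensionality` are hypothetical helpers, not the real `resolve_dimensionality()` implementations): an argument contributes its container dimensionality minus whatever the Slang parameter consumes, and the call dimensionality is the maximum across arguments.

```python
# Illustrative sketch of implicit vectorization (simplified; the real logic
# lives in each marshall's resolve_dimensionality()).

def arg_dimensionality(python_dims: int, param_dims: int) -> int:
    """Dimensions left over after the Slang parameter consumes its share."""
    return max(python_dims - param_dims, 0)

def call_dimensionality(per_arg_dims: list[int]) -> int:
    """Call dimensionality is the max across all arguments."""
    return max(per_arg_dims, default=0)

# The rows of the table above:
assert arg_dimensionality(2, 0) == 2   # 2D Tensor → float: per element
assert arg_dimensionality(0, 0) == 0   # float → float: single thread
assert arg_dimensionality(2, 2) == 0   # 2D Tensor → Tensor<float,2>: whole
assert arg_dimensionality(2, 1) == 1   # 2D Tensor → float2: per row
assert call_dimensionality([2, 0]) == 2
```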
Location: `src/slangpy_ext/utils/slangpy.cpp`

Runs on every call, entirely in C++:

- Unpack arguments (native `unpack_args`/`unpack_kwargs`)
- Calculate call shape — each marshall's `get_shape()` returns concrete dimensions; call shape determines thread count
- Allocate return value — create output Tensor/ValueRef if `_result` not provided
- Bind uniforms & dispatch — marshalls write data to GPU via `write_shader_cursor_pre_dispatch()`, then `dispatch()`
- Read results — post-dispatch readback via `read_calldata()` and `read_output()`
To support a new Python type in the functional API:
- Create a Marshall in `slangpy/bindings/` or `slangpy/builtin/` — subclass `Marshall` and implement `resolve_types()`, `resolve_dimensionality()`, `gen_calldata()`. See existing marshalls (e.g., `TensorMarshall`) for the pattern.
- Register in `slangpy/bindings/typeregistry.py` — add an entry to the `PYTHON_TYPES` dict.
- (Optional) Native signature — for performance, add a type signature handler in the `NativeCallDataCache` constructor (`src/slangpy_ext/utils/slangpyfunction.cpp`).
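The registration step can be sketched as follows (hypothetical, heavily simplified; see `slangpy/bindings/typeregistry.py` for the real `PYTHON_TYPES` mapping and `Marshall` interface):

```python
# Illustrative sketch of a type-to-marshall registry (hypothetical, simplified;
# the real PYTHON_TYPES mapping lives in slangpy/bindings/typeregistry.py).

class Marshall:
    def resolve_types(self, value): ...
    def resolve_dimensionality(self, value): ...
    def gen_calldata(self, value): ...

class ScalarMarshall(Marshall):
    def resolve_dimensionality(self, value) -> int:
        return 0  # scalars map to a single thread

# Registry: Python type → marshall class
PYTHON_TYPES: dict[type, type[Marshall]] = {
    int: ScalarMarshall,
    float: ScalarMarshall,
}

def marshall_for(value) -> Marshall:
    """Look up a marshall by the value's Python type."""
    marshall_cls = PYTHON_TYPES.get(type(value))
    if marshall_cls is None:
        raise TypeError(f"No marshall registered for {type(value).__name__}")
    return marshall_cls()

assert isinstance(marshall_for(1.5), ScalarMarshall)
assert marshall_for(3).resolve_dimensionality(3) == 0
```

Registering a new type is then a one-line addition to the dict, which is why step 2 above is the only mandatory wiring.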