Add custom derivatives for getOffset#10499

Open
kaizhangNV wants to merge 2 commits into master from autodiff_getoffset

Conversation

@kaizhangNV
Contributor

Add custom derivatives for getOffset and simplify ILayer.eval to single parameter address

  • Add [Differentiable] and [ForwardDerivative(fwd_getOffset)] to getOffset on all IPointerLikeAddress implementations (BindlessAddress, PointerAddress, TorchTensorViewAddress, Ptr extension) so autodiff can propagate DifferentialPtrPair through address offset computation.

  • Simplify ILayer.eval from eval(input, weightAddr, biasAddr) to eval(input, parameterAddr). FFLayer.eval now internally computes the bias address via parameterAddr.getOffset(weightCount), keeping address arithmetic inside the differentiable function.

  • Rename basic-ilayer-ffn-training-test to basic-ilayer-ffn-forward-test and add basic-ilayer-ffn-backward-test that verifies gradients through a 2-layer FFN with getOffset called inside the differentiable function.

  • Update all test call sites to use the new single-address eval API.
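The parameter layout the new API assumes (weights for an Out×In layer at offset 0, bias immediately after, addressed via `parameterAddr.getOffset(weightCount)`) can be illustrated with a small conceptual Python analogue of the new single-address `eval`. Names and shapes here are illustrative only, not the actual Slang implementation:

```python
def eval_layer(x, params, in_size, out_size):
    # Weights occupy out_size * in_size scalars at offset 0; the bias
    # follows immediately, mirroring parameterAddr.getOffset(weightCount).
    weight_count = out_size * in_size
    out = []
    for row in range(out_size):
        acc = params[weight_count + row]  # bias element for this row
        for col in range(in_size):
            acc += params[row * in_size + col] * x[col]
        out.append(acc)
    return out

params = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # 2x2 weights, then bias [4, 5]
print(eval_layer([1.0, 1.0], params, in_size=2, out_size=2))  # [5.0, 10.0]
```

The point of the contiguous layout is that the caller only hands over one base address; all further address arithmetic stays inside the differentiable function.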

@kaizhangNV kaizhangNV requested a review from a team as a code owner March 11, 2026 03:55
@kaizhangNV kaizhangNV requested review from bmillsNV and Copilot and removed request for a team March 11, 2026 03:55
@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Consolidates weights and biases into a single contiguous parameter address for layers and adds forward-mode differentiable support for getOffset across multiple pointer-like address types, via @Differentiable and ForwardDerivative(fwd_getOffset) plus static fwd_getOffset helpers.

Changes

Cohort / File(s) Summary
Differentiability Support
source/standard-modules/neural/bindless-storage.slang
Annotates getOffset with @Differentiable and ForwardDerivative(fwd_getOffset) for BindlessAddress, PointerAddress, TorchTensorViewAddress, and the internal Ptr extension; adds static fwd_getOffset(DifferentialPtrPair<This>, int) implementations that propagate primal and differential components via DifferentialPtrPair.
Core Interface & Implementations
source/standard-modules/neural/ilayer.slang, source/standard-modules/neural/layers.slang
Changes ILayer.eval and FFLayer.eval signatures from (weightAddr, Optional biasAddr) to a single parameterAddr; internal bias offset is computed from the base parameter block.
New/Updated Tests — ILayer FFN Backward
tests/neural/basic-ilayer-ffn-backward-test.slang
Adds a new autodiff backward unit test for a two-layer ILayer FFN that exercises differentiable getOffset in gradient propagation; sets up parameter buffers, runs backward pass, and asserts expected parameter gradients.
Updated Tests — Forward & Frontend Smoke
tests/neural/basic-ilayer-ffn-forward-test.slang, tests/neural/basic-ilayer-frontend-smoke-test.slang
Refactors tests to pass a single per-layer base address to eval instead of separate weight/bias addresses; updates helper signatures (e.g., evalAsLayer) and call sites accordingly.
Updated Tests — FFLayer Autodiff Variants
tests/neural/fflayer-autodiff-backward-test.slang, tests/neural/fflayer-no-bias-test.slang, tests/neural/fflayer-wavetangled-vector-test.slang
Unifies autodiff wrappers to accept single paramAddr; simplifies backward paths to use a single DifferentialPtrPair for params so getOffset derivatives participate in autodiff.
Storage & Unit Test Adjustments
tests/neural/fflayer-two-storage-forward-test.slang, tools/gfx-unit-test/neural-tensorview-address.slang
Consolidates weight/bias storage into a single params uniform and switches address types (e.g., to BindlessAddress<float>); updates signatures and callers to use unified parameter addressing.
Activation Test Update
tests/neural/activation-with-fflayer-test.slang
Updates calls to layer.eval to pass a single base parameter address and removes per-parameter address computations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A little hop through offsets and code,
One block of params down the new road,
Derivatives carried, forward they prance,
Pointers pair up and join the dance,
Hooray — a rabbit's happy code-glad dance! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check ✅ Passed: The title accurately captures the primary change: adding custom derivatives (via @Differentiable and @ForwardDerivative attributes) for the getOffset method across address types.
  • Description check ✅ Passed: The description is directly related to the changeset, detailing the derivative annotations, API simplification, test updates, and the overall objective of enabling autodiff through address offset computation.
  • Docstring coverage ✅ Passed: No functions found in the changed files to evaluate docstring coverage; check skipped.


Contributor

Copilot AI left a comment


Pull request overview

This PR updates the neural layer parameter addressing model to support autodiff through address arithmetic, by making getOffset differentiable and switching ILayer/FFLayer.eval to accept a single base parameter address for a contiguous (weights-then-bias) parameter block.

Changes:

  • Add custom forward derivatives for getOffset across IPointerLikeAddress implementations so autodiff can propagate DifferentialPtrPair through pointer offsets.
  • Simplify ILayer.eval / FFLayer.eval to eval(input, parameterAddr) and compute bias address internally via parameterAddr.getOffset(weightCount).
  • Update/extend neural tests to use the new API and add a backward test that exercises getOffset inside the differentiable function.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tools/gfx-unit-test/neural-tensorview-address.slang Switch unit test to single-parameter-address eval and route grads through a single DifferentialPtrPair.
tests/neural/fflayer-wavetangled-vector-test.slang Update WaveTangledVector FFLayer tests to the new eval(input, baseAddr) signature.
tests/neural/fflayer-two-storage-forward-test.slang Convert test to contiguous parameter storage and update call sites to the new API.
tests/neural/fflayer-no-bias-test.slang Update no-bias layer tests/wrappers to pass a base parameter address.
tests/neural/fflayer-autodiff-backward-test.slang Update autodiff backward test to exercise getOffset inside eval.
tests/neural/basic-ilayer-frontend-smoke-test.slang Update ILayer-constrained call site to new single-address eval.
tests/neural/basic-ilayer-ffn-forward-test.slang Update 2-layer FFN forward test to use per-layer base offsets.
tests/neural/basic-ilayer-ffn-backward-test.slang New backward test verifying gradients through a 2-layer FFN with internal getOffset.
tests/neural/activation-with-fflayer-test.slang Update activation + FFLayer tests to pass only the base address.
source/standard-modules/neural/layers.slang Change FFLayer.eval signature and compute bias address internally via getOffset.
source/standard-modules/neural/ilayer.slang Change ILayer.eval signature to accept a single contiguous-parameter base address.
source/standard-modules/neural/bindless-storage.slang Add [Differentiable] + [ForwardDerivative] for getOffset on address types.

Comment on lines +259 to +264:

    static DifferentialPtrPair<This> fwd_getOffset(DifferentialPtrPair<This> self, int elements)
    {
        return DifferentialPtrPair<This>(
            self.p.getOffset(elements),
            self.d.getOffset(elements));
    }

Copilot AI Mar 11, 2026


TorchTensorViewAddress.getOffset is marked [require(cuda)], but its forward derivative helper fwd_getOffset is not. On non-CUDA targets this helper may still be type-checked/compiled and it calls getOffset, which is CUDA-only, potentially causing compilation errors. Mark fwd_getOffset with the same [require(cuda)] (or otherwise ensure it is excluded on non-CUDA targets).

@@ -1,4 +1,4 @@
// Test FFLayer eval() with separate weight/bias address instances.
// Test FFLayer eval() with contiguous parameter storage.

Copilot AI Mar 11, 2026


This test now verifies contiguous parameter storage rather than separate weight/bias storage, but the file name (fflayer-two-storage-forward-test.slang) still suggests the old behavior. Consider renaming the file to match the updated intent to keep the test suite easier to navigate.

Suggested change
// Test FFLayer eval() with contiguous parameter storage.
// Test FFLayer eval() with contiguous parameter storage (note: file name
// `fflayer-two-storage-forward-test.slang` is legacy and now refers to this
// contiguous-parameter layout test).

github-actions[bot]

This comment was marked as outdated.

Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/neural/basic-ilayer-ffn-forward-test.slang (1)

49-54: Prefer deriving the second layer base from Layer1.ParameterCount.

Line 51 hardcodes the current packing size as 6. Using the layer constant keeps this test aligned if the parameter layout changes.

♻️ Suggested cleanup
     // Compute base addresses for each layer's parameter block
     let layer1Addr = baseAddr.getOffset(0);
-    let layer2Addr = baseAddr.getOffset(6);
+    let layer2Addr = baseAddr.getOffset(Layer1.ParameterCount);

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 804adb37-6a5c-42a1-9b9c-c186c5912b1d

📥 Commits

Reviewing files that changed from the base of the PR and between 38b5b83 and e5cfe4d.

📒 Files selected for processing (12)
  • source/standard-modules/neural/bindless-storage.slang
  • source/standard-modules/neural/ilayer.slang
  • source/standard-modules/neural/layers.slang
  • tests/neural/activation-with-fflayer-test.slang
  • tests/neural/basic-ilayer-ffn-backward-test.slang
  • tests/neural/basic-ilayer-ffn-forward-test.slang
  • tests/neural/basic-ilayer-frontend-smoke-test.slang
  • tests/neural/fflayer-autodiff-backward-test.slang
  • tests/neural/fflayer-no-bias-test.slang
  • tests/neural/fflayer-two-storage-forward-test.slang
  • tests/neural/fflayer-wavetangled-vector-test.slang
  • tools/gfx-unit-test/neural-tensorview-address.slang

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
source/standard-modules/neural/bindless-storage.slang (1)

82-89: ⚠️ Potential issue | 🟠 Major

Keep BindlessAddress.getOffset() signed until the final validation/cast.

Line 84 stores baseIndex + elements into uint, while the other getOffset() implementations keep elements in signed pointer arithmetic. Because IPointerLikeAddress.getOffset() takes int, this makes negative offsets behave differently for BindlessAddress, and fwd_getOffset() will propagate the same wrapped index into the differential address. Either reject negative offsets consistently at the interface boundary or do the arithmetic in a signed type before the final cast.

Possible direction
     public This getOffset(int elements)
     {
-        uint newBaseIndex = baseIndex + elements;
+        int newBaseIndex = int(baseIndex) + elements;
+        // Reject `newBaseIndex < 0` here if bindless addresses are not
+        // allowed to move backwards.
 
         This address = This(handle);
-        address.baseIndex = newBaseIndex;
+        address.baseIndex = uint(newBaseIndex);
         return address;
     }

Also applies to: 91-95
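The concern above can be reproduced outside Slang. A hedged Python sketch (emulating 32-bit uint addition; the function names are illustrative, not the actual code) shows how a negative offset silently wraps under unsigned arithmetic but stays checkable under signed arithmetic:

```python
def get_offset_unsigned(base_index, elements):
    # Emulates `uint newBaseIndex = baseIndex + elements` with 32-bit wraparound.
    return (base_index + elements) & 0xFFFFFFFF

def get_offset_signed(base_index, elements):
    # Signed arithmetic first; a range check and cast can follow.
    return base_index + elements

print(get_offset_signed(2, -5))    # -3: can be rejected by a range check
print(get_offset_unsigned(2, -5))  # 4294967293: silently wrapped
```

This is why the review suggests doing the arithmetic in a signed type and deferring the cast to uint until after validation.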



📥 Commits

Reviewing files that changed from the base of the PR and between e5cfe4d and a9ef926.

📒 Files selected for processing (1)
  • source/standard-modules/neural/bindless-storage.slang

Comment on lines +80 to +82:

    [Differentiable]
    [ForwardDerivative(fwd_getOffset)]
    public This getOffset(int elements)
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In current Slang autodiff documentation, can a method marked [Differentiable] legally return a type that implements IDifferentiablePtrType, or are such types restricted to input parameters only?

💡 Result:

In Slang’s current autodiff docs, a function/method marked [Differentiable] may not return a type that implements IDifferentiablePtrType. Types implementing IDifferentiablePtrType are restricted to input (in) parameters only (i.e., they cannot be used as out, inout, or return types), and attempting to do so is a compile-time error. [1]


getOffset() methods cannot be marked [Differentiable] because they return IDifferentiablePtrType.

In Slang's autodiff system, types implementing IDifferentiablePtrType are restricted to input parameters only and cannot be returned from [Differentiable] methods—this is a compile-time error per the documented rules. The current annotations on these methods violate that restriction.

Also applies to: 141–143, 250–252, 330–332

Contributor

@github-actions github-actions bot left a comment


PR Review: Add custom derivatives for getOffset

Summary

This PR makes two coupled changes: (1) adds [Differentiable] + [ForwardDerivative(fwd_getOffset)] to getOffset on all four IPointerLikeAddress implementations so autodiff can propagate DifferentialPtrPair through address offset computation, and (2) simplifies ILayer.eval from eval(input, weightAddr, biasAddr) to eval(input, parameterAddr), moving the bias address computation inside the differentiable function. This is a clean design improvement — the old API forced callers to split addresses outside the differentiable function; the new API keeps address arithmetic inside the differentiable scope where it belongs.

What looks good

  • Custom derivative pattern is correct. The fwd_getOffset implementations follow the established pattern from tests/autodiff/diff-ptr-type-custom.slang — constructing DifferentialPtrPair<This> by calling getOffset(elements) on both .p and .d.
  • no_diff() usage is consistent. Applied to PointerAddress and Ptr extension (which do pointer arithmetic the compiler can't differentiate), correctly omitted from BindlessAddress (uint arithmetic) and TorchTensorViewAddress (body bypassed by custom derivative).
  • Bias offset calculation is correct. OutputVector.Size * InputVector.Size matches the LinearLayout weight storage convention (weights are Out × In scalars at offset 0, bias follows immediately).
  • The Ptr extension's fwd_getOffset uses self.p + elements instead of .getOffset(elements), avoiding potential infinite recursion through the derivative chain. Good choice.
  • New backward test is well-designed. The 2-layer FFN with getOffset(0) and getOffset(6) inside the differentiable function directly exercises the custom derivative with non-trivial offsets, and verifies all 9 gradient values.
  • Test updates are thorough. All existing call sites are updated to the new 2-param API.
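The derivative pattern praised above, applying the same offset to both halves of the pair, can be sketched in a few lines of Python. `DiffPtrPair` here is a stand-in for Slang's `DifferentialPtrPair`, not a real API:

```python
from dataclasses import dataclass

@dataclass
class DiffPtrPair:
    p: int  # primal base index
    d: int  # differential base index

def fwd_get_offset(pair, elements):
    # Mirror fwd_getOffset: shift primal and differential by the same amount
    # so gradient storage stays aligned with the primal parameters.
    return DiffPtrPair(pair.p + elements, pair.d + elements)

base = DiffPtrPair(p=0, d=100)
weights = fwd_get_offset(base, 0)
bias = fwd_get_offset(base, 6)  # weightCount = 6 in the 2-layer test
print(bias.p, bias.d)  # 6 106
```

Keeping the two halves in lockstep is what lets gradients written through the differential address land at the slot corresponding to each primal parameter.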

Issues

  1. [Minor] Missing [require(cuda)] on TorchTensorViewAddress.fwd_getOffset — see inline comment. All other methods on this type have the attribute.

  2. [Question] IPointerLikeAddress.getOffset interface declaration lacks [Differentiable] — see inline comment on istorages.slang. This works today due to [ForceInline] on FFLayer.eval causing specialization before autodiff, but may be fragile for future non-inlined generic usage.

Observations (non-blocking)

  • The new backward test only exercises BindlessAddress. PointerAddress coverage comes indirectly from the updated fflayer-autodiff-backward-test.slang (which has TEST_POINTER=1). The Ptr extension derivative has no runtime test coverage (tests are disabled due to tracked issues #8630, #8631, #8834), which is understandable.
  • The (compute, vulkan) category tag on metal/cuda test lines in the new backward test follows the existing convention in the neural test directory — not a copy-paste issue.

Overall this is a well-structured change that both enables a needed autodiff capability and simplifies the public API. The breaking label is appropriate.


    [ForceInline]
    [require(cuda_glsl_hlsl_metal_spirv, sm_6_6)]
    public void atomicAdd(uint index, T value)
Contributor


nit: Missing [require(cuda)] on fwd_getOffset.

The primal getOffset on TorchTensorViewAddress (and the __init, __subscript) are all marked [require(cuda)]. The new fwd_getOffset should be consistent:

Suggested change:

    - public void atomicAdd(uint index, T value)
    + [require(cuda)]
    + static DifferentialPtrPair<This> fwd_getOffset(DifferentialPtrPair<This> self, int elements)

Without it, a non-CUDA target could theoretically resolve this derivative function even though the primal is CUDA-only, producing a confusing error instead of a clean capability mismatch.

//
//TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-vk -compute -shaderobj -xslang -experimental-feature -output-using-type -emit-spirv-directly
//TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-mtl -compute -shaderobj -output-using-type -xslang -experimental-feature
//TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-cuda -compute -shaderobj -output-using-type -capability cuda_sm_7_0 -xslang -experimental-feature
Contributor


suggestion: Consider adding a PointerAddress test variant as well.

This backward test only exercises BindlessAddress. The existing fflayer-autodiff-backward-test.slang does cover PointerAddress (via TEST_POINTER=1), and that test is updated in this PR to call getOffset inside the differentiable function. However, since this new test is the primary one that exercises multi-layer getOffset inside a differentiable context (with non-trivial offset=6), a TEST_POINTER=1 CUDA variant here would strengthen coverage of PointerAddress.fwd_getOffset in the multi-layer scenario.

Not blocking — the existing backward test suite provides reasonable coverage.


Labels

pr: breaking change PRs with breaking changes
