Add custom derivatives for getOffset#10499

Open
kaizhangNV wants to merge 2 commits into master from autodiff_getoffset

Conversation

@kaizhangNV
Contributor

Add custom derivatives for getOffset and simplify ILayer.eval to single parameter address

  • Add [Differentiable] and [ForwardDerivative(fwd_getOffset)] to getOffset on all IPointerLikeAddress implementations (BindlessAddress, PointerAddress, TorchTensorViewAddress, Ptr extension) so autodiff can propagate DifferentialPtrPair through address offset computation.

  • Simplify ILayer.eval from eval(input, weightAddr, biasAddr) to eval(input, parameterAddr). FFLayer.eval now internally computes the bias address via parameterAddr.getOffset(weightCount), keeping address arithmetic inside the differentiable function.

  • Rename basic-ilayer-ffn-training-test to basic-ilayer-ffn-forward-test and add basic-ilayer-ffn-backward-test that verifies gradients through a 2-layer FFN with getOffset called inside the differentiable function.

  • Update all test call sites to use the new single-address eval API.
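The parameter layout the new API assumes (weights for an Out×In layer at offset 0, bias immediately after, addressed via `parameterAddr.getOffset(weightCount)`) can be illustrated with a small conceptual Python analogue of the new single-address `eval`. Names and shapes here are illustrative only, not the actual Slang implementation:

```python
def eval_layer(x, params, in_size, out_size):
    # Weights occupy out_size * in_size scalars at offset 0; the bias
    # follows immediately, mirroring parameterAddr.getOffset(weightCount).
    weight_count = out_size * in_size
    out = []
    for row in range(out_size):
        acc = params[weight_count + row]  # bias element for this row
        for col in range(in_size):
            acc += params[row * in_size + col] * x[col]
        out.append(acc)
    return out

params = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # 2x2 weights, then bias [4, 5]
print(eval_layer([1.0, 1.0], params, in_size=2, out_size=2))  # [5.0, 10.0]
```

The point of the contiguous layout is that the caller only hands over one base address; all further address arithmetic stays inside the differentiable function.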

@kaizhangNV kaizhangNV requested a review from a team as a code owner March 11, 2026 03:55
@kaizhangNV kaizhangNV requested review from bmillsNV and Copilot and removed request for a team March 11, 2026 03:55
@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Consolidates weights and biases into a single contiguous parameter address for layers and adds forward-mode differentiable support for getOffset across multiple pointer-like address types, via @Differentiable and ForwardDerivative(fwd_getOffset) plus static fwd_getOffset helpers.

Changes

Cohort / File(s) Summary
Differentiability Support
source/standard-modules/neural/bindless-storage.slang
Annotates getOffset with @Differentiable and ForwardDerivative(fwd_getOffset) for BindlessAddress, PointerAddress, TorchTensorViewAddress, and the internal Ptr extension; adds static fwd_getOffset(DifferentialPtrPair<This>, int) implementations that propagate primal and differential components via DifferentialPtrPair.
Core Interface & Implementations
source/standard-modules/neural/ilayer.slang, source/standard-modules/neural/layers.slang
Changes ILayer.eval and FFLayer.eval signatures from (weightAddr, Optional biasAddr) to a single parameterAddr; internal bias offset is computed from the base parameter block.
New/Updated Tests — ILayer FFN Backward
tests/neural/basic-ilayer-ffn-backward-test.slang
Adds a new autodiff backward unit test for a two-layer ILayer FFN that exercises differentiable getOffset in gradient propagation; sets up parameter buffers, runs backward pass, and asserts expected parameter gradients.
Updated Tests — Forward & Frontend Smoke
tests/neural/basic-ilayer-ffn-forward-test.slang, tests/neural/basic-ilayer-frontend-smoke-test.slang
Refactors tests to pass a single per-layer base address to eval instead of separate weight/bias addresses; updates helper signatures (e.g., evalAsLayer) and call sites accordingly.
Updated Tests — FFLayer Autodiff Variants
tests/neural/fflayer-autodiff-backward-test.slang, tests/neural/fflayer-no-bias-test.slang, tests/neural/fflayer-wavetangled-vector-test.slang
Unifies autodiff wrappers to accept single paramAddr; simplifies backward paths to use a single DifferentialPtrPair for params so getOffset derivatives participate in autodiff.
Storage & Unit Test Adjustments
tests/neural/fflayer-two-storage-forward-test.slang, tools/gfx-unit-test/neural-tensorview-address.slang
Consolidates weight/bias storage into a single params uniform and switches address types (e.g., to BindlessAddress<float>); updates signatures and callers to use unified parameter addressing.
Activation Test Update
tests/neural/activation-with-fflayer-test.slang
Updates calls to layer.eval to pass a single base parameter address and removes per-parameter address computations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A little hop through offsets and code,
One block of params down the new road,
Derivatives carried, forward they prance,
Pointers pair up and join the dance,
Hooray — a rabbit's happy code-glad dance! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check ✅ Passed: The title accurately captures the primary change: adding custom derivatives (via @Differentiable and @ForwardDerivative attributes) for the getOffset method across address types.
  • Description check ✅ Passed: The description is directly related to the changeset, detailing the derivative annotations, API simplification, test updates, and the overall objective of enabling autodiff through address offset computation.
  • Docstring coverage ✅ Passed: No functions found in the changed files to evaluate docstring coverage; check skipped.


Contributor

Copilot AI left a comment


Pull request overview

This PR updates the neural layer parameter addressing model to support autodiff through address arithmetic, by making getOffset differentiable and switching ILayer/FFLayer.eval to accept a single base parameter address for a contiguous (weights-then-bias) parameter block.

Changes:

  • Add custom forward derivatives for getOffset across IPointerLikeAddress implementations so autodiff can propagate DifferentialPtrPair through pointer offsets.
  • Simplify ILayer.eval / FFLayer.eval to eval(input, parameterAddr) and compute bias address internally via parameterAddr.getOffset(weightCount).
  • Update/extend neural tests to use the new API and add a backward test that exercises getOffset inside the differentiable function.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tools/gfx-unit-test/neural-tensorview-address.slang Switch unit test to single-parameter-address eval and route grads through a single DifferentialPtrPair.
tests/neural/fflayer-wavetangled-vector-test.slang Update WaveTangledVector FFLayer tests to the new eval(input, baseAddr) signature.
tests/neural/fflayer-two-storage-forward-test.slang Convert test to contiguous parameter storage and update call sites to the new API.
tests/neural/fflayer-no-bias-test.slang Update no-bias layer tests/wrappers to pass a base parameter address.
tests/neural/fflayer-autodiff-backward-test.slang Update autodiff backward test to exercise getOffset inside eval.
tests/neural/basic-ilayer-frontend-smoke-test.slang Update ILayer-constrained call site to new single-address eval.
tests/neural/basic-ilayer-ffn-forward-test.slang Update 2-layer FFN forward test to use per-layer base offsets.
tests/neural/basic-ilayer-ffn-backward-test.slang New backward test verifying gradients through a 2-layer FFN with internal getOffset.
tests/neural/activation-with-fflayer-test.slang Update activation + FFLayer tests to pass only the base address.
source/standard-modules/neural/layers.slang Change FFLayer.eval signature and compute bias address internally via getOffset.
source/standard-modules/neural/ilayer.slang Change ILayer.eval signature to accept a single contiguous-parameter base address.
source/standard-modules/neural/bindless-storage.slang Add [Differentiable] + [ForwardDerivative] for getOffset on address types.

Comment on lines +259 to +264:

    static DifferentialPtrPair<This> fwd_getOffset(DifferentialPtrPair<This> self, int elements)
    {
        return DifferentialPtrPair<This>(
            self.p.getOffset(elements),
            self.d.getOffset(elements));
    }

Copilot AI Mar 11, 2026


TorchTensorViewAddress.getOffset is marked [require(cuda)], but its forward derivative helper fwd_getOffset is not. On non-CUDA targets this helper may still be type-checked/compiled and it calls getOffset, which is CUDA-only, potentially causing compilation errors. Mark fwd_getOffset with the same [require(cuda)] (or otherwise ensure it is excluded on non-CUDA targets).

@@ -1,4 +1,4 @@
// Test FFLayer eval() with separate weight/bias address instances.
// Test FFLayer eval() with contiguous parameter storage.

Copilot AI Mar 11, 2026


This test now verifies contiguous parameter storage rather than separate weight/bias storage, but the file name (fflayer-two-storage-forward-test.slang) still suggests the old behavior. Consider renaming the file to match the updated intent to keep the test suite easier to navigate.

Suggested change
// Test FFLayer eval() with contiguous parameter storage.
// Test FFLayer eval() with contiguous parameter storage (note: file name
// `fflayer-two-storage-forward-test.slang` is legacy and now refers to this
// contiguous-parameter layout test).

github-actions[bot]

This comment was marked as outdated.

Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/neural/basic-ilayer-ffn-forward-test.slang (1)

49-54: Prefer deriving the second layer base from Layer1.ParameterCount.

Line 51 hardcodes the current packing size as 6. Using the layer constant keeps this test aligned if the parameter layout changes.

♻️ Suggested cleanup
     // Compute base addresses for each layer's parameter block
     let layer1Addr = baseAddr.getOffset(0);
-    let layer2Addr = baseAddr.getOffset(6);
+    let layer2Addr = baseAddr.getOffset(Layer1.ParameterCount);

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 804adb37-6a5c-42a1-9b9c-c186c5912b1d

📥 Commits

Reviewing files that changed from the base of the PR and between 38b5b83 and e5cfe4d.

📒 Files selected for processing (12)
  • source/standard-modules/neural/bindless-storage.slang
  • source/standard-modules/neural/ilayer.slang
  • source/standard-modules/neural/layers.slang
  • tests/neural/activation-with-fflayer-test.slang
  • tests/neural/basic-ilayer-ffn-backward-test.slang
  • tests/neural/basic-ilayer-ffn-forward-test.slang
  • tests/neural/basic-ilayer-frontend-smoke-test.slang
  • tests/neural/fflayer-autodiff-backward-test.slang
  • tests/neural/fflayer-no-bias-test.slang
  • tests/neural/fflayer-two-storage-forward-test.slang
  • tests/neural/fflayer-wavetangled-vector-test.slang
  • tools/gfx-unit-test/neural-tensorview-address.slang

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
source/standard-modules/neural/bindless-storage.slang (1)

82-89: ⚠️ Potential issue | 🟠 Major

Keep BindlessAddress.getOffset() signed until the final validation/cast.

Line 84 stores baseIndex + elements into uint, while the other getOffset() implementations keep elements in signed pointer arithmetic. Because IPointerLikeAddress.getOffset() takes int, this makes negative offsets behave differently for BindlessAddress, and fwd_getOffset() will propagate the same wrapped index into the differential address. Either reject negative offsets consistently at the interface boundary or do the arithmetic in a signed type before the final cast.

Possible direction
     public This getOffset(int elements)
     {
-        uint newBaseIndex = baseIndex + elements;
+        int newBaseIndex = int(baseIndex) + elements;
+        // Reject `newBaseIndex < 0` here if bindless addresses are not
+        // allowed to move backwards.
 
         This address = This(handle);
-        address.baseIndex = newBaseIndex;
+        address.baseIndex = uint(newBaseIndex);
         return address;
     }

Also applies to: 91-95
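The concern above can be reproduced outside Slang. A hedged Python sketch (emulating 32-bit uint addition; the function names are illustrative, not the actual code) shows how a negative offset silently wraps under unsigned arithmetic but stays checkable under signed arithmetic:

```python
def get_offset_unsigned(base_index, elements):
    # Emulates `uint newBaseIndex = baseIndex + elements` with 32-bit wraparound.
    return (base_index + elements) & 0xFFFFFFFF

def get_offset_signed(base_index, elements):
    # Signed arithmetic first; a range check and cast can follow.
    return base_index + elements

print(get_offset_signed(2, -5))    # -3: can be rejected by a range check
print(get_offset_unsigned(2, -5))  # 4294967293: silently wrapped
```

This is why the review suggests doing the arithmetic in a signed type and deferring the cast to uint until after validation.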



📥 Commits

Reviewing files that changed from the base of the PR and between e5cfe4d and a9ef926.

📒 Files selected for processing (1)
  • source/standard-modules/neural/bindless-storage.slang

Comment on lines +80 to +82:

    [Differentiable]
    [ForwardDerivative(fwd_getOffset)]
    public This getOffset(int elements)
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In current Slang autodiff documentation, can a method marked [Differentiable] legally return a type that implements IDifferentiablePtrType, or are such types restricted to input parameters only?

💡 Result:

In Slang’s current autodiff docs, a function/method marked [Differentiable] may not return a type that implements IDifferentiablePtrType. Types implementing IDifferentiablePtrType are restricted to input (in) parameters only (i.e., they cannot be used as out, inout, or return types), and attempting to do so is a compile-time error. [1]


getOffset() methods cannot be marked [Differentiable] because they return IDifferentiablePtrType.

In Slang's autodiff system, types implementing IDifferentiablePtrType are restricted to input parameters only and cannot be returned from [Differentiable] methods—this is a compile-time error per the documented rules. The current annotations on these methods violate that restriction.

Also applies to: 141–143, 250–252, 330–332

Contributor

@github-actions github-actions bot left a comment


PR Review: Add custom derivatives for getOffset

Summary

This PR makes two coupled changes: (1) adds [Differentiable] + [ForwardDerivative(fwd_getOffset)] to getOffset on all four IPointerLikeAddress implementations so autodiff can propagate DifferentialPtrPair through address offset computation, and (2) simplifies ILayer.eval from eval(input, weightAddr, biasAddr) to eval(input, parameterAddr), moving the bias address computation inside the differentiable function. This is a clean design improvement — the old API forced callers to split addresses outside the differentiable function; the new API keeps address arithmetic inside the differentiable scope where it belongs.

What looks good

  • Custom derivative pattern is correct. The fwd_getOffset implementations follow the established pattern from tests/autodiff/diff-ptr-type-custom.slang — constructing DifferentialPtrPair<This> by calling getOffset(elements) on both .p and .d.
  • no_diff() usage is consistent. Applied to PointerAddress and Ptr extension (which do pointer arithmetic the compiler can't differentiate), correctly omitted from BindlessAddress (uint arithmetic) and TorchTensorViewAddress (body bypassed by custom derivative).
  • Bias offset calculation is correct. OutputVector.Size * InputVector.Size matches the LinearLayout weight storage convention (weights are Out × In scalars at offset 0, bias follows immediately).
  • The Ptr extension's fwd_getOffset uses self.p + elements instead of .getOffset(elements), avoiding potential infinite recursion through the derivative chain. Good choice.
  • New backward test is well-designed. The 2-layer FFN with getOffset(0) and getOffset(6) inside the differentiable function directly exercises the custom derivative with non-trivial offsets, and verifies all 9 gradient values.
  • Test updates are thorough. All existing call sites are updated to the new 2-param API.
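The derivative pattern praised above, applying the same offset to both halves of the pair, can be sketched in a few lines of Python. `DiffPtrPair` here is a stand-in for Slang's `DifferentialPtrPair`, not a real API:

```python
from dataclasses import dataclass

@dataclass
class DiffPtrPair:
    p: int  # primal base index
    d: int  # differential base index

def fwd_get_offset(pair, elements):
    # Mirror fwd_getOffset: shift primal and differential by the same amount
    # so gradient storage stays aligned with the primal parameters.
    return DiffPtrPair(pair.p + elements, pair.d + elements)

base = DiffPtrPair(p=0, d=100)
weights = fwd_get_offset(base, 0)
bias = fwd_get_offset(base, 6)  # weightCount = 6 in the 2-layer test
print(bias.p, bias.d)  # 6 106
```

Keeping the two halves in lockstep is what lets gradients written through the differential address land at the slot corresponding to each primal parameter.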

Issues

  1. [Minor] Missing [require(cuda)] on TorchTensorViewAddress.fwd_getOffset — see inline comment. All other methods on this type have the attribute.

  2. [Question] IPointerLikeAddress.getOffset interface declaration lacks [Differentiable] — see inline comment on istorages.slang. This works today due to [ForceInline] on FFLayer.eval causing specialization before autodiff, but may be fragile for future non-inlined generic usage.

Observations (non-blocking)

  • The new backward test only exercises BindlessAddress. PointerAddress coverage comes indirectly from the updated fflayer-autodiff-backward-test.slang (which has TEST_POINTER=1). The Ptr extension derivative has no runtime test coverage (tests are disabled due to tracked issues #8630, #8631, #8834), which is understandable.
  • The (compute, vulkan) category tag on metal/cuda test lines in the new backward test follows the existing convention in the neural test directory — not a copy-paste issue.

Overall this is a well-structured change that both enables a needed autodiff capability and simplifies the public API. The breaking label is appropriate.


    [ForceInline]
    [require(cuda_glsl_hlsl_metal_spirv, sm_6_6)]
    public void atomicAdd(uint index, T value)
Contributor


nit: Missing [require(cuda)] on fwd_getOffset.

The primal getOffset on TorchTensorViewAddress (and the __init, __subscript) are all marked [require(cuda)]. The new fwd_getOffset should be consistent:

Suggested change:

    - public void atomicAdd(uint index, T value)
    + [require(cuda)]
    + static DifferentialPtrPair<This> fwd_getOffset(DifferentialPtrPair<This> self, int elements)

Without it, a non-CUDA target could theoretically resolve this derivative function even though the primal is CUDA-only, producing a confusing error instead of a clean capability mismatch.

//
//TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-vk -compute -shaderobj -xslang -experimental-feature -output-using-type -emit-spirv-directly
//TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-mtl -compute -shaderobj -output-using-type -xslang -experimental-feature
//TEST(compute, vulkan):COMPARE_COMPUTE_EX(filecheck-buffer=BUFFER):-cuda -compute -shaderobj -output-using-type -capability cuda_sm_7_0 -xslang -experimental-feature
Contributor


suggestion: Consider adding a PointerAddress test variant as well.

This backward test only exercises BindlessAddress. The existing fflayer-autodiff-backward-test.slang does cover PointerAddress (via TEST_POINTER=1), and that test is updated in this PR to call getOffset inside the differentiable function. However, since this new test is the primary one that exercises multi-layer getOffset inside a differentiable context (with non-trivial offset=6), a TEST_POINTER=1 CUDA variant here would strengthen coverage of PointerAddress.fwd_getOffset in the multi-layer scenario.

Not blocking — the existing backward test suite provides reasonable coverage.


Labels

pr: breaking change PRs with breaking changes
