Fix KeyError for args_tensor_mask removed in PyTorch nightly by IvanYashchuk · Pull Request #2804 · Lightning-AI/lightning-thunder

IvanYashchuk · 2025-12-15T12:28:07Z

A recent PyTorch PR #166788 changed the autograd_function_apply higher-order op API by removing the args_tensor_mask keyword argument.

Currently, Thunder reads this key from the fwd_kwargs dict unconditionally, which leads to a KeyError:

lightning-thunder/thunder/core/jit_ext.py

Line 943 in fb989d4

args_tensor_mask = unwrap(fwd_kwargs["args_tensor_mask"])

This PR adds a compatibility layer for both older PyTorch releases (which still provide args_tensor_mask) and newer PyTorch nightly versions (which removed it). When the key is not present, we infer the tensor mask by checking which arguments are TensorProxy instances.
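The fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual diff: the `TensorProxy` class here is a hypothetical stand-in for Thunder's own class, `resolve_args_tensor_mask` is a name invented for this example, and the `unwrap` call from the original line is left out.

```python
from typing import Any, Sequence


class TensorProxy:
    """Hypothetical stand-in for Thunder's TensorProxy, only here to keep the sketch runnable."""

    def __init__(self, name: str) -> None:
        self.name = name


def resolve_args_tensor_mask(fwd_args: Sequence[Any], fwd_kwargs: dict[str, Any]) -> list[bool]:
    """Return one boolean per forward argument: True where the argument is a tensor."""
    if "args_tensor_mask" in fwd_kwargs:
        # Older PyTorch releases still pass the mask explicitly, so use it as given.
        return list(fwd_kwargs["args_tensor_mask"])
    # Newer nightlies omit the key, so infer the mask from the argument types.
    return [isinstance(arg, TensorProxy) for arg in fwd_args]


# Illustrative inputs (assumed, not taken from the PR): two tensors and one Python scalar.
args = (TensorProxy("x"), 2.0, TensorProxy("w"))
old_style = resolve_args_tensor_mask(args, {"args_tensor_mask": [True, False, True]})
new_style = resolve_args_tensor_mask(args, {})
print(old_style, new_style)
```

Checking for the key rather than comparing PyTorch version strings keeps the branch independent of how nightly builds are numbered, which is presumably why the PR describes the check in terms of key presence.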

Let's see in CI whether further changes are needed anywhere else.

Fixes #2803.

Copilot

Pull request overview

This PR adds a compatibility layer to handle the removal of the args_tensor_mask keyword argument from PyTorch's autograd_function_apply higher-order op API (removed in PyTorch PR #166788). The fix ensures Thunder works with both older PyTorch versions (that provide args_tensor_mask) and newer nightly versions (that removed it).

Key changes:

Added conditional logic to check if args_tensor_mask exists in fwd_kwargs before accessing it
When not present, infers the tensor mask by checking which arguments are TensorProxy instances


Copilot · 2025-12-15T12:30:43Z

thunder/core/jit_ext.py

+    # NOTE: `args_tensor_mask` was removed in PyTorch PR #166788.
+    # https://github.com/pytorch/pytorch/pull/166788/
+    # For backwards compatibility with older PyTorch versions, we check if it exists.
+    # When not present, we assume all fwd_args are tensors.


The comment states "When not present, we assume all fwd_args are tensors" but the implementation actually checks each argument to determine if it's a TensorProxy. The comment should be updated to accurately reflect what the code does: "When not present, we infer the tensor mask by checking if each argument is a TensorProxy".

Suggested change

-    # When not present, we assume all fwd_args are tensors.
+    # When not present, we infer the tensor mask by checking if each argument is a TensorProxy.

IvanYashchuk · 2025-12-15T14:31:08Z

Closing in favor of #2802.

Fix KeyError for args_tensor_mask removed in PyTorch PR #166788

700ba8a

Copilot AI review requested due to automatic review settings December 15, 2025 12:28

IvanYashchuk requested review from KaelanDt, lantiga and mruberry as code owners December 15, 2025 12:28

IvanYashchuk added the autograd and jit labels Dec 15, 2025

Copilot started reviewing on behalf of IvanYashchuk December 15, 2025 12:28

Copilot AI reviewed Dec 15, 2025

IvanYashchuk closed this Dec 15, 2025

IvanYashchuk wants to merge 1 commit into main from fix-2803
