Add custom_op executor to the default executors to avoid hidden registration through thunder.torch.custom_op._register_custom_op #2714

Merged
t-vi merged 6 commits into main from
copilot/fix-58386951-773894671-a6547d3b-c587-4286-9dfd-b11bf9ce30d7
Nov 11, 2025

Contributor

Copilot AI commented Nov 3, 2025

As per title. Currently thunder.torch.custom_op._register_custom_op adds custom_op_ex to the default executor list.

I originally chose that approach because I did not like seeing an executor in the default list when it does nothing. Now I find the hidden behavior worse than having an empty executor in the default list. This hidden automatic registration might also have caused some of the test failures in #2710.
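To make the trade-off concrete, here is a minimal toy model in plain Python. It does not use thunder at all: `ExecutorRegistry`, `register_custom_op`, and the `always_include` parameter are hypothetical names invented for this sketch, and executors are represented by plain strings.

```python
# Toy model only: these names are hypothetical and do not mirror thunder's internals.
class ExecutorRegistry:
    def __init__(self, defaults, always_include=()):
        # Executors listed in always_include are part of the defaults from the start.
        self.defaults = list(always_include) + list(defaults)

    def register_custom_op(self, op_name, executor="custom_op"):
        # Hidden side effect: registering an op may also change the default list.
        if executor not in self.defaults:
            self.defaults.insert(0, executor)
        return op_name


# Variant 1: the executor is added lazily, as a side effect of registration.
hidden = ExecutorRegistry(["nvfuser", "torch", "python"])
before = list(hidden.defaults)
hidden.register_custom_op("my_op")
after = list(hidden.defaults)
# `before` and `after` differ, so any code that inspects the defaults
# sees different results depending on whether an op was registered earlier.

# Variant 2: the executor is always part of the defaults.
explicit = ExecutorRegistry(["nvfuser", "torch", "python"], always_include=["custom_op"])
explicit_before = list(explicit.defaults)
explicit.register_custom_op("my_op")
explicit_after = list(explicit.defaults)
# Here the default list is identical before and after registration.
```

In the second variant the cost is an executor that may sit unused in the list, which is the trade-off described above.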

Copilot AI changed the title [WIP] custom_op executor AssertionError when running different tests together Fix custom_op executor persisting in default executors after deregistration Nov 3, 2025
Copilot AI requested a review from crcrpar November 3, 2025 16:57
Copilot AI requested a review from crcrpar November 3, 2025 17:04
Collaborator

crcrpar commented Nov 5, 2025

@Copilot could you change how we add custom_op_ex to the default executors? I mean, change that so that custom_op_ex is always included in the default executors even when _register_custom_op is not used at all.

Ref:

# NOTE: `thunder.extend.add_default_executor` basically does `lst.insert(0, ex)`.
if custom_op_ex not in get_default_executors():
    default_executors = get_default_executors()
    new_default_executors = add_executor_lists(default_executors, [custom_op_ex])
    set_default_executors(new_default_executors)
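The NOTE in the snippet hints at why the position in the list matters. The sketch below contrasts the two ways of adding an executor using plain lists of strings. Both helpers are hypothetical stand-ins written for illustration: `insert_front` follows what the NOTE says about `add_default_executor`, while `merge_lists` assumes that `add_executor_lists` performs an order-preserving merge without duplicates, which is an assumption and not something confirmed here.

```python
# Hypothetical stand-ins; plain strings play the role of executor objects.
def insert_front(defaults, ex):
    """Front insertion, as the NOTE describes for add_default_executor."""
    result = list(defaults)
    if ex not in result:
        result.insert(0, ex)
    return result


def merge_lists(defaults, extra):
    """Order-preserving merge without duplicates (assumed behaviour)."""
    merged = []
    for ex in list(defaults) + list(extra):
        if ex not in merged:
            merged.append(ex)
    return merged


defaults = ["nvfuser", "torch", "python"]
front = insert_front(defaults, "custom_op")    # new executor ends up first
merged = merge_lists(defaults, ["custom_op"])  # new executor ends up last
merged_again = merge_lists(merged, ["custom_op"])  # merging twice changes nothing
```

If list order encodes priority, the two approaches place the new executor at opposite ends, so the choice between them is not cosmetic.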

@crcrpar crcrpar changed the title Fix custom_op executor persisting in default executors after deregistration Add custom_op executor to the default executors to avoid hidden registration through thunder.torch.custom_op._register_custom_op Nov 6, 2025
@crcrpar crcrpar marked this pull request as ready for review November 10, 2025 10:57
Copilot AI and others added 5 commits November 10, 2025 02:57
…ation

When _deregister_custom_op is called, it now checks if there are no more
custom ops registered and removes the custom_op_ex executor from the
default executors list. This prevents the executor from persisting across
test runs when no custom ops are actually registered.

Co-authored-by: crcrpar <16191443+crcrpar@users.noreply.github.com>

Add test_custom_op_executor_cleanup to ensure that the custom_op executor
is properly removed from default executors when all custom ops are
deregistered. This prevents the bug from recurring where the executor would
persist across test runs.

Co-authored-by: crcrpar <16191443+crcrpar@users.noreply.github.com>

- Simplified custom_op_ex removal logic by removing unnecessary check
- Removed placeholder issue URL from test docstring

Co-authored-by: crcrpar <16191443+crcrpar@users.noreply.github.com>

Added detailed explanation in test docstring showing how the test directly
addresses the original issue where test_recipes.py would fail after
test_torch_library_custom_op.py due to custom_op remaining in
get_expected_executors(). The test simulates the exact sequence that was
failing and verifies the fix prevents the executor from persisting.

Co-authored-by: crcrpar <16191443+crcrpar@users.noreply.github.com>
Signed-off-by: Masaki Kozuki <mkozuki@nvidia.com>
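
The cleanup-on-deregistration idea from the first commit message above (an earlier iteration, before the PR was retitled to always include the executor) can be sketched with a toy registry. This is not thunder's code: `CustomOpRegistry`, `register`, `deregister`, and `EXECUTOR` are hypothetical names, and executors are plain strings.

```python
# Toy model of "remove the executor once the last custom op is gone".
EXECUTOR = "custom_op"  # hypothetical stand-in for custom_op_ex


class CustomOpRegistry:
    def __init__(self, defaults):
        self.defaults = list(defaults)
        self.ops = set()

    def register(self, name):
        self.ops.add(name)
        # Registration adds the executor to the defaults if it is missing.
        if EXECUTOR not in self.defaults:
            self.defaults.insert(0, EXECUTOR)

    def deregister(self, name):
        self.ops.discard(name)
        # Once no custom ops remain, drop the executor from the defaults.
        if not self.ops and EXECUTOR in self.defaults:
            self.defaults.remove(EXECUTOR)


baseline = ["nvfuser", "torch", "python"]
registry = CustomOpRegistry(baseline)
registry.register("op_a")
registry.register("op_b")
registry.deregister("op_a")
still_present = EXECUTOR in registry.defaults  # op_b is still registered
registry.deregister("op_b")
restored = registry.defaults == baseline       # defaults are back to the baseline
```

The sequence at the bottom mirrors what a cleanup test would check: the defaults after full deregistration match the defaults before any registration.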
@crcrpar crcrpar force-pushed the copilot/fix-58386951-773894671-a6547d3b-c587-4286-9dfd-b11bf9ce30d7 branch from 9c10f0a to b0e9af1 Compare November 10, 2025 10:57
Collaborator

kiya00 commented Nov 10, 2025

It seems the Lit job has an error:

=========================== short test summary info ============================
FAILED thunder/tests/test_recipes.py::test_plugins_composition - AssertionError: assert 'custom_op' in ['__ad_hoc_executor_129236491690128', 'transformer_engine_v1', 'nvfuser']
 +  where 'custom_op' = thunder.extend.OperatorExecutor('custom_op').name
FAILED thunder/tests/test_recipes.py::test_plugins_basics - AssertionError: assert 'custom_op' in ['__ad_hoc_executor_129236439815024', '__ad_hoc_executor_129236439825584', 'nvfuser', 'torch', 'python']
 +  where 'custom_op' = thunder.extend.OperatorExecutor('custom_op').name
= 2 failed, 2212 passed, 207 skipped, 21 xfailed, 7 xpassed, 46774 warnings in 879.83s (0:14:39) =

Is it related?

Signed-off-by: Masaki Kozuki <mkozuki@nvidia.com>
Collaborator

@kshitij12345 kshitij12345 left a comment

LGTM, thanks @crcrpar

NOTE to reviewers: Also tested with

def test_get_all_executors_includes_all_native_executors():
    executors = get_all_executors()
    actual = {e.name for e in executors}
    # apex and transformer_engine register the executor even if the external library they rely on is not available.
    expected = {
        "apex",
        "custom_op",

Collaborator

@kiya00 kiya00 left a comment

LGTM, thank you!

Collaborator

@t-vi t-vi left a comment

@t-vi t-vi merged commit 53c3d23 into main Nov 11, 2025
52 checks passed
@t-vi t-vi deleted the copilot/fix-58386951-773894671-a6547d3b-c587-4286-9dfd-b11bf9ce30d7 branch November 11, 2025 11:45

5 participants