Add XNNPACK backend option for workspace sharing (re-land) #13934

GregoryComer · 2025-09-04T03:09:37Z

Summary:
Note: This is a re-land, fixing a use-after-free that occurred when destroying a delegate instance. Destroying the executor frees the workspace, but the mutex that raii_lock points to is owned by the workspace, so raii_lock touched freed memory when it went out of scope. This is fixed by taking an owning reference to the workspace in destroy.
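To make the lifetime issue concrete, here is a minimal sketch of the fix pattern. The `Workspace` and `Executor` types below are hypothetical stand-ins rather than the real ExecuTorch classes, and the return value exists only so the behavior can be observed. The point is that the owning `shared_ptr` is declared before the lock, so it is released after the lock.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-ins for the PR's workspace and executor types.
struct Workspace {
  std::mutex mutex;  // every lock handed out by acquire() refers to this member

  std::pair<std::unique_lock<std::mutex>, Workspace*> acquire() {
    return {std::unique_lock<std::mutex>(mutex), this};
  }
};

struct Executor {
  std::shared_ptr<Workspace> workspace;
  std::shared_ptr<Workspace> get_workspace() const { return workspace; }
};

// Destroys the executor while holding the workspace lock. Returns how many
// owners the workspace has once the executor is gone (for illustration only).
long destroy(std::unique_ptr<Executor> executor) {
  assert(executor && executor->workspace);
  // Owning reference: declared before the lock, so it is released after it.
  auto workspace = executor->get_workspace();
  auto [raii_lock, raw] = workspace->acquire();
  (void)raw;
  executor.reset();  // drops the executor's reference; the mutex stays alive
  return workspace.use_count();
  // On return, raii_lock unlocks first, then `workspace` releases the object.
}
```

Without the local `workspace` handle, `executor.reset()` could free the mutex while `raii_lock` still refers to it. Locals are destroyed in reverse declaration order, which is what makes this ordering safe.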

Add a backend option for XNNPACK to enable runtime control of workspace sharing. I've added three modes: Disabled, PerModel, and Global. PerModel shares the workspace between all CALL_DELEGATE instances in a model, keyed by memory allocator address (see below). Global uses a single workspace instance.
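As a rough illustration of how the three modes could differ, the sketch below resolves a workspace per delegate instance. The names `SharingMode`, `WorkspaceRegistry`, and `resolve` are hypothetical, the default mode is an assumption, and tracking shared workspaces through `weak_ptr` is an assumption of this sketch rather than something the PR states.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

// The three modes named in the description. All identifiers are hypothetical.
enum class SharingMode { Disabled, PerModel, Global };

struct Workspace {};  // placeholder for the real workspace handle

class WorkspaceRegistry {
 public:
  void set_mode(SharingMode mode) {
    std::lock_guard<std::mutex> guard(mutex_);
    mode_ = mode;
  }

  // Called once per delegate instance at init time. `model_key` stands in for
  // the runtime allocator address used to tell models apart.
  std::shared_ptr<Workspace> resolve(const void* model_key) {
    std::lock_guard<std::mutex> guard(mutex_);
    switch (mode_) {
      case SharingMode::Global:
        return lock_or_create(global_);
      case SharingMode::PerModel:
        assert(model_key != nullptr);
        return lock_or_create(per_model_[model_key]);
      case SharingMode::Disabled:
        break;
    }
    return std::make_shared<Workspace>();  // private workspace, never shared
  }

 private:
  // Reuses the workspace if some delegate still owns it, else creates one.
  static std::shared_ptr<Workspace> lock_or_create(
      std::weak_ptr<Workspace>& slot) {
    auto existing = slot.lock();
    if (!existing) {
      existing = std::make_shared<Workspace>();
      slot = existing;
    }
    return existing;
  }

  std::mutex mutex_;
  SharingMode mode_ = SharingMode::Disabled;  // assumed default
  std::weak_ptr<Workspace> global_;
  std::unordered_map<const void*, std::weak_ptr<Workspace>> per_model_;
};
```

In this sketch, two delegates that pass the same key in PerModel mode get the same workspace, while Disabled always hands out a fresh one.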

I've written the code to allow for the active workspace mode to be safely changed at any time. The workspace instance is resolved at delegate instance init time (model load) and is stored in the XNNExecutor instance. This design will also allow us to set per-model sharing options in the future. I've introduced a wrapper class (XNNWorkspace) to help with synchronization.
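One way to picture the wrapper is a class that only exposes the raw workspace together with a held lock. This is a sketch under assumptions: `SharedWorkspace`, `RawWorkspace`, and `Delegate` are hypothetical names, and the real XNNWorkspace interface may differ. Because each delegate stores the handle it resolved at init, a later mode change would not affect it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

struct RawWorkspace {
  int uses = 0;  // placeholder state standing in for scratch memory
};

// Hypothetical wrapper pairing the workspace with the mutex that guards it.
class SharedWorkspace {
 public:
  // Returns the lock together with the raw pointer, so callers can only reach
  // the workspace while they hold the lock.
  std::pair<std::unique_lock<std::mutex>, RawWorkspace*> acquire() {
    return {std::unique_lock<std::mutex>(mutex_), &raw_};
  }

  // Convenience accessor for inspection; takes the lock like any other user.
  int uses() {
    auto [lock, raw] = acquire();
    return raw->uses;
  }

 private:
  std::mutex mutex_;
  RawWorkspace raw_;
};

// Stand-in for a delegate instance: the workspace is resolved once at init
// and kept for the lifetime of the instance.
class Delegate {
 public:
  explicit Delegate(std::shared_ptr<SharedWorkspace> workspace)
      : workspace_(std::move(workspace)) {
    assert(workspace_ != nullptr);
  }

  void execute() {
    auto [lock, raw] = workspace_->acquire();  // serialized with other users
    ++raw->uses;
  }

 private:
  std::shared_ptr<SharedWorkspace> workspace_;
};
```

Two `Delegate` objects built from the same `std::shared_ptr<SharedWorkspace>` take turns on the workspace, whether they run on one thread or several.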

With regard to the PerModel behavior, I am using the address of the runtime allocator to disambiguate the model. This is not ideal in the long run, but there is a larger discussion around generating IDs in a coherent manner in multithreaded environments without synchronization in the core runtime. This might require PAL changes (exposing a thread ID, for example), so I intend to come back to this.

It should be possible to transparently update this logic in the future. The program ID can collide or change without affecting correctness, but may increase memory (for collisions) or enforce extra synchronization (if unstable between delegate instances in a method).

I'd like to add a PerMethod mode as a follow-up. This should be keyed to the specific method instance (not name), such that multiple method instances for the same method can be loaded for execution on different threads without forcing synchronization, but still allow sharing between call delegate instances in each method instance. This will require a unique method identifier.

Differential Revision: D81647105

pytorch-bot · 2025-09-04T03:09:40Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13934

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit b1ce3e9 with merge base 07d1092:

NEW FAILURE - The following job has failed:

pull / test-samsung-models-linux / linux-job (gh)
RuntimeError: Command docker exec -t 2014338df3455dc0a2344bc068bfe013549c9f3125ad6613dc00f7a3fe507e27 /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-09-04T03:09:49Z

This pull request was exported from Phabricator. Differential Revision: D81647105

GregoryComer · 2025-09-04T23:45:37Z

@digantdesai Any concerns with the changes I've made here?

digantdesai · 2025-09-08T16:06:24Z

backends/xnnpack/runtime/XNNPACKBackend.cpp

+      // shared_ptr, as the pointer in the executor is freed, which includes
+      // the mutex referenced by raii_lock.
+      auto workspace = executor->get_workspace();
+      auto [raii_lock, _] = workspace->acquire();


Is this the new change with the re-land?

GregoryComer · 2025-09-08

Yeah, that's correct. It keeps a shared_ptr handle to the XNNWorkspace object alive during destruction (the mutex lives inside it). I'm open to other ideas as well. It's slightly awkward with the current design.

digantdesai · 2025-09-08

Alright, let's do it. Good luck!

facebook-github-bot · 2025-09-11T22:38:27Z

@GregoryComer has exported this pull request. If you are a Meta employee, you can view the originating diff in D81647105.

facebook-github-bot · 2025-09-19T22:48:53Z

@GregoryComer has exported this pull request. If you are a Meta employee, you can view the originating diff in D81647105.

Differential Revision: D81647105 Pull Request resolved: pytorch#13934

GregoryComer requested review from cccclai and digantdesai as code owners September 4, 2025 03:09

meta-cla bot added the CLA Signed label Sep 4, 2025

facebook-github-bot added the fb-exported label Sep 4, 2025

GregoryComer added the release notes: xnnpack label Sep 4, 2025

digantdesai reviewed Sep 8, 2025

digantdesai approved these changes Sep 8, 2025

GregoryComer force-pushed the export-D81647105 branch from e27afff to 52ff3e2 on September 11, 2025 22:37

facebook-github-bot added the meta-exported label Sep 11, 2025

GregoryComer force-pushed the export-D81647105 branch from 52ff3e2 to b1ce3e9 on September 19, 2025 22:48

facebook-github-bot merged commit a523306 into pytorch:main Sep 20, 2025
125 of 128 checks passed

StrycekSimon pushed a commit to nxp-upstream/executorch that referenced this pull request Sep 23, 2025

Add XNNPACK backend option for workspace sharing (re-land)

a547a24

Differential Revision: D81647105 Pull Request resolved: pytorch#13934
