@wwwind
Collaborator

@wwwind wwwind commented Jul 10, 2025

Add dump_delegate_data function

Signed-off-by: Elena Zhelezina [email protected]

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Change-Id: I0b7adcc7a754bb5dd825435104f9ba54f0367222

Signed-off-by: Elena Zhelezina <[email protected]>
Change-Id: Idd3821e5a987c8ef08ffae9e506afce23b3aa3b8
@pytorch-bot

pytorch-bot bot commented Jul 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12334

Note: Links to docs will display an error until the docs builds have been completed.

❌ 10 New Failures, 2 Cancelled Jobs, 1 Unrelated Failure

As of commit 74b59af with merge base bf4f0a3 (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jul 10, 2025
@wwwind wwwind added the labels partner: arm (for backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm), ciflow/trunk, and release notes: arm (changes to the ARM backend delegate) Jul 10, 2025
@wwwind wwwind requested a review from zingo July 10, 2025 13:38

def dump_delegate_data( # noqa: C901
self,
path: str,
Contributor

@JacobSzwejbka JacobSzwejbka Jul 10, 2025

Can you take a TextIO instead, like Dump_et_program above?

Edit: or really just any stream, since the data isn't text.
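
To illustrate the stream-based suggestion, here is a minimal sketch, assuming the delegate blobs are already available as `bytes` objects. The helper name `dump_blobs_to_stream` is hypothetical and is not the `dump_delegate_data` from this PR; it only shows that accepting a binary stream lets the caller pick a file or an in-memory buffer.

```python
import io
from typing import BinaryIO, Iterable


def dump_blobs_to_stream(blobs: Iterable[bytes], out: BinaryIO) -> int:
    """Write each blob to `out` in order and return the total bytes written.

    Hypothetical helper for illustration only; not part of ExecuTorch.
    """
    total = 0
    for blob in blobs:
        # write() on a binary stream returns the number of bytes written
        total += out.write(blob)
    return total


# The caller decides where the bytes go: here, an in-memory buffer.
buf = io.BytesIO()
written = dump_blobs_to_stream([b"\x00\x01", b"\x02"], buf)
```

The same call would work with a file opened in binary mode, e.g. `open(some_path, "wb")`, without the helper needing to know about paths or extensions.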

) -> None:
"""
Dumps the delegate blob out of backend_delegate_data to <path><extension>.
Must have been created with extract_delegate_segments=True.
Contributor

@JacobSzwejbka JacobSzwejbka Jul 10, 2025

Can you not find the blob if it's embedded in the flatbuffer section?

Contributor

@digantdesai digantdesai left a comment

Thanks @wwwind for the PR.

Is the goal of extracting the blobs to run a delegate subgraph on a TOSA reference model? If so, I would rather not take this approach of extracting the blobs and running them.

The main reason is e2e validation. You can have N blobs which will lead to N tosa_ref_sim calls and then you might have to stitch them together manually to see if the entire network worked correctly or not.

Alternatively, I propose we write a TOSA runtime that works with the TOSA partitioner, so that graph breaks can be handled seamlessly by the portable libs. This way you can re-use all the ET infra and have a working e2e TOSA PTE :)

@wwwind
Collaborator Author

wwwind commented Jul 11, 2025

Thank you for the review @digantdesai

Actually, this function is for cases where we don't need the e2e flow. We have a graphics use case where we get these blobs as .tosa files, and a plugin implements the surrounding pieces in shaders and passes some data in to run the subgraphs.

@digantdesai
Contributor

digantdesai commented Jul 14, 2025

> Thank you for the review @digantdesai
>
> Actually, this function is for cases where we don't need the e2e flow. We have a graphics use case where we get these blobs as .tosa files, and a plugin implements the surrounding pieces in shaders and passes some data in to run the subgraphs.

Ok. Curious why can't we dump them after generation from preprocess? Extracting feels a bit like going backwards.

@wwwind
Collaborator Author

wwwind commented Jul 14, 2025

I'll leave this to @per to answer, as I am not sure how this could be done after preprocess.

@per
Collaborator

per commented Jul 31, 2025

> Ok. Curious why can't we dump them after generation from preprocess? Extracting feels a bit like going backwards.

Agree, it is a bit like going backwards, but with a dedicated API instead of using different flags for the backend to dump the delegate data. For the Arm backends it will be usable for getting the delegate data out of the .pte for all our backends, while the primary driver is the VgfBackend and internal testing scenarios.
The gold standard would be to also have some dump method (akin to objdump) on the serialized .pte files (maybe an executorch.runtime.Program method) to be able to extract the delegate data for a program serialized to file.

@digantdesai
Contributor

> > Ok. Curious why can't we dump them after generation from preprocess? Extracting feels a bit like going backwards.
>
> Agree, it is a bit like going backwards, but with a dedicated API instead of using different flags for the backend to dump the delegate data. For the Arm backends it will be usable for getting the delegate data out of the .pte for all our backends, while the primary driver is the VgfBackend and internal testing scenarios. The gold standard would be to also have some dump method (akin to objdump) on the serialized .pte files (maybe an executorch.runtime.Program method) to be able to extract the delegate data for a program serialized to file.

Ok I don't have strong opinions, and don't want to block this. @JacobSzwejbka would be more appropriate for the review.

That said, thinking out loud: why wouldn't VGFPartitioner(dump_blobs=True) be a better interface? Let me think some more about your comment on a delegate-agnostic AoT API.

I am hesitant about the current approach for two reasons: (1) it doesn't let you avoid the AoT PTE generation flow, hence my opinion about the AoT API; and (2) a pte_objdump is being considered, but that means more FC/BC tooling maintenance around the PTE as a file format, especially if we expect users to go through the AoT flow anyway. It will also get more complicated with features such as NamedDataMap that share tensors across delegates.

@digantdesai
Contributor

Any update on this?

@wwwind
Collaborator Author

wwwind commented Aug 20, 2025

@digantdesai @per What is the final decision on this API? Do we go forward with it, or should I abandon this PR?

@github-actions

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the stale label (PRs inactive for over 60 days) Oct 20, 2025


6 participants