Skip to content

Conversation

@per
Copy link
Collaborator

@per per commented Oct 25, 2024

Split up ArmBackend into separate TOSA and Ethos-U backends.

Signed-off-by: Per Åstrand [email protected]

Change-Id: I0f03442a0e161c97699fbbbe7dab1731289c8ffa

cc @digantdesai @freddan80 @zingo @oscarandersson8218

@pytorch-bot
Copy link

pytorch-bot bot commented Oct 25, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6495

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 33b90f1 with merge base 8ad15f3 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 25, 2024
@per per added partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm ciflow/trunk labels Oct 25, 2024
@per per force-pushed the arm_backend_refactoring branch from 80c50b0 to bc061ed Compare October 28, 2024 07:24
@robell robell requested a review from digantdesai October 28, 2024 11:57
@robell
Copy link
Collaborator

robell commented Oct 28, 2024

What's the view on removing the ArmBackend and having ArmEthosUBackend or ArmTOSABackend - we only really test the ArmPartitioner path right now so the fact we can no longer call a generic ArmBackend (as it has to be final) isn't a particular problem for our expected use. But I'd like to understand that from the ExecuTorch interface perspective. I suppose it's not a problem for Partitioners and Backends not to be 1:1?

@per per force-pushed the arm_backend_refactoring branch from bc061ed to 4c71b47 Compare October 29, 2024 07:50
class ArmPartitioner(Partitioner):
def __init__(self, compile_spec: List[CompileSpec]) -> None:
self.delegation_spec = DelegationSpec(ArmBackend.__name__, compile_spec)
self.delegation_spec = ArmBackendSelector.get_delegation_spec(compile_spec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, that said what is the rationale for keeping the partitioner class shared?

In my mind, quantizer, partitioner, and backend go as a "group". I.e. there is an inherent contract between these from what the target is capable of, from ET design point of view. But we never had a TOSA like IR before (at least not as open).

If sharing code is the primary motivation given TOSA IR, does it make sense to have different quantizer or partitioner as well as a wrapper, while keeping the code shared underneath? Purely from UX point of view. I.e. I just care about Ethos so I will do,

Ethos quantizer -> Ethos Parititoner -> Ethos backend -> .. -> Ethos runtime

That said I also like this UX where we have,

TOSA_complient_generic_quantizer -> TOSA_generic_artitioner_with_flags -> TOSA_complient_user_specified_backend -> .. -> Backend Runtime

We should try to avoid making quantizer aware of backend specific details by passing in some flags, because then it will be a clunky UX. If we find a need for that then we should just have wrappers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, that said what is the rationale for keeping the partitioner class shared?

The idea is to keep the ArmPartitioner as a generic entrypoint for TOSA backends. Both to share code and to separate the Ethos-U backend from the TOSA backend. But not using the TOSA name since that name maybe isn't super clear to users.

We should try to avoid making quantizer aware of backend specific details by passing in some flags, because then it will be a clunky UX. If we find a need for that then we should just have wrappers.

I do think the quantizer also needs to have some flags to know how to quantize according to the TOSA specification the hardware supports. What is it that you think get's clunky passing it in to a generic TOSA/Arm quantizer?
Not as part of this PR, so let's take that discussion of line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From, UX side I think it is much cleaner if you have a different front ends for each TOSA compliant hardware backend i.e. using ethos and vulkan as proxy tosa backends for an example,

Option-A

q = VulkanQuantizer()
...
em = to_backend(VulkanPartitioner(vulkan_flag=1))
...
vulkan_runtime.c

--- different user ---

q = EthosQuantizer()
...
em = to_backend(EthosPartitioner(ethos_flag=1))
...
ethos_runtime.c

Option-B

q = ArmQuantizer(flag=Vulkan)
...
em = to_backend(ArmPartitioner(flag=Vulkan, vulkan_flag=1))
...
vulkan_runtime.c

--- different user ---

q = ArmQuantizer(flag=Ethos)
...
em = to_backend(ArmPartitioner(flag=Ethos, ethos_flag=1))
...
ethos_runtime.c

Purely from UX point of view I like Option-A, I see that as namespaces or inheritance of classes. TOSA is the backbone, which we want to hide from people (per your comment, which I agree) can be behind the scene as an implementation detail.

Also Option-A allows you to do more vulkan or ethos specific things without messing with the other i.e. if you want to add vela specific flags, it just a noise for Vulkan users.
Moreover, the overlap between the set of users is not much i.e. there is no reason for Ethos users to be "aware" of Vulkan specific things or even the existence of Vulkan for that matter.

On the flip side, if Quantizer and Partitioners are both targeting TOSA spec then you can argue that there might not be much specific to different TOSA backend. I agree to an extend, however there will always be things like hardware or runtime related creeps in.

@per per marked this pull request as draft November 11, 2024 15:03
@per per force-pushed the arm_backend_refactoring branch from 4c71b47 to e6a7d39 Compare February 7, 2025 10:13
@per per added the release notes: arm Changes to the ARM backend delegate label Feb 7, 2025
@per per requested a review from digantdesai February 7, 2025 10:27
@per per marked this pull request as ready for review February 7, 2025 10:27
@per per force-pushed the arm_backend_refactoring branch 3 times, most recently from 952253d to 4ebd08a Compare February 11, 2025 21:14
Copy link
Contributor

@digantdesai digantdesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I feel this is much cleaner, and scalable. Thank you @per.

Left some nit comments, but stamping to unblock you from moving forward.


typedef struct {
FreeableBuffer* processed;
} ExecutionHandle;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we can have a runtime wrapping tosa reference model too, which works on host/arm,x86? This is if we want an e2e set up for TOSA-only flow too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! And also the possibility to 'recompile' a TOSA backend .pte as long as you use the same TosaSpecification for the recompile.



class ArmEthosUQuantizer(ArmTOSAQuantizer):
def __init__(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should take TosaSpecification?

I guess this can be somewhat of a sticking point, i.e partitioner gets the compile_spec to decide u55 vs. u85, now we need to tell quantizer too (if/when necessary). This is something we can document, and cross check for now through some metadata insertion in the graph.

Copy link
Collaborator Author

@per per Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should!
What about rework this a bit to have the compile spec as the common argument to both partitioner and quantizer (for EthosU)?
The compile spec builder probably should also be split up to respective backends, but I'll hold of on that.

Copy link
Contributor

@digantdesai digantdesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I feel this is much cleaner, and scalable. Thank you @per.

Left some nit comments, but stamping to unblock you from moving forward.

@digantdesai
Copy link
Contributor

cc @mergennachin - FYI

@per per force-pushed the arm_backend_refactoring branch from 4ebd08a to e95cba4 Compare February 13, 2025 13:06
Split up ArmBackend into separate TOSA and Ethos-U classes.
Move partitioner and backend to separate files and remove Arm prefix.
Add compile_spec argument to EthosUQuantizer

Signed-off-by: Per Åstrand <[email protected]>
Change-Id: I0f03442a0e161c97699fbbbe7dab1731289c8ffa
@per per force-pushed the arm_backend_refactoring branch from e95cba4 to 33b90f1 Compare February 13, 2025 13:11
@per
Copy link
Collaborator Author

per commented Feb 13, 2025

Failing on MacOS with HTTP error 520 from https://conda.anaconda.org/.

@per per requested a review from zingo February 13, 2025 13:25
@zingo zingo merged commit f8c345b into pytorch:main Feb 13, 2025
107 of 109 checks passed
@per per deleted the arm_backend_refactoring branch February 17, 2025 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm release notes: arm Changes to the ARM backend delegate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants