🚀 The feature, motivation and pitch
The recommended flow (and the only flow mentioned in our documentation) for lowering programs to CoreML is the to_edge_transform_and_lower flow (https://github.com/pytorch/executorch/blob/main/docs/source/backends/coreml/coreml-overview.md#using-the-core-ml-backend). In this flow, the user lowers to CoreML with:
```python
import torch
from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge_transform_and_lower

ep = torch.export.export(model, example_inputs)
et_program = to_edge_transform_and_lower(
    ep,
    partitioner=[CoreMLPartitioner()],
).to_executorch()
```
There is an older flow that uses to_edge that looks like this:
```python
import torch
from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge

ep = torch.export.export(model, example_inputs)
edge_program = to_edge(ep)
edge_program = edge_program.to_backend(CoreMLPartitioner())
et_program = edge_program.to_executorch()
```
The to_edge flow often yields lower performance. The reason is that during to_edge, ExecuTorch decomposes many ops (e.g., SDPA) for which CoreML has optimized implementations. In contrast, to_edge_transform_and_lower first checks whether CoreML supports an op before decomposing it.
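The difference between the two flows can be illustrated with a small pure-Python sketch. The op names, the decomposition table, and the supported-op set below are all made up for illustration; they are not the real ExecuTorch or CoreML APIs:

```python
# Hypothetical illustration of the two lowering strategies.
# SDPA stands in for any fused op the backend implements natively.
DECOMPOSITIONS = {
    "sdpa": ["matmul", "softmax", "matmul"],  # fused op -> primitive ops
}
COREML_SUPPORTED = {"sdpa", "matmul", "softmax"}


def lower_via_to_edge(ops):
    """to_edge-style: decompose everything first; the backend only sees
    the primitive ops and can never claim the fused op."""
    decomposed = []
    for op in ops:
        decomposed.extend(DECOMPOSITIONS.get(op, [op]))
    return decomposed


def lower_via_to_edge_transform_and_lower(ops):
    """Check backend support before decomposing, so supported fused ops
    survive and reach the backend's optimized implementation."""
    out = []
    for op in ops:
        if op in COREML_SUPPORTED:
            out.append(op)  # keep the fused op for the backend
        else:
            out.extend(DECOMPOSITIONS.get(op, [op]))
    return out


print(lower_via_to_edge(["sdpa"]))                      # ['matmul', 'softmax', 'matmul']
print(lower_via_to_edge_transform_and_lower(["sdpa"]))  # ['sdpa']
```

In the first case the backend receives three primitive ops; in the second it receives the fused op it can run with its optimized kernel, which is the performance gap described above.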
The task:
When users use the older flow based on to_edge, we want to surface a warning that performance may be degraded and that they should use to_edge_transform_and_lower instead. The warning should also link users to the CoreML backend docs.
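One possible shape for the warning is sketched below. The helper name and where it would be hooked in are hypothetical; the real integration would presumably live in the CoreML partitioner or the to_backend path, which is an implementation detail for whoever picks this up:

```python
import warnings

COREML_DOCS_URL = (
    "https://github.com/pytorch/executorch/blob/main/docs/source/"
    "backends/coreml/coreml-overview.md"
)


def warn_legacy_coreml_flow():
    """Hypothetical helper: called when CoreMLPartitioner is used via the
    legacy to_edge + to_backend flow rather than
    to_edge_transform_and_lower."""
    warnings.warn(
        "Lowering to CoreML via to_edge + to_backend may degrade "
        "performance: ops such as SDPA are decomposed before the CoreML "
        "backend can claim them. Prefer to_edge_transform_and_lower. "
        f"See {COREML_DOCS_URL}.",
        UserWarning,
        stacklevel=2,
    )
```

Using `warnings.warn` with a `stacklevel` points the warning at the user's call site rather than library internals, and lets users silence it with the standard `warnings` filters if they have a reason to stay on the old flow.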
Alternatives
NA
Additional context
NA
RFC (Optional)
No response