@BujSet BujSet commented Jun 17, 2025

Motivation

Given a specific model, we want to produce a binary that includes only the minimal operators and dtypes needed to run that model. This requires parsing the model to determine which kernels it launches, the operators used in those kernels, and the dtypes of the tensors in the kernels. After parsing, a header file must be generated, and the portable_kernels lib can then be rebuilt to include only the operators and dtypes specified in the generated header.

Summary

This change completes this E2E process. A user can now specify the model they wish to optimize their binary for via the command line argument -DEXECUTORCH_SELECT_OPS_FROM_MODEL="<file path to model pte>". When specified, the pte is parsed to produce a YAML file called selected_operators.yaml, which describes the model's operators and dtypes. From this YAML, a header file called selected_op_variants.h is generated that selects the described operators and dtypes. When the command line argument -DEXECUTORCH_DTYPE_SELECTIVE_BUILD=ON is specified, the header file is linked to the portable_kernels lib when it is rebuilt. Only the model API is supported with dtype selective build; using other methods such as list or dict will result in a build error.
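As a rough illustration of how the two options combine, the sketch below assembles the CMake `-D` flags named above. The helper `selective_build_flags` and the path `model.pte` are hypothetical and not part of this PR; the error check only mirrors the stated rule that dtype selective build requires the model API, and the real check lives in the CMake build.

```python
from typing import List, Optional


def selective_build_flags(model_pte: Optional[str] = None,
                          dtype_selective: bool = False) -> List[str]:
    """Return the CMake -D flags for the flow described above (illustrative only)."""
    if dtype_selective and model_pte is None:
        # Mirrors the stated rule: dtype selective build needs the model API.
        raise ValueError("dtype selective build requires a model .pte path")
    flags = []
    if model_pte is not None:
        flags.append(f"-DEXECUTORCH_SELECT_OPS_FROM_MODEL={model_pte}")
    if dtype_selective:
        flags.append("-DEXECUTORCH_DTYPE_SELECTIVE_BUILD=ON")
    return flags


if __name__ == "__main__":
    # "model.pte" is a placeholder path, not a file shipped with the repo.
    print(" ".join(selective_build_flags("model.pte", dtype_selective=True)))
```

The flags would be appended to whatever `cmake` configure command the project already uses; that command is not shown here because it is not specified in this description.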

Results

An example usage of this flow is included in examples/selective_build/test_selective_build.sh:test_cmake_select_ops_in_model. When run as bash examples/selective_build/test_selective_build.sh cmake, the cmake-out/examples/selective_build/selective_build_test binary is built. After stripping the binary, the following sizes were measured for each model:

Working models

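| Model | Default Binary Size (KB) | Dtype Selected Binary Size (KB) |
| ------- | :---: | :---: |
| add | 359 | 275 |
| mul | 335 | 263 |
| add_mul | 367 | 287 |
| linear | 347 | 291 |
| softmax | 251 | 251 |
| resnet18 | 643 | 515 |
| resnet50 | 643 | 515 |
| mobilebert | 707 | 415 |
| lstm | 643 | 459 |
| dl3 | 607 | 539 |
| edsr | 543 | 371 |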
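To put the table in relative terms, the following sketch computes the absolute and percentage savings for each working model. The numbers are copied from the table above; the helper name `reduction_percent` is only for illustration.

```python
# (default_kb, dtype_selected_kb), copied from the "Working models" table.
SIZES_KB = {
    "add": (359, 275),
    "mul": (335, 263),
    "add_mul": (367, 287),
    "linear": (347, 291),
    "softmax": (251, 251),
    "resnet18": (643, 515),
    "resnet50": (643, 515),
    "mobilebert": (707, 415),
    "lstm": (643, 459),
    "dl3": (607, 539),
    "edsr": (543, 371),
}


def reduction_percent(default_kb: float, selected_kb: float) -> float:
    """Percentage by which the dtype-selected binary is smaller than the default."""
    return 100.0 * (default_kb - selected_kb) / default_kb


if __name__ == "__main__":
    for model, (default_kb, selected_kb) in SIZES_KB.items():
        saved_kb = default_kb - selected_kb
        pct = reduction_percent(default_kb, selected_kb)
        print(f"{model:<12}{saved_kb:>5} KB saved ({pct:.1f}%)")
```

On these figures mobilebert shows the largest relative reduction (roughly 41%), while softmax is unchanged.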

Models that crash when run

| Model | Default Binary Size (KB) | Dtype Selected Binary Size (KB) |
| ------- | :---: | :---: |
| emformer_transcribe | 863 | 555 |
| vit | 907 | 687 |
| mv2 | 631 | 495 |
| mv3 | 843 | 535 |
| llama | 1.1 MB | 827 |
| qwen2_5 | 1.1 MB | 827 |

Notes

Although the binary size is reduced, the pte file parsing functionality in gen_oplist.py appears to be incomplete. Please see the discussion on PR #11582 and on issue #11762 for more details.

pytorch-bot bot commented Jun 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11760

Note: Links to docs will display an error until the docs builds have been completed.

❌ 12 New Failures, 1 Unrelated Failure

As of commit eb3fc1e with merge base fcc7f3b:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jun 17, 2025
@BujSet BujSet changed the title from "Dtype selective build for cmake" to "dtype selective build from model API in OSS" Jun 17, 2025
BujSet commented Jun 17, 2025

@pytorchbot label "release notes: none"

@pytorch-bot pytorch-bot bot added the release notes: none label Jun 17, 2025
@BujSet BujSet self-assigned this Jun 17, 2025
@BujSet BujSet force-pushed the dtype_selective_build_for_cmake branch 5 times, most recently from 4e2792b to 7d823f2 Compare June 17, 2025 23:32
@BujSet BujSet marked this pull request as ready for review June 18, 2025 02:41
endif()
endif()

if(GEN_KERNEL_LIBS)
Contributor

This is unnecessary I think — aren't we already in an if(GEN_KERNEL_LIBS) block from line 241?

Contributor Author

The issue is that if portable_kernels is listed in GEN_KERNEL_LIBS, we remove it from the list (line 243). I thought multiple libs might be passed in here, so this check ensures that if others are specified, they still get linked. I'm not entirely sure whether this use case is likely, though.
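The reasoning can be sketched outside of CMake as follows. This is a Python analogue of the logic described in the reply, not the actual CMakeLists.txt code; the function name and the lib name `custom_kernels` are hypothetical.

```python
from typing import List


def remaining_kernel_libs(gen_kernel_libs: List[str]) -> List[str]:
    """Drop portable_kernels, which is handled separately, and keep the rest."""
    return [lib for lib in gen_kernel_libs if lib != "portable_kernels"]


if __name__ == "__main__":
    for libs in (["portable_kernels"], ["portable_kernels", "custom_kernels"]):
        rest = remaining_kernel_libs(libs)
        if rest:
            # Analogue of the second if(GEN_KERNEL_LIBS) check: the list can be
            # empty after the removal, so it is tested again before linking.
            print(f"{libs} -> still link {rest}")
        else:
            print(f"{libs} -> nothing left to link")
```

The second check matters only when more than one lib is passed in, which is the case the reply is unsure about.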

@BujSet BujSet force-pushed the dtype_selective_build_for_cmake branch from f1c6439 to ed3abb7 Compare June 18, 2025 17:22
@BujSet BujSet force-pushed the dtype_selective_build_for_cmake branch from ed3abb7 to eb3fc1e Compare June 18, 2025 19:39
@BujSet BujSet merged commit daebcde into pytorch:main Jun 18, 2025
354 of 378 checks passed
leafs1 pushed a commit to leafs1/executorch that referenced this pull request Jun 19, 2025
@BujSet BujSet deleted the dtype_selective_build_for_cmake branch June 19, 2025 00:01
BujSet commented Jul 23, 2025

Tested with OpenAI's whisper-tiny model. Dtype selection brings the binary size down to 523 KB, but the binary still crashes because of the same issues mentioned above.

cc @lucylq

Labels

ciflow/binaries, ciflow/trunk, CLA Signed, release notes: none
