Refactoring memory planning to allow running multiple algorithms #8440
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8440
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit c2fa1c1 with merge base 94ec549.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D69515056
Summary: This diff introduces `memory_planning_algorithm_suite`, a method that iterates through multiple memory planning algorithms and picks the one that gives the best result, i.e. the least memory consumed. Each algorithm must produce a `MemoryAlgoResult` containing the results of the memory planning done by that algorithm. As before, the algorithms do not update the `TensorSpec`s directly; instead, `memory_planning_algorithm_suite` determines which algorithm gave the best result and then updates the `TensorSpec`s with the values (offsets etc.) returned by that algorithm.
Differential Revision: D69515056
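For readers who want a concrete picture of the selection step, here is a minimal, self-contained sketch. It is not the code from this diff: only the names `memory_planning_algorithm_suite`, `MemoryAlgoResult`, and `TensorSpec` come from the summary. The fields (`offsets`, `total_bytes`, `mem_offset`, `first_use`, `last_use`), the function signature, and the two toy algorithms `naive` and `first_fit` are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TensorSpec:
    """Hypothetical stand-in holding only what this sketch needs."""
    name: str
    nbytes: int
    first_use: int  # index of the first node that uses the tensor
    last_use: int   # index of the last node that uses the tensor
    mem_offset: int = -1  # written only after the best plan is chosen


@dataclass
class MemoryAlgoResult:
    """Hypothetical shape: per-tensor offsets plus total buffer size."""
    offsets: Dict[str, int] = field(default_factory=dict)
    total_bytes: int = 0


def naive(specs: List[TensorSpec]) -> MemoryAlgoResult:
    """Toy algorithm: every tensor gets its own region, no reuse."""
    result = MemoryAlgoResult()
    for spec in specs:
        result.offsets[spec.name] = result.total_bytes
        result.total_bytes += spec.nbytes
    return result


def first_fit(specs: List[TensorSpec]) -> MemoryAlgoResult:
    """Toy algorithm: place each tensor at the lowest offset that is
    free for the tensor's whole lifetime, largest tensors first."""
    result = MemoryAlgoResult()
    placed: List[TensorSpec] = []
    for spec in sorted(specs, key=lambda s: s.nbytes, reverse=True):
        # Regions held by placed tensors whose lifetimes overlap this one.
        busy = sorted(
            (result.offsets[p.name], result.offsets[p.name] + p.nbytes)
            for p in placed
            if not (p.last_use < spec.first_use or spec.last_use < p.first_use)
        )
        offset = 0
        for start, end in busy:
            if offset + spec.nbytes <= start:
                break  # the gap before this region is large enough
            offset = max(offset, end)
        result.offsets[spec.name] = offset
        result.total_bytes = max(result.total_bytes, offset + spec.nbytes)
        placed.append(spec)
    return result


Algo = Callable[[List[TensorSpec]], MemoryAlgoResult]


def memory_planning_algorithm_suite(
    specs: List[TensorSpec], algorithms: List[Algo]
) -> MemoryAlgoResult:
    """Run every algorithm, keep the smallest plan, then update the specs."""
    # The algorithms only return results; none of them touches the specs.
    results = [algo(specs) for algo in algorithms]
    best = min(results, key=lambda r: r.total_bytes)
    # Only the winning plan is written back.
    for spec in specs:
        spec.mem_offset = best.offsets[spec.name]
    return best


specs = [
    TensorSpec("a", 64, first_use=0, last_use=1),
    TensorSpec("b", 32, first_use=1, last_use=2),
    TensorSpec("c", 64, first_use=2, last_use=3),
]
best = memory_planning_algorithm_suite(specs, [naive, first_fit])
print(best.total_bytes, {s.name: s.mem_offset for s in specs})
```

Because the algorithms return plain results and never mutate the specs, a losing plan leaves no partial state behind, which is presumably the reason the summary stresses that the update happens only in the suite.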
What is the regression on lowering time?
@JacobSzwejbka Measured on llama3, running 2 algorithms and picking the best one takes about an additional 100 ms.
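One way such an overhead could be measured is sketched below. This is not how the number above was obtained: `timed`, `plan_a`, `plan_b`, `run_single`, and `run_suite` are hypothetical stand-ins, and a real measurement would time the actual lowering call on the model instead of these toy functions.

```python
import time


def timed(fn, *args, repeats=5):
    """Return (result, best wall-clock seconds over `repeats` runs)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return result, best


# Hypothetical stand-ins for two planning algorithms that each
# return the number of bytes their plan would need.
def plan_a(sizes):
    return sum(sizes)


def plan_b(sizes):
    return max(sizes) * 2


def run_single(sizes):
    """Baseline: run one algorithm only."""
    return plan_a(sizes)


def run_suite(sizes):
    """Run both algorithms and keep the smaller plan."""
    return min(plan_a(sizes), plan_b(sizes))


sizes = list(range(1, 1001))
single_result, single_s = timed(run_single, sizes)
suite_result, suite_s = timed(run_suite, sizes)
overhead_ms = (suite_s - single_s) * 1e3
print(f"extra time for running the suite: {overhead_ms:.3f} ms")
```

Taking the best of several runs reduces noise from other processes; the absolute numbers from the toy functions say nothing about real lowering time.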
Reviewed By: JacobSzwejbka