Improve memory planning for submodule hierarchies. (#11860)
Summary:
Pull Request resolved: #11860
Improves the memory planning across hierarchies in apply_algo in memory_planning.py:
1. Plan memory bottom-to-top, starting with the leaf submodules and ending at the top-level graph module (root). This is now consistent with how delegates are compiled / memory planned. Future PRs/diffs will add support for planned buffers in delegates.
2. Allocate the max bufsize across all submodules as `graph_module.meta['input_mem_buffer_sizes']`, rather than the sum. This allows us to reclaim the space used by one submodule for another submodule.
Before this change, `apply_algo` in memory_planning.py would:
1. Plan memory top-to-bottom, starting with the top-level graph module (root).
2. Populate `input_mem_buffer_sizes` so that each new submodule allocates its memory after the max buffer size of all previously planned memory.
For example:
```
root [A bytes]
- root.child0 [B bytes]
- root.child0.child0 [C bytes]
- root.child1 [D bytes]
```
(before this diff) Planned memory looks like:
```
--- A + B + C + D ----------------
Space for root.child1
--- A + B + C --------------------
Space for root.child0.child0
--- A + B ------------------------
Space for root.child0
--- A ----------------------------
Space for root
--- 0 ----------------------------
```
Note that tensors for child0 and child1 do not overlap in lifetime, but they still occupy completely separate space.
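As a rough illustration of the old layout above, here is a minimal sketch that stacks each module after everything planned before it. The sizes and the `plan_top_down` helper are hypothetical and are not taken from `apply_algo`; they only mirror the diagram.

```python
# Hypothetical sizes in bytes for A, B, C, D in the example hierarchy.
sizes = {
    "root": 64,                # A
    "root.child0": 32,         # B
    "root.child0.child0": 16,  # C
    "root.child1": 40,         # D
}


def plan_top_down(sizes):
    """Simplified model of the old behavior: every module is placed
    after all previously planned modules, so the total is the sum."""
    offsets = {}
    cursor = 0
    # dicts preserve insertion order, which stands in for traversal order here
    for name, size in sizes.items():
        offsets[name] = (cursor, cursor + size)
        cursor += size
    return offsets, cursor


offsets, total = plan_top_down(sizes)
print(total)                   # A + B + C + D
print(offsets["root.child1"])  # child1 sits above everything else
```

With these assumed sizes the total is A + B + C + D = 152 bytes, even though `root.child0` and `root.child1` could have shared space.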
(after this diff) Planned memory looks like:
```
--- max(C + B, D) + A ----------
root
--- max(C + B, D) --------------
root.child0        |
--- C ------------ | root.child1
root.child0.child0 |
--- 0 --------------------------
```
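The new layout can be summarized as "own size plus the max over children", computed from the leaves up. The sketch below is a simplified model under that assumption; the `Module` class, `planned_size` helper, and byte sizes are hypothetical and do not reflect the actual data structures in memory_planning.py.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical stand-in for a graph module in the hierarchy."""
    name: str
    own_size: int
    children: list = field(default_factory=list)


def planned_size(module):
    """Bottom-up size: sibling submodules share one region sized to the
    largest of them, and the module's own buffers sit on top of it."""
    child_sizes = [planned_size(child) for child in module.children]
    return module.own_size + max(child_sizes, default=0)


# Hypothetical sizes in bytes: A = 64, B = 32, C = 16, D = 40.
child0 = Module("root.child0", 32, [Module("root.child0.child0", 16)])
child1 = Module("root.child1", 40)
root = Module("root", 64, [child0, child1])

total = planned_size(root)
print(total)  # max(C + B, D) + A
```

With these assumed sizes the result is max(16 + 32, 40) + 64 = 112 bytes, versus 152 bytes if all four regions were summed.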
Note:
We can update the memory planning algo to plan nodes with submodules (while/map/cond or even delegate) to use `graph_module.meta['non_const_buffer_size']` and reduce space even further. Implementation for this is not part of this PR/Diff. This will allow us to reuse space for `root.child0.child0` in `root.child0`, and space for `root.child0`/`root.child1` in `root`.
Reviewed By: JacobSzwejbka
Differential Revision: D76940237