Module support for multiple ptd files #14158
Support multiple PTD files in Module.

This change updates the following private variables in Module:
```
std::string data_path --> std::unordered_set<std::string> data_files_
std::unique_ptr<DataLoader> data_map_loader --> std::vector<std::unique_ptr<DataLoader>> data_map_loaders_
std::unique_ptr<NamedDataMap> data_map --> std::vector<std::unique_ptr<NamedDataMap>> named_data_maps_
```
It also introduces a new private variable. When there are multiple NamedDataMaps, they need to be merged into one for use in method, etc. The merge is not implemented yet.
```
std::unique_ptr<NamedDataMap> merged_data_map_
```
The process of using a PTD file is:
```
std::string file --> wrapped in DataLoader --> wrapped in NamedDataMap
```
Each stage can hold multiple instances.

This diff also introduces a new Module constructor that takes in `std::unordered_set<std::string> named_data_map_paths_`

Differential Revision: [D82059808](https://our.internmc.facebook.com/intern/diff/D82059808/)

[ghstack-poisoned]
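The staged wrapping described above (file path, then DataLoader, then NamedDataMap, one of each per PTD file) can be sketched roughly as follows. This is a minimal illustration, not the code in this diff: `ModuleSketch`, `load_data_maps()`, the accessor methods, and the stand-in `DataLoader` and `NamedDataMap` structs are all hypothetical and only mirror the member names listed in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a DataLoader: it only remembers the file it wraps.
struct DataLoader {
  explicit DataLoader(std::string file_path) : path(std::move(file_path)) {}
  std::string path;
};

// Hypothetical stand-in for a NamedDataMap built on top of one loader.
struct NamedDataMap {
  explicit NamedDataMap(const DataLoader* data_loader) : loader(data_loader) {}
  const DataLoader* loader;  // non-owning; the loader must outlive the map
};

// Hypothetical class that mirrors the private members named in the description.
class ModuleSketch {
 public:
  explicit ModuleSketch(std::unordered_set<std::string> data_files)
      : data_files_(std::move(data_files)) {}

  // File path -> DataLoader -> NamedDataMap, repeated once per file.
  void load_data_maps() {
    if (!named_data_maps_.empty()) {
      return;  // already loaded
    }
    for (const auto& file : data_files_) {
      data_map_loaders_.push_back(std::make_unique<DataLoader>(file));
      named_data_maps_.push_back(
          std::make_unique<NamedDataMap>(data_map_loaders_.back().get()));
    }
    assert(data_map_loaders_.size() == named_data_maps_.size());
  }

  std::size_t num_files() const { return data_files_.size(); }
  std::size_t num_loaders() const { return data_map_loaders_.size(); }
  std::size_t num_maps() const { return named_data_maps_.size(); }
  bool has_merged_map() const { return merged_data_map_ != nullptr; }

 private:
  // Loaders are declared before maps so the maps are destroyed first.
  std::unordered_set<std::string> data_files_;
  std::vector<std::unique_ptr<DataLoader>> data_map_loaders_;
  std::vector<std::unique_ptr<NamedDataMap>> named_data_maps_;
  std::unique_ptr<NamedDataMap> merged_data_map_;  // merge not sketched here
};

}  // namespace sketch
```

In this sketch, constructing `ModuleSketch` with two paths and calling `load_data_maps()` yields two loaders and two maps, while `merged_data_map_` stays empty, matching the note that merging is not implemented yet.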
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14158
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit e6ed284 with merge base f7c009e.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Support multiple PTD files in Module. ghstack-source-id: 308798953 Pull Request resolved: #14158
This pull request was exported from Phabricator. Differential Revision: D82059808
Pull Request resolved: #14158

Support multiple PTD files in Module.

Context: https://docs.google.com/document/d/19RLLdWNHQoRi8Ufz4oE-gGjOz0IShjN_NZi5jlgMBZI/edit?tab=t.0

This change updates the following private variables in Module:
```
std::string data_path --> std::unordered_set<std::string> data_files_
std::unique_ptr<DataLoader> data_map_loader --> std::vector<std::unique_ptr<DataLoader>> data_map_loaders_
std::unique_ptr<NamedDataMap> data_map --> std::vector<std::unique_ptr<NamedDataMap>> named_data_maps_
```
It also introduces a new private variable. When there are multiple NamedDataMaps, they need to be merged into one for use in method, etc. The merge is not implemented yet.
```
std::unique_ptr<NamedDataMap> merged_data_map_
```
The process of using a PTD file is:
```
std::string file --> wrapped in DataLoader --> wrapped in NamedDataMap
```
Each stage can hold multiple instances.

This diff also introduces a new Module constructor that takes in `std::unordered_set<std::string> named_data_map_paths_`

TODO: add a MergedDataMap to extension/module that can merge all the data maps together.

ghstack-source-id: 308975994
@exported-using-ghexport

Differential Revision: [D82059808](https://our.internmc.facebook.com/intern/diff/D82059808/)
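To make the MergedDataMap TODO concrete, here is one way merging several data maps could look. This is a sketch under strong assumptions and not the planned implementation: a data map is reduced to a plain key-to-bytes table, `merge_data_maps` is a hypothetical helper, and rejecting a key that appears in more than one map is an assumed policy that the PR does not specify.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

namespace merge_sketch {

// Hypothetical simplification: one data map is just key -> raw bytes.
using DataMap = std::unordered_map<std::string, std::vector<std::uint8_t>>;

// Merges every map in `maps` into `merged`.
// Returns false and leaves `merged` empty if the same key appears in more
// than one input map (assumed policy, not taken from the PR).
inline bool merge_data_maps(const std::vector<DataMap>& maps, DataMap& merged) {
  merged.clear();
  for (const auto& map : maps) {
    for (const auto& entry : map) {
      // emplace reports inserted == false when the key already exists.
      const bool inserted = merged.emplace(entry.first, entry.second).second;
      if (!inserted) {
        merged.clear();
        return false;
      }
    }
  }
  return true;
}

}  // namespace merge_sketch
```

Failing on duplicate keys keeps lookups unambiguous. A real MergedDataMap might instead define a precedence order between files, and would likely reference the underlying maps rather than copy their bytes.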
Review automatically exported from Phabricator review in Meta.
Pull Request resolved: #14158 Support multiple PTD files in Module. ghstack-source-id: 313188117 @exported-using-ghexport Differential Revision: [D82059808](https://our.internmc.facebook.com/intern/diff/D82059808/)
Merged commit b373abc into gh/lucylq/110/base
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #14158 by @lucylq
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/lucylq/110/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/lucylq/110/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/lucylq/110/orig
Differential Revision: [D82059808](https://our.internmc.facebook.com/intern/diff/D82059808/)
@diff-train-skip-merge
Co-authored-by: lucylq <[email protected]>
@pytorchbot cherry-pick --onto release/1.0 -c examples
❌ 🤖 pytorchbot command failed:
Try |
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #14158 by @lucylq
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/lucylq/110/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/lucylq/110/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/lucylq/110/orig
Differential Revision: [D82059808](https://our.internmc.facebook.com/intern/diff/D82059808/)
@diff-train-skip-merge
Co-authored-by: lucylq <[email protected]>
(cherry picked from commit 421539e)