Conversation
- Added comments to clarify file purposes in example_commands.sh, inference_wan.py, pretrain_wan.py, wan_provider.py, wan_step.py, and wan.py.
- Introduced EnergonMultiModalDataModule for handling multimodal datasets in nemo_vfm.
- Created SequentialMegatronSampler for efficient sequential sampling in large datasets.
- Added new files for DiT attention and base data handling.

This commit enhances documentation and introduces new functionality for better data management and processing.
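The SequentialMegatronSampler itself is not shown in this thread. As a rough illustration of what sequential, rank-sharded sampling over a large dataset involves, here is a minimal pure-Python sketch; the class name, constructor parameters, and batching behavior are all assumptions for illustration, not the PR's actual implementation.

```python
# Hypothetical sketch of a sequential sampler for data-parallel training.
# NOT the PR's SequentialMegatronSampler; names and semantics are assumed.
class SequentialSampler:
    """Yield consecutive global sample indices, sharded across data-parallel ranks."""

    def __init__(self, total_samples, batch_size, rank, world_size, start_idx=0):
        self.total_samples = total_samples
        self.batch_size = batch_size   # per-rank batch size
        self.rank = rank               # this process's data-parallel rank
        self.world_size = world_size   # number of data-parallel ranks
        self.start_idx = start_idx     # resume offset for checkpoint restarts

    def __iter__(self):
        # Each global step consumes batch_size * world_size consecutive samples;
        # each rank takes its own contiguous slice of that window.
        stride = self.batch_size * self.world_size
        for base in range(self.start_idx, self.total_samples, stride):
            batch = [base + self.rank * self.batch_size + i
                     for i in range(self.batch_size)]
            batch = [i for i in batch if i < self.total_samples]
            if batch:
                yield batch

# Example: 10 samples, per-rank batch size 2, rank 0 of 2 ranks.
sampler = SequentialSampler(total_samples=10, batch_size=2, rank=0, world_size=2)
print(list(sampler))  # [[0, 1], [4, 5], [8, 9]]
```

Sequential (rather than shuffled) sampling like this is mainly useful when the dataset is too large to build a global shuffle index cheaply, at the cost of losing randomization.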
abhinavg4
left a comment
Thanks a lot for the PR. I think a number of changes are needed. In particular, please remove all the debugging code and add copyright headers. Thanks a lot.
dfm/src/megatron/data/dit/utils.py
    return cropped_tensor

def test_no_cropping_needed():
Can you remove these tests and move them to the tests folder?
I would refrain from modifying DiT-related data code.
I would leave that to Sajad, so he can keep an overall view of what is edited for DiT.
This file is in this PR only because diffusion_energon_datamodule.py is needed for the Wan Energon data.
Hey, DiT does not use minimal_crop, so this file is not present in my final branch. I think this comment needs to be addressed on this branch.
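The diff fragment above only shows `return cropped_tensor` and the `test_no_cropping_needed` test, so the actual utility is not visible here. For context, a "minimal crop" typically trims each spatial dimension down to the nearest multiple of some divisor (e.g. a patch size), removing as few pixels as possible. The sketch below is a hypothetical stand-in operating on shapes only, not the code from dfm/src/megatron/data/dit/utils.py.

```python
# Hypothetical illustration of a "minimal crop" utility: shrink each spatial
# dimension to the nearest multiple of `divisor`, removing as little as possible.
# NOT the actual implementation under review.
def minimal_crop(shape, divisor):
    """Return the largest (h, w) <= shape with both sides divisible by divisor."""
    h, w = shape
    return (h - h % divisor, w - w % divisor)

def test_no_cropping_needed():
    # Already aligned to the divisor: nothing should be removed.
    assert minimal_crop((224, 224), 16) == (224, 224)

def test_cropping_applied():
    # 225 and 230 are trimmed down to the nearest multiple of 16.
    assert minimal_crop((225, 230), 16) == (224, 224)
```

The reviewer's point stands regardless of the implementation: tests like these belong under the tests folder, not inline in the utility module.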
/ok to test 13968fc
    "Topic :: Utilities",
]
dependencies = [
    "diffusers==0.35.1",
/ok to test b1c41fc
/ok to test d8bcade
Force-pushed from d8bcade to 681145b
/ok to test 681145b
Squashed commits:

* first commit
* workable code
* workable thd
* clean up; remove all CP for sbhd, CP is now only for thd
* run outside of Mbridge
* Update example scripts and add new data module for multimodal datasets:
  - Added comments to clarify file purposes in example_commands.sh, inference_wan.py, pretrain_wan.py, wan_provider.py, wan_step.py, and wan.py.
  - Introduced EnergonMultiModalDataModule for handling multimodal datasets in nemo_vfm.
  - Created SequentialMegatronSampler for efficient sequential sampling in large datasets.
  - Added new files for DiT attention and base data handling.
* workable code before refactoring
* refactor attention submodules + reorder file locations
* update refactor (repeated)
* reorganize files (repeated)
* refactoring code
* add README for perf test
* use VAE, T5, and scheduler from Diffusers
* update repo, remove Wan's GitHub modules
* fix Ruff + copyright
* fix Ruff + Lint (repeated lint passes)
* merged main + addressed comments
* remove example_commands.md; Google waits until mid Nov
* refactor inference_configs + mock datamodule
* add dit_embeddings.py
* add 'average_gradients_across_tp_domain' to torch.nn for when running sequence_parallelism
* add English negative prompt
* Update uv.lock for deps: diffusers==0.35.1, easydict, imageio
* update dfm/src/megatron/data/dit
* change English negative prompt
* seemingly workable seq_packing
* refactor with Sajad's PR: move DiT data to the common dir
* workable mock datamodule (no longer needs a path to be set); updated training algorithm and hyperparameters to align with Linnan; tested training with anime-dataset finetuning
* bring wan_task encoder features to common, sharing with DiT
* fix CP error (input of thd_split_inputs_cp should be cu_seqlens_q_padded instead of cu_seqlens_q)
* update README_perf_test.md
* update uv.lock, merge main (repeated uv.lock updates)
* update uv.lock [using ci]

---------

Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>
Co-authored-by: Abhinav Garg <abhinavg@stanford.edu>
Co-authored-by: root <root@eos0025.eos.clusters.nvidia.com>
Co-authored-by: root <root@eos0558.eos.clusters.nvidia.com>
Co-authored-by: Pablo Garay <pagaray@nvidia.com>
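The CP fix in the commit list (passing cu_seqlens_q_padded instead of cu_seqlens_q) can be illustrated in isolation: once packed sequences are padded to a multiple (as context parallelism requires), slicing the packed buffer with the unpadded cumulative lengths puts every sequence after the first at the wrong offset. The following is a pure-Python stand-in for illustration only; the function names and packing scheme are assumptions, not the actual thd_split_inputs_cp implementation.

```python
# Hypothetical illustration of why padded cu_seqlens are needed when slicing a
# thd-packed buffer. NOT the actual thd_split_inputs_cp code.
def pack_with_padding(seqs, multiple):
    """Pack variable-length sequences, padding each to a multiple of `multiple`.
    Returns (buffer, cu_seqlens, cu_seqlens_padded)."""
    buf, cu, cu_pad = [], [0], [0]
    for s in seqs:
        pad = (-len(s)) % multiple        # padding needed to reach the multiple
        buf.extend(s)
        buf.extend([0] * pad)
        cu.append(cu[-1] + len(s))        # offsets ignoring padding
        cu_pad.append(cu_pad[-1] + len(s) + pad)  # offsets including padding
    return buf, cu, cu_pad

def split_packed(buf, offsets):
    """Recover per-sequence slices of the packed buffer from cumulative offsets."""
    return [buf[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]

seqs = [[1, 2, 3], [4, 5, 6, 7, 8]]
buf, cu, cu_pad = pack_with_padding(seqs, multiple=4)
# cu = [0, 3, 8] (unpadded) vs cu_pad = [0, 4, 12] (padded).
# Splitting with cu would start the second sequence at offset 3 (inside the
# first sequence's padding); cu_pad starts it at the correct offset 4.
print(split_packed(buf, cu_pad))  # [[1, 2, 3, 0], [4, 5, 6, 7, 8, 0, 0, 0]]
```

This mirrors the reported bug: the split function must consume the padded offsets whenever the buffer it slices is itself padded.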
