Supporting Wan model #21

Merged
huvunvidia merged 61 commits into main from huvu/mcore_wan
Nov 13, 2025

Conversation

@huvunvidia
Contributor

No description provided.

@copy-pr-bot
copy-pr-bot bot commented Oct 30, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Huy Vu2 and others added 12 commits October 30, 2025 13:28
- Added comments to clarify file purposes in example_commands.sh, inference_wan.py, pretrain_wan.py, wan_provider.py, wan_step.py, and wan.py.
- Introduced EnergonMultiModalDataModule for handling multimodal datasets in nemo_vfm.
- Created SequentialMegatronSampler for efficient sequential sampling in large datasets.
- Added new files for DIT attention and base data handling.

This commit enhances documentation and introduces new functionalities for better data management and processing.
Contributor

@abhinavg4 abhinavg4 left a comment

Thanks a lot for the PR. I think a bunch of changes need to be done. Especially please remove all the debugging stuff and add copyright headers. Thanks a lot.

return cropped_tensor


def test_no_cropping_needed():
Contributor

Can you remove these tests and move them to the tests folder?

Contributor Author

I would refrain from modifying DiT's related data code.
I would leave that for Sajad, so he can have an overall view of what is edited for DiT.
This file is in this PR only because diffusion_energon_datamodule.py is needed for Wan data energon.

Contributor

Sure. @sajadn, can you please remove this?

Contributor

Hey, DiT does not use minimal_crop, so this file is not present in my final branch. I think this comment has to be addressed on this branch.
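
For reference, the move requested in this thread could look roughly like the sketch below: the helper stays in the data module and the checks live in a separate file under the tests folder. The `minimal_crop` signature here is a hypothetical stand-in (the real one is not shown in this conversation), and it works on plain nested lists instead of tensors so that it runs without extra dependencies.

```python
# Hypothetical stand-in for the crop helper under review.
# The real minimal_crop signature is not shown in this thread.
def minimal_crop(frame, divisor):
    """Crop a 2D grid (list of rows) so height and width are multiples of divisor."""
    height = len(frame)
    width = len(frame[0]) if height else 0
    new_height = (height // divisor) * divisor
    new_width = (width // divisor) * divisor
    # Keep the top-left region; drop trailing rows and columns.
    cropped = [row[:new_width] for row in frame[:new_height]]
    return cropped


# In a real layout these would sit in a test file that imports the helper,
# and pytest would collect them by their test_ prefix.
def test_no_cropping_needed():
    frame = [[0] * 16 for _ in range(8)]
    assert minimal_crop(frame, 8) == frame


def test_crops_trailing_rows_and_columns():
    frame = [[0] * 18 for _ in range(9)]
    cropped = minimal_crop(frame, 8)
    assert len(cropped) == 8
    assert len(cropped[0]) == 16
```

Keeping the tests out of the module means importing the data code does not pull in test functions, which is the usual reason for this kind of request.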

@huvunvidia huvunvidia requested a review from a team as a code owner November 5, 2025 07:14
@huvunvidia
Contributor Author

/ok to test 13968fc

"Topic :: Utilities",
]
dependencies = [
"diffusers==0.35.1",
Contributor

These new dependency lines were already added further down in this pyproject.toml file, so I think we don't need to add them here anymore.

abhinavg4 previously approved these changes Nov 12, 2025
Contributor

@abhinavg4 abhinavg4 left a comment

Looks good. Thanks

@huvunvidia
Contributor Author

/ok to test b1c41fc

@pablo-garay
Contributor

/ok to test d8bcade

@pablo-garay
Contributor

/ok to test 681145b

@huvunvidia huvunvidia merged commit ddd4fe8 into main Nov 13, 2025
15 checks passed
huvunvidia added a commit that referenced this pull request Feb 12, 2026
* first commit

* workable code

* workable thd

* clean up, remove all CP for sbhd, CP now is only for thd

* run outside of Mbridge

* Update example scripts and add new data module for multimodal datasets

- Added comments to clarify file purposes in example_commands.sh, inference_wan.py, pretrain_wan.py, wan_provider.py, wan_step.py, and wan.py.
- Introduced EnergonMultiModalDataModule for handling multimodal datasets in nemo_vfm.
- Created SequentialMegatronSampler for efficient sequential sampling in large datasets.
- Added new files for DIT attention and base data handling.

This commit enhances documentation and introduces new functionalities for better data management and processing.

* workable code before refactoring

* refactor attention submodules + reorder files locations

* update refactor

* update refactor

* reorganize files

* reorganize files

* refactoring code

* add README for perf test

* using vae, t5, scheduler from Diffusers

* update repo, remove Wan's Github modules

* fix Ruff

* fix ruff + copyright

* fix Ruff + Lint

* fix Ruff + Lint

* fix Ruff + Lint

* fix Ruff + Lint

* fix Ruff + Lint

* fix Ruff + Lint

* fix Ruff + Lint

* fix Ruff + Lint

* merged main + address comments

* remove example_commands.md, Google waits until mid Nov

* refactor inference_configs + mockdatamodule

* add dit_embeddings.py

* fix lint ruff

* add 'average_gradients_across_tp_domain' to torch.nn for when running sequence_parallelism

* add english negative prompt

* fix ruff lint

* Update uv.lock for deps: diffusers==0.35.1, easydict, imageio

* update dfm/src/megatron/data/dit

* change english negative prompt

* seemingly workable seq_packing

* refactor with Sajad's PR - DiT data to common dir

* fix Ruff, lint

* fix Ruff, lint

* fix Ruff, lint

* workable mock datamodule (doesn't need setting path); updated training algo + hyper-parameters aligning with Linnan; tested training with anime dataset finetuning

* bring wan_task encoders features to common, sharing with dit

* lint, ruff

* lint, ruff

* lint, ruff

* fix CP error (input of thd_split_inputs_cp to be cu_seqlens_q_padded instead of cu_seqlens_q)

* update README_perf_test.md

* fix lint, ruff

* update uv.lock, merge main

* uv.lock

* uv.lock

* uv.lock

* update uv.lock [using ci]

---------

Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>
Co-authored-by: Abhinav Garg <abhinavg@stanford.edu>
Co-authored-by: root <root@eos0025.eos.clusters.nvidia.com>
Co-authored-by: root <root@eos0558.eos.clusters.nvidia.com>
Co-authored-by: Pablo Garay <pagaray@nvidia.com>
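
The CP fix in the commit list above (passing `cu_seqlens_q_padded` instead of `cu_seqlens_q` to `thd_split_inputs_cp`) can be illustrated with a small standalone sketch. It assumes the load-balanced scheme often used for context parallelism, where each packed sequence is padded to a multiple of `2 * cp_size` and every rank keeps one chunk plus its mirror chunk. Whether `thd_split_inputs_cp` follows exactly this scheme is not shown in this PR, and all function names below are hypothetical.

```python
from itertools import accumulate


def cumulative_lengths(seq_lens):
    """Cumulative offsets of packed sequences, starting at 0."""
    return [0, *accumulate(seq_lens)]


def padded_lengths(seq_lens, cp_size):
    """Pad each length up to a multiple of 2 * cp_size (assumed CP requirement)."""
    multiple = 2 * cp_size
    return [-(-n // multiple) * multiple for n in seq_lens]


def split_for_rank(cu_seqlens_padded, cp_size, rank):
    """Token indices of the padded packed buffer kept by one CP rank.

    Each sequence is cut into 2 * cp_size equal chunks; a rank keeps chunk
    `rank` and its mirror `2 * cp_size - 1 - rank`.
    """
    indices = []
    for start, end in zip(cu_seqlens_padded, cu_seqlens_padded[1:]):
        chunk = (end - start) // (2 * cp_size)
        for c in (rank, 2 * cp_size - 1 - rank):
            indices.extend(range(start + c * chunk, start + (c + 1) * chunk))
    return indices


if __name__ == "__main__":
    seq_lens, cp_size = [5, 7], 2
    cu_seqlens = cumulative_lengths(seq_lens)
    cu_seqlens_padded = cumulative_lengths(padded_lengths(seq_lens, cp_size))
    print("cu_seqlens:       ", cu_seqlens)
    print("cu_seqlens_padded:", cu_seqlens_padded)
    for rank in range(cp_size):
        print(f"rank {rank} keeps", split_for_rank(cu_seqlens_padded, cp_size, rank))
```

With the unpadded offsets, a sequence of length 5 cannot be cut into four equal chunks and the offsets no longer match the positions in the padded buffer, which is the kind of mismatch the commit message describes.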
4 participants