Commit 06afc5e

fix: use seq_length instead of packing_buffer_size to set the max number of tokens.

Signed-off-by: Sajad Norouzi <snorouzi@nvidia.com>

1 parent 0fd3a66 commit 06afc5e

File tree

2 files changed: +3 -3 lines changed

dfm/src/megatron/data/dit/diffusion_task_encoder_with_sp.py

Lines changed: 1 addition & 1 deletion

@@ -65,7 +65,7 @@ def select_samples_to_pack(self, samples: List[DiffusionSample]) -> List[List[Di
         """
         Selects sequences to pack for mixed image-video training.
         """
-        results = first_fit_decreasing(samples, self.packing_buffer_size)
+        results = first_fit_decreasing(samples, self.seq_length)
         random.shuffle(results)
         return results
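The fix above passes the per-pack token budget (`seq_length`) to `first_fit_decreasing` instead of the buffer size, so each packed group is capped by sequence length rather than by how many samples were buffered. A minimal sketch of first-fit-decreasing bin packing under a token budget, assuming each sample's token count is given by `len()` (this is an illustrative standalone function, not the project's actual implementation):

```python
from typing import List, Sequence


def first_fit_decreasing(samples: Sequence[str], max_tokens: int) -> List[List[str]]:
    """Pack samples into groups so each group holds at most max_tokens tokens."""
    bins: List[List[str]] = []   # packed groups of samples
    bin_sizes: List[int] = []    # running token count per group
    # Sort longest-first so large samples claim bins early (the "decreasing" step).
    for sample in sorted(samples, key=len, reverse=True):
        for i, size in enumerate(bin_sizes):
            if size + len(sample) <= max_tokens:  # first group with room wins
                bins[i].append(sample)
                bin_sizes[i] += len(sample)
                break
        else:
            bins.append([sample])                 # no group fits: open a new one
            bin_sizes.append(len(sample))
    return bins


# Four samples of 1500, 900, 600, and 400 tokens packed under a 2048-token budget
# (cf. seq_length=2048 in the recipe below).
packs = first_fit_decreasing(["a" * 1500, "b" * 600, "c" * 400, "d" * 900], 2048)
```

With a 2048-token budget, the 1500- and 400-token samples share one pack and the 900- and 600-token samples share another; no pack exceeds the budget.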

dfm/src/megatron/recipes/dit/dit.py

Lines changed: 2 additions & 2 deletions

@@ -185,8 +185,8 @@ def pretrain_config(
         dataset=DiffusionDataModuleConfig(
             path=dataset_path,
             seq_length=2048,
-            task_encoder_seq_length=2048,
-            packing_buffer_size=8000,
+            task_encoder_seq_length=8000,
+            packing_buffer_size=32,
             micro_batch_size=micro_batch_size,
             global_batch_size=global_batch_size,
             num_workers=10,
