Commit 313fb50

Update comments
1 parent 84bafb3 commit 313fb50

1 file changed: +9 -5 lines changed

trainer/train_from_cached.py

Lines changed: 9 additions & 5 deletions
@@ -309,13 +309,17 @@ def main():
 
     shortest_dl_len = min(len(dl) for dl in dataloaders)
     if len(dataloaders) > 1:
-        print("Will truncate all to shortest length:", shortest_dl_len * bs)
-        # Truncation is effectively done by use of zip, lower down.
-        # We dont actually change the objs here. But we DO use this to calculate
-        # steps_per_epoch, which is important
+        print("Common shortest effective length:", shortest_dl_len * bs)
+        # Our use of zip lower down, effectively truncates them all to the same length,
+        # PER RUN.
+        # We dont actually change the length here. But we DO use this to calculate
+        # steps_per_epoch, which is important.
+        # Note that the longer datasets will use DIFFERENT subsets of themselves
+        # over each epoch!!
 
     steps_per_epoch = shortest_dl_len * len(dataloaders)
-    # dl count already divided by mini batch
+    # dl count already divided by micro batch size.
+    # So now calculate EBS steps
     steps_per_epoch = steps_per_epoch // accum
 
     if args.max_steps and args.max_steps.endswith("e"):
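
The updated comments describe two things: `zip` stops at the shortest dataloader, and `steps_per_epoch` is derived from that shortest length. Below is a minimal sketch of that arithmetic, not the code from `train_from_cached.py`. The helper names `steps_per_epoch_for` and `interleaved_batches` are hypothetical, plain lists stand in for the real dataloaders, and the assumption that each `zip` step consumes one micro batch from every dataloader is inferred from the `shortest_dl_len * len(dataloaders)` expression rather than confirmed by the diff.

```python
# Sketch only: plain lists stand in for dataloaders (hypothetical helpers).

def steps_per_epoch_for(dl_lengths, accum):
    """Optimizer (effective batch size) steps per epoch.

    dl_lengths: number of micro batches in each dataloader.
    accum: gradient accumulation factor.
    """
    shortest = min(dl_lengths)
    # Assumed: one micro batch per dataloader for every zip step.
    micro_steps = shortest * len(dl_lengths)
    return micro_steps // accum


def interleaved_batches(dataloaders):
    """Yield batches round-robin; zip stops at the shortest dataloader."""
    for group in zip(*dataloaders):
        for batch in group:
            yield batch


dl_a = ["a0", "a1", "a2"]              # 3 micro batches
dl_b = ["b0", "b1", "b2", "b3", "b4"]  # 5 micro batches

batches = list(interleaved_batches([dl_a, dl_b]))
print(batches)  # b3 and b4 are never reached in this pass
print(steps_per_epoch_for([len(dl_a), len(dl_b)], accum=2))
```

If the longer dataloader reshuffles on every pass (an assumption about how it is configured), the batches cut off by `zip` differ from epoch to epoch, which is what the note about longer datasets using different subsets of themselves refers to.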
