Commit ad9849c

Fix data_load_start position (#1481)
# Fix incorrect data loading time measurement

This PR fixes the timing of the data_loading_times measurement in batch_generator. Previously, the timer started after calling next(data_iterator), which excluded the actual data fetching time from the measurement. Now, the timer starts before the next() call to correctly capture the full DataLoader latency.
1 parent 1080c8f commit ad9849c

1 file changed: +1 -1 lines changed

torchtitan/train.py

Lines changed: 1 addition & 1 deletion
@@ -373,13 +373,13 @@ def batch_generator(
         data_iterator = iter(data_iterable)
 
         while True:
+            data_load_start = time.perf_counter()
             try:
                 batch = next(data_iterator)
             except StopIteration as ex:
                 # If data runs out during gradient accumulation, that
                 # entire step will not be executed.
                 raise DataloaderStopIteration() from ex
-            data_load_start = time.perf_counter()
             input_dict, labels = batch
             ntokens_batch = labels.numel()
             self.ntokens_seen += ntokens_batch
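
The effect of moving the timer can be reproduced outside torchtitan with a small sketch. Everything here is illustrative: `slow_batches` is a hypothetical stand-in for a DataLoader, `measure` is not part of torchtitan, and the point where the timer is stopped is an assumption, since the diff above does not show it.

```python
import time


def slow_batches(n_batches, delay_s):
    """Hypothetical stand-in for a DataLoader: each batch takes delay_s to fetch."""
    for i in range(n_batches):
        time.sleep(delay_s)  # simulated fetch latency
        yield i


def measure(start_before_next, n_batches=3, delay_s=0.02):
    """Return per-batch load times with the timer started before or after next()."""
    data_iterator = iter(slow_batches(n_batches, delay_s))
    times = []
    while True:
        if start_before_next:
            # Placement after this commit: the fetch is inside the timed region.
            data_load_start = time.perf_counter()
        try:
            batch = next(data_iterator)
        except StopIteration:
            break
        if not start_before_next:
            # Placement before this commit: the fetch has already finished.
            data_load_start = time.perf_counter()
        # Assumed stop point: immediately after the batch is available.
        times.append(time.perf_counter() - data_load_start)
    return times


fixed = measure(start_before_next=True)
old = measure(start_before_next=False)
print("timer before next():", [f"{t:.4f}" for t in fixed])
print("timer after next(): ", [f"{t:.4f}" for t in old])
```

With the timer started before next(), each recorded time includes the simulated fetch delay. With the old placement, the recorded times are close to zero regardless of how slow the iterator is, which matches the under-reporting the commit message describes.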
