src/MaxText/configs/base.yml
3 additions & 1 deletion
@@ -599,8 +599,10 @@ grain_train_mixture_config_path: '' # Path to a JSON file specifying the mixture
 grain_file_type: 'arrayrecord' # arrayrecord or parquet
 grain_worker_count: 1 # Set to -1 to enable auto-tuning: automatically determines optimal worker count. See https://google-grain.readthedocs.io/en/latest/_autosummary/grain.experimental.pick_performance_config.html
 grain_per_worker_buffer_size: 1
-# num_threads and prefetch_buffer_size are per-worker per-dataset. Used in ReadOptions (https://google-grain.readthedocs.io/en/latest/tutorials/data_loader_tutorial.html#per-worker-readoptions)
+# num_threads and prefetch_buffer_size are per-worker per-dataset.
+# When using array_records, they are used in ReadOptions (https://google-grain.readthedocs.io/en/latest/tutorials/data_loader_tutorial.html#per-worker-readoptions)
 # The default value matches that in the Grain package. If mixing multiple data sources, consider lowering these values to reduce memory usage.
+# When using parquet, grain_num_threads is the number of files to read and interleave in parallel
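As a rough illustration of the settings these comments describe, a Parquet-backed run might override the Grain keys along the following lines. The key names are the ones that appear in the diff. The values are assumptions chosen only for illustration, not defaults or recommendations from the PR.

```yaml
# Hypothetical override for a Parquet-backed run; all values are illustrative.
grain_file_type: 'parquet'        # arrayrecord or parquet
grain_worker_count: 1             # -1 would enable auto-tuning of the worker count
grain_per_worker_buffer_size: 1
# With parquet, grain_num_threads is the number of files read and
# interleaved in parallel (per worker, per dataset).
grain_num_threads: 4              # assumed value; lower it to reduce memory usage when mixing sources
```

Because the setting applies per worker and per dataset, the total number of files read in parallel grows with both the worker count and the number of mixed data sources, which is why the comments suggest lowering it for mixtures.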