
Commit 77e1e6a

modify automatic batching doc
1 parent d7816cc commit 77e1e6a

File tree

2 files changed: +19 −2 lines changed


pina/data/data_module.py

Lines changed: 10 additions & 1 deletion

@@ -81,7 +81,16 @@ def __init__(
         :param dict max_conditions_lengths: ``dict`` containing the maximum
             number of data points to consider in a single batch for
             each condition.
-        :param bool automatic_batching: Whether to enable automatic batching.
+        :param bool automatic_batching: Whether to enable automatic batching.
+            If ``True``, automatic PyTorch batching is performed, which
+            consists of extracting one element at a time from the dataset
+            and collating them into a batch. This is useful when the dataset
+            is too large to fit into memory. On the other hand, if ``False``,
+            the items are retrieved from the dataset all at once, avoiding
+            the overhead of collating them into a batch and reducing the
+            number of ``__getitem__`` calls to the dataset. This is useful
+            when the dataset fits into memory. Avoid automatic batching when
+            ``batch_size`` is large. Default is ``False``.
         :param PinaDataset dataset: The dataset where the data is stored.
         """

pina/trainer.py

Lines changed: 9 additions & 1 deletion

@@ -170,7 +170,15 @@ def _create_datamodule(
             validation dataset.
         :param int batch_size: The number of samples per batch to load.
         :param bool automatic_batching: Whether to perform automatic batching
-            with PyTorch.
+            with PyTorch. If ``True``, automatic PyTorch batching is
+            performed, which consists of extracting one element at a time
+            from the dataset and collating them into a batch. This is useful
+            when the dataset is too large to fit into memory. On the other
+            hand, if ``False``, the items are retrieved from the dataset all
+            at once, avoiding the overhead of collating them into a batch and
+            reducing the number of ``__getitem__`` calls. This is useful when
+            the dataset fits into memory. Avoid automatic batching when
+            ``batch_size`` is large. Default is ``False``.
         :param bool pin_memory: Whether to use pinned memory for faster data
             transfer to GPU.
         :param int num_workers: The number of worker threads for data loading.
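The trade-off described in both docstrings can be illustrated with plain PyTorch, independent of PINA. A minimal sketch, assuming only standard ``torch.utils.data`` behavior (the ``CountingDataset`` helper is hypothetical, written just to count ``__getitem__`` calls):

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler


class CountingDataset(Dataset):
    """Toy dataset that counts how many times __getitem__ is invoked."""

    def __init__(self, n=8):
        self.data = list(range(n))
        self.getitem_calls = 0

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        self.getitem_calls += 1
        # With a batch sampler, idx is a list of indices; otherwise an int.
        if isinstance(idx, list):
            return [self.data[i] for i in idx]
        return self.data[idx]


# Automatic batching: the DataLoader fetches one element per __getitem__
# call and collates the elements into a batch.
auto = CountingDataset()
for batch in DataLoader(auto, batch_size=4):
    pass
print(auto.getitem_calls)  # 8 -> one call per element

# Batching disabled (batch_size=None): a BatchSampler hands the whole index
# list to a single __getitem__ call, skipping per-element collation.
manual = CountingDataset()
sampler = BatchSampler(SequentialSampler(manual), batch_size=4, drop_last=False)
for batch in DataLoader(manual, sampler=sampler, batch_size=None):
    pass
print(manual.getitem_calls)  # 2 -> one call per batch
```

With eight elements and a batch size of four, the automatic path issues eight ``__getitem__`` calls while the manual path issues two, which is why the docstrings recommend disabling automatic batching when the dataset fits in memory and ``batch_size`` is large.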
