Commit e223932

Nic-Ma and wyli authored
2796 Enhance doc-string of cache datasets (#2799)
* [DLMED] enhance doc

  Signed-off-by: Nic Ma <[email protected]>

* Update monai/data/dataset.py

  Co-authored-by: Wenqi Li <[email protected]>
  Signed-off-by: Nic Ma <[email protected]>

* Update monai/data/dataset.py

  Co-authored-by: Wenqi Li <[email protected]>
  Signed-off-by: Nic Ma <[email protected]>

Co-authored-by: Wenqi Li <[email protected]>
1 parent 28856b8 commit e223932

File tree

1 file changed: +17 -10 lines changed


monai/data/dataset.py

Lines changed: 17 additions & 10 deletions
@@ -103,24 +103,28 @@ class PersistentDataset(Dataset):
     If passing slicing indices, will return a PyTorch Subset, for example: `data: Subset = dataset[1:4]`,
     for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

+    The transforms which are supposed to be cached must implement the `monai.transforms.Transform`
+    interface and should not be `Randomizable`. This dataset will cache the outcomes before the first
+    `Randomizable` `Transform` within a `Compose` instance.
+
     For example, typical input data can be a list of dictionaries::

         [{                            {                            {
-            'image': 'image1.nii.gz',    'image': 'image2.nii.gz',    'image': 'image3.nii.gz',
-            'label': 'label1.nii.gz',    'label': 'label2.nii.gz',    'label': 'label3.nii.gz',
-            'extra': 123                 'extra': 456                 'extra': 789
-        },                            },                            }]
+             'image': 'image1.nii.gz',    'image': 'image2.nii.gz',    'image': 'image3.nii.gz',
+             'label': 'label1.nii.gz',    'label': 'label2.nii.gz',    'label': 'label3.nii.gz',
+             'extra': 123                 'extra': 456                 'extra': 789
+         },                            },                            }]

     For a composite transform like

     .. code-block:: python

         [ LoadImaged(keys=['image', 'label']),
-          Orientationd(keys=['image', 'label'], axcodes='RAS'),
-          ScaleIntensityRanged(keys=['image'], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
-          RandCropByPosNegLabeld(keys=['image', 'label'], label_key='label', spatial_size=(96, 96, 96),
-                                 pos=1, neg=1, num_samples=4, image_key='image', image_threshold=0),
-          ToTensord(keys=['image', 'label'])]
+        Orientationd(keys=['image', 'label'], axcodes='RAS'),
+        ScaleIntensityRanged(keys=['image'], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
+        RandCropByPosNegLabeld(keys=['image', 'label'], label_key='label', spatial_size=(96, 96, 96),
+            pos=1, neg=1, num_samples=4, image_key='image', image_threshold=0),
+        ToTensord(keys=['image', 'label'])]

     Upon first use a filename based dataset will be processed by the transform for the
     [LoadImaged, Orientationd, ScaleIntensityRanged] and the resulting tensor written to
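The rule the new docstring text describes — deterministic transforms before the first `Randomizable` one are executed once and cached, while the random tail runs on every fetch — can be sketched in plain Python. This is a minimal illustration of the splitting logic, not MONAI's implementation; the names `split_at_first_random`, `AddOne`, and `RandNoise` are hypothetical.

```python
import random


class Transform:
    """Base class for deterministic transforms (stand-in for monai.transforms.Transform)."""
    def __call__(self, data):
        raise NotImplementedError


class Randomizable:
    """Marker mixin for transforms with random behaviour."""


class AddOne(Transform):
    def __call__(self, data):
        return data + 1


class RandNoise(Transform, Randomizable):
    def __call__(self, data):
        return data + random.random()


def split_at_first_random(transforms):
    """Split a transform chain at the first Randomizable transform.

    Returns (cacheable_prefix, runtime_suffix): the prefix can be applied
    once and its outcome cached; the suffix must run on every __getitem__.
    """
    for i, t in enumerate(transforms):
        if isinstance(t, Randomizable):
            return transforms[:i], transforms[i:]
    return transforms, []


chain = [AddOne(), AddOne(), RandNoise(), AddOne()]
prefix, suffix = split_at_first_random(chain)

cached = 0
for t in prefix:
    cached = t(cached)  # computed once, result stored in the cache
print(cached)  # -> 2 (only the two deterministic AddOne transforms ran)
```

This also makes clear why a `Randomizable` transform placed early in the chain hurts caching: everything after it is excluded from the cached prefix.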
@@ -524,7 +528,10 @@ class CacheDataset(Dataset):
     Users can set the cache rate or number of items to cache.
     It is recommended to experiment with different `cache_num` or `cache_rate` to identify the best training speed.

-    To improve the caching efficiency, please always put as many as possible non-random transforms
+    The transforms which are supposed to be cached must implement the `monai.transforms.Transform`
+    interface and should not be `Randomizable`. This dataset will cache the outcomes before the first
+    `Randomizable` `Transform` within a `Compose` instance.
+    So to improve the caching efficiency, please always put as many as possible non-random transforms
     before the randomized ones when composing the chain of transforms.
     If passing slicing indices, will return a PyTorch Subset, for example: `data: Subset = dataset[1:4]`,
     for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset
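The `CacheDataset` docstring mentions tuning `cache_num` and `cache_rate`. As a hedged sketch of how the two caps can combine — mirroring the documented intent (cache at most `cache_num` items, at most `cache_rate` of the dataset, never more than the dataset itself), not necessarily MONAI's exact code, and `effective_cache_num` is an illustrative name:

```python
import sys


def effective_cache_num(n_items: int,
                        cache_num: int = sys.maxsize,
                        cache_rate: float = 1.0) -> int:
    """Number of items to actually cache (illustrative helper).

    Capped by the absolute count `cache_num`, by `cache_rate` as a
    fraction of the dataset, and by the dataset size itself.
    """
    return min(int(cache_num), int(n_items * cache_rate), n_items)


print(effective_cache_num(100, cache_rate=0.5))  # -> 50: half of 100 items cached
```

Experimenting with these two knobs, as the docstring recommends, trades cache-initialization time and memory against per-epoch training speed.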
