Commit 660c645

Add albumentations to use dataset (#7596)
* Added example with albumentations to the use_dataset tutorial
* cleanup
* cleanup
* Update use_dataset tutorial to integrate Albumentations for data augmentation
  - Replaced torchvision transforms with Albumentations for image augmentation.
  - Renumbered sections for clarity and updated descriptions accordingly.
  - Emphasized key points for using Albumentations with 🤗 Datasets.
* Update use_dataset tutorial to finalize dataset preparation steps
  - Added a new key point emphasizing that the dataset is ready for training with machine learning frameworks.
  - Ensured clarity by maintaining consistent formatting and structure.
1 parent bb66b6c commit 660c645

1 file changed: +49 -8 lines changed

docs/source/use_dataset.mdx

Lines changed: 49 additions & 8 deletions
@@ -175,22 +175,63 @@ Most image models expect the image to be in the RGB mode. The Beans images are a
 >>> dataset = dataset.cast_column("image", Image(mode="RGB"))
 ```

-**3**. Now, you can apply some transforms to the image. Feel free to take a look at the [various transforms available](https://pytorch.org/vision/stable/auto_examples/plot_transforms.html#sphx-glr-auto-examples-plot-transforms-py) in torchvision and choose one you'd like to experiment with. This example applies a transform that randomly rotates the image:
+**3**. Now let's apply data augmentations to your images. 🤗 Datasets works with any augmentation library, and in this example we'll use Albumentations.
+
+### Using Albumentations
+
+[Albumentations](https://albumentations.ai) is a popular image augmentation library that provides a [rich set of transforms](https://albumentations.ai/docs/reference/supported-targets-by-transform/) including spatial-level transforms, pixel-level transforms, and mixing-level transforms. When running on CPU, which is typical for transformers pipelines, Albumentations is [faster than torchvision](https://albumentations.ai/docs/benchmarks/image-benchmarks/).
+
+Install Albumentations:
+
+```bash
+pip install albumentations
+```
+
+**4**. Create a typical augmentation pipeline with Albumentations:

 ```py
->>> from torchvision.transforms import RandomRotation
+>>> import albumentations as A
+>>> import numpy as np
+>>> from PIL import Image
+
+>>> transform = A.Compose([
+...     A.RandomCrop(height=256, width=256, pad_if_needed=True, p=1),
+...     A.HorizontalFlip(p=0.5),
+...     A.ColorJitter(p=0.5)
+... ])
+```
+
+**5**. Since 🤗 Datasets uses PIL images but Albumentations expects OpenCV format (numpy arrays), you need to convert between formats:

->>> rotate = RandomRotation(degrees=(0, 90))
->>> def transforms(examples):
-...     examples["pixel_values"] = [rotate(image) for image in examples["image"]]
+```py
+>>> def albumentations_transforms(examples):
+...     # Apply Albumentations transforms
+...     transformed_images = []
+...     for image in examples["image"]:
+...         # Convert PIL to numpy array (OpenCV format)
+...         image_np = np.array(image.convert("RGB"))
+...
+...         # Apply Albumentations transforms
+...         transformed_image = transform(image=image_np)["image"]
+...
+...         # Convert back to PIL Image
+...         pil_image = Image.fromarray(transformed_image)
+...         transformed_images.append(pil_image)
+...
+...     examples["pixel_values"] = transformed_images
 ...     return examples
 ```

-**4**. Use the [`~Dataset.set_transform`] function to apply the transform on-the-fly. When you index into the image `pixel_values`, the transform is applied, and your image gets rotated.
+**6**. Apply the transform using [`~Dataset.set_transform`]:

 ```py
->>> dataset.set_transform(transforms)
+>>> dataset.set_transform(albumentations_transforms)
 >>> dataset[0]["pixel_values"]
 ```

-**5**. The dataset is now ready for training with your machine learning framework!
+**Key points when using Albumentations with 🤗 Datasets:**
+- Convert PIL images to numpy arrays before applying transforms
+- Albumentations returns a dictionary with the transformed image under the "image" key
+- Convert the result back to PIL format after transformation
+
+**7**. The dataset is now ready for training with your machine learning framework!
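
Note on steps 5 and 6: the transform passed to `set_transform` receives a batch (a dict mapping column names to lists) and runs lazily each time the dataset is indexed. The sketch below imitates that behavior using only the standard library. `LazyTransformDataset` and `flip_transforms` are hypothetical names made up for illustration, plain lists of integers stand in for images, and a list reversal stands in for the Albumentations pipeline; the real 🤗 Datasets internals may differ.

```python
import random


class LazyTransformDataset:
    """Hypothetical stand-in that mimics on-the-fly transforms.

    This is NOT the 🤗 Datasets implementation, only a toy model of it.
    """

    def __init__(self, columns):
        # columns: dict mapping a column name to a list of values
        self._columns = columns
        self._transform = None

    def set_transform(self, transform):
        # Only store the function; nothing is computed yet.
        self._transform = transform

    def __len__(self):
        return len(next(iter(self._columns.values())))

    def __getitem__(self, index):
        # Build a batch of size one: every column maps to a list.
        batch = {name: [values[index]] for name, values in self._columns.items()}
        if self._transform is not None:
            batch = self._transform(batch)
        # Unwrap the single-row batch back into one example.
        return {name: values[0] for name, values in batch.items()}


def flip_transforms(examples):
    # Toy augmentation (assumption: a list of ints stands in for an image).
    # Each "image" is reversed with probability 0.5, like a horizontal flip.
    examples["pixel_values"] = [
        image[::-1] if random.random() < 0.5 else list(image)
        for image in examples["image"]
    ]
    return examples


dataset = LazyTransformDataset({"image": [[1, 2, 3], [4, 5, 6]], "labels": [0, 1]})
dataset.set_transform(flip_transforms)
example = dataset[0]  # the transform runs here, not in set_transform
```

Because the transform runs at access time, the stored `image` column is left untouched, and reading the same index twice can return differently augmented `pixel_values`.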
