Dataset Preparation

🇷🇺 Russian version

To train a sock detection model, you need a labeled dataset: a set of photos with bounding boxes marking where the socks are in each image.

Capturing photos

Running the camera

Photos are captured directly from the Raspberry Pi camera. The script takes a series of shots with pauses so you can reposition the sock:

# On Raspberry Pi (requires sudo for camera access)
sudo ./main.py shot --count 200 --output-dir images

Parameters:

--count — number of shots (default 200)
--output-dir — output folder (default images/)
--pause — pause between shots in seconds (default 2)
--move-pause — pause every 10 shots to reposition the sock (default 10)
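The interplay between the two pauses can be sketched as follows. This is a hypothetical illustration, not the actual code of main.py: the argparse layout and the `pause_after` helper are made up for this example, and it assumes that both pauses are in seconds and that the longer pause replaces the regular one after every 10th shot.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented `shot` options."""
    parser = argparse.ArgumentParser(prog="main.py shot")
    parser.add_argument("--count", type=int, default=200, help="number of shots")
    parser.add_argument("--output-dir", default="images", help="output folder")
    parser.add_argument("--pause", type=float, default=2, help="pause between shots")
    parser.add_argument("--move-pause", type=float, default=10,
                        help="longer pause after every 10th shot")
    return parser


def pause_after(shot_number: int, pause: float, move_pause: float,
                move_every: int = 10) -> float:
    """Return how long to wait after the given 1-based shot number."""
    if shot_number % move_every == 0:
        return move_pause  # time to reposition the sock
    return pause


args = build_parser().parse_args(["--count", "25"])
schedule = [pause_after(n, args.pause, args.move_pause)
            for n in range(1, args.count + 1)]
print(f"{args.count} shots, about {sum(schedule):.0f} s of waiting in total")
```

A quick estimate like this helps when planning a longer capture session, since the waiting time grows with both --count and --move-pause.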

Tips for capturing

Variety of angles: capture socks from different sides, at different distances from the camera
Different lighting: daylight, artificial light, shadows
Different surfaces: floor, carpet, under furniture, on a couch
Different socks: various colors, sizes, folded and unfolded
Background: capture in real apartment conditions, not on a plain background
Quantity: more is better. Aim for at least 200 photos; 500 or more is ideal

Uploading to Roboflow

Roboflow is a platform for managing computer vision datasets. The free tier covers all our needs.

Sign up at roboflow.com
Create a new project: Object Detection, 1 class — sock
Upload all photos (Upload Data)

Annotation (labeling)

On each photo you need to draw a bounding box (rectangle) around every sock and assign the class sock.

Roboflow provides a built-in annotation editor:

Select the Bounding Box tool
Draw a box around each sock in the photo
Assign class sock
Repeat for all photos

Tip: if a sock is partially hidden (e.g., under a chair), still annotate the visible part. The model will learn to recognize partially visible socks.

Augmentation

Augmentation automatically creates variations of existing images to enlarge the dataset. Roboflow can apply:

Rotation — random rotation ±15°
Crop — random crop up to 10% from each side
Brightness — brightness change ±20%
Blur — slight blur (up to 2.5 px)
Noise — adding noise (up to 3%)

When you create a new dataset version (Create New Version), Roboflow applies the selected augmentations and increases the number of training images.
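To make "random within a range" concrete, here is a minimal sketch that draws one setting per augmentation from the ranges listed above. It does not reproduce Roboflow's internal sampling, which is not described here: the uniform distribution and the dictionary key names are assumptions made for illustration.

```python
import random

# Ranges copied from the list above; the key names are illustrative only.
AUGMENTATION_RANGES = {
    "rotation_deg": (-15.0, 15.0),
    "crop_fraction": (0.0, 0.10),
    "brightness_delta": (-0.20, 0.20),
    "blur_px": (0.0, 2.5),
    "noise_fraction": (0.0, 0.03),
}


def sample_augmentation(rng: random.Random) -> dict:
    """Draw one random setting per augmentation, uniformly within its range."""
    return {name: rng.uniform(low, high)
            for name, (low, high) in AUGMENTATION_RANGES.items()}


rng = random.Random(0)  # fixed seed so the same variants come out on every run
variants = [sample_augmentation(rng) for _ in range(3)]
for variant in variants:
    print({name: round(value, 3) for name, value in variant.items()})
```

Each printed dictionary corresponds to one augmented copy of a source photo, which is why a new version contains more training images than were uploaded.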

Quick rule of thumb:

use Roboflow if you want the fastest path from photos to a trainable dataset;
use albumentations if you want a reproducible, code-based augmentation pipeline.

Alternative: local augmentation via Python (`albumentations`)

If you want a reproducible local pipeline instead of Roboflow, a common choice is albumentations. It is widely used for computer vision and supports bounding boxes.

Install:

pip install albumentations opencv-python

Example augmentation pipeline for a YOLO-style dataset:

import cv2
import albumentations as A

transform = A.Compose(
    [
        A.Rotate(limit=15, border_mode=cv2.BORDER_CONSTANT, p=0.5),
        A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
        A.GaussianBlur(blur_limit=(3, 5), p=0.2),
        A.GaussNoise(std_range=(0.02, 0.08), p=0.2),
        A.RandomCropFromBorders(crop_left=0.1, crop_right=0.1, crop_top=0.1, crop_bottom=0.1, p=0.2),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

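image = cv2.imread("dataset/train/images/example.jpg")
if image is None:  # cv2.imread returns None instead of raising on a bad path
    raise FileNotFoundError("Could not read dataset/train/images/example.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert to RGB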

bboxes = [
    [0.52, 0.48, 0.30, 0.22],  # x_center, y_center, width, height (YOLO format)
]
class_labels = ["sock"]

augmented = transform(image=image, bboxes=bboxes, class_labels=class_labels)
augmented_image = augmented["image"]
augmented_bboxes = augmented["bboxes"]
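The example above stops at the augmented arrays. To use the result for training, the boxes have to be written back as a YOLO label file. The sketch below shows one way to format them; `yolo_label_lines` is a hypothetical helper, and the stand-in values replace `augmented["bboxes"]` and `augmented["class_labels"]` so the snippet runs without OpenCV or albumentations installed.

```python
def yolo_label_lines(bboxes, class_labels, class_names=("sock",)):
    """Format boxes as YOLO label lines: class_id x_center y_center width height."""
    lines = []
    for (x_center, y_center, width, height), label in zip(bboxes, class_labels):
        class_id = class_names.index(label)  # "sock" -> 0 in this one-class dataset
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


# Stand-in values; in practice pass augmented["bboxes"] and augmented["class_labels"].
augmented_bboxes = [(0.52, 0.48, 0.30, 0.22)]
augmented_labels = ["sock"]

for line in yolo_label_lines(augmented_bboxes, augmented_labels):
    print(line)
```

Write these lines to a .txt file that has the same base name as the augmented image and lives in the matching labels/ folder. When saving the image itself with cv2.imwrite, convert it back from RGB to BGR first.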

When this approach is useful:

you want reproducible augmentations in code
you need to version-control the augmentation pipeline
you want to run the same transforms locally, in notebooks, or in CI

For this project, Roboflow is still the fastest path for dataset management, but albumentations is a good alternative if you prefer a code-based workflow.

Exporting the dataset

After creating a version, download the dataset in YOLOv8 format:

Click Download Dataset
Select format YOLOv8
Choose download zip to computer or show download code

Or download via terminal:

curl -L "https://app.roboflow.com/ds/YOUR_DATASET_URL" > roboflow.zip
unzip roboflow.zip
rm roboflow.zip
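If you prefer to stay in Python, the same three steps (download, unzip, remove the archive) can be sketched with the standard library only. The function names are hypothetical, and the URL is the same placeholder as in the curl command; replace it with your own export link.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_zip(url: str, zip_path: Path) -> None:
    """Download the export archive, like `curl -L ... > roboflow.zip`."""
    urllib.request.urlretrieve(url, zip_path)


def unpack_and_remove(zip_path: Path, dest_dir: Path) -> list:
    """Extract the archive into dest_dir, delete the zip, return extracted names."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest_dir)
    zip_path.unlink()
    return names


def main() -> None:
    # Placeholder URL; substitute the link Roboflow shows for your dataset version.
    url = "https://app.roboflow.com/ds/YOUR_DATASET_URL"
    archive_path = Path("roboflow.zip")
    download_zip(url, archive_path)
    print(unpack_and_remove(archive_path, Path(".")))
```

Call `main()` once the placeholder has been replaced. Like the shell commands, it extracts into the current directory.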

Dataset structure

After extraction the dataset has the following structure:

dataset/
├── train/
│   ├── images/        # Training images (~70%)
│   └── labels/        # Annotations in YOLO format (txt)
├── valid/
│   ├── images/        # Validation images (~20%)
│   └── labels/
├── test/
│   ├── images/        # Test images (~10%)
│   └── labels/
└── data.yaml          # Dataset config
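Each file in labels/ contains one line per annotated sock in the form `class_id x_center y_center width height`, with all four coordinates normalized to the 0..1 range relative to the image size. The sketch below converts such a line back to pixel corners, which is handy for spot-checking annotations. The helper name and the 640x480 image size are assumptions chosen for the example.

```python
def yolo_to_pixels(line: str, image_width: int, image_height: int):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_center, y_center, width, height = line.split()
    # Scale the normalized values by the image dimensions.
    x_center, width = float(x_center) * image_width, float(width) * image_width
    y_center, height = float(y_center) * image_height, float(height) * image_height
    return (
        int(class_id),
        round(x_center - width / 2),
        round(y_center - height / 2),
        round(x_center + width / 2),
        round(y_center + height / 2),
    )


# One sock box on a 640x480 image (the image size is assumed for illustration).
print(yolo_to_pixels("0 0.52 0.48 0.30 0.22", 640, 480))
```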

The data.yaml file describes paths and classes:

train: dataset/train/images
val: dataset/valid/images
test: dataset/test/images

nc: 1
names: ['sock']
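Before training, it can be worth checking that the extracted folders match this layout. The following sketch counts images per split and lists images without a label file. It assumes the dataset/ layout shown above and the usual image extensions; the function names are hypothetical. A missing label file is not necessarily an error, since an image that contains no socks may have no annotations.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_split(split_dir: Path) -> dict:
    """Count images in a split and list images without a matching label file."""
    images = sorted(
        p for p in (split_dir / "images").glob("*")
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    label_stems = {p.stem for p in (split_dir / "labels").glob("*.txt")}
    missing = [p.name for p in images if p.stem not in label_stems]
    return {"images": len(images), "missing_labels": missing}


def summarize_dataset(root: Path, splits=("train", "valid", "test")) -> dict:
    """Summarize every split directory that exists under the dataset root."""
    return {name: summarize_split(root / name)
            for name in splits if (root / name).is_dir()}


if Path("dataset").is_dir():
    summary = summarize_dataset(Path("dataset"))
    total = sum(stats["images"] for stats in summary.values()) or 1
    for name, stats in summary.items():
        share = 100 * stats["images"] / total
        print(f"{name}: {stats['images']} images ({share:.0f}%), "
              f"{len(stats['missing_labels'])} without labels")
```

The printed shares should land roughly at the 70/20/10 split noted in the directory tree.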

The current dataset version contains 961 images (Roboflow workspace: socks-axfcs, project: socks1, version 2).

