To train a sock detection model you need a labeled dataset — a set of photos with bounding boxes marking where the socks are in each image.
Photos are captured directly from the Raspberry Pi camera. The script takes a series of shots with pauses so you can reposition the sock:
# On Raspberry Pi (requires sudo for camera access)
sudo ./main.py shot --count 200 --output-dir images

Parameters:
- `--count`: number of shots (default 200)
- `--output-dir`: output folder (default `images/`)
- `--pause`: pause between shots in seconds (default 2)
- `--move-pause`: pause every 10 shots to reposition the sock (default 10)
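The timing these parameters describe can be sketched as follows. This is a minimal illustration, not the actual `main.py`: the function name `capture_series`, the `capture_fn` and `sleep_fn` callbacks, and the file naming are hypothetical, and it assumes the longer pause replaces the short one after every tenth shot.

```python
import time
from pathlib import Path
from typing import Callable


def capture_series(
    capture_fn: Callable[[Path], None],
    count: int = 200,
    output_dir: str = "images",
    pause: float = 2.0,
    move_pause: float = 10.0,
    batch_size: int = 10,
    sleep_fn: Callable[[float], None] = time.sleep,
) -> list:
    """Take `count` shots, waiting `pause` between shots and `move_pause` after each batch."""
    out = Path(output_dir)
    saved = []
    for i in range(1, count + 1):
        path = out / f"shot_{i:04d}.jpg"  # hypothetical naming scheme
        capture_fn(path)  # hypothetical callback that grabs a frame and writes it to `path`
        saved.append(path)
        if i == count:
            break  # nothing to wait for after the final shot
        # Assumption: every `batch_size` shots the longer "reposition" pause is used instead.
        sleep_fn(move_pause if i % batch_size == 0 else pause)
    return saved
```

Passing the camera access and the sleep function in as callbacks keeps the loop easy to dry-run on a machine without a camera.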
- Variety of angles: capture socks from different sides, at different distances from the camera
- Different lighting: daylight, artificial light, shadows
- Different surfaces: floor, carpet, under furniture, on a couch
- Different socks: various colors, sizes, folded and unfolded
- Background: capture in real apartment conditions, not on a plain background
- Quantity: more is better. Aim for at least 200 photos; 500 or more is ideal
Roboflow is a platform for managing computer vision datasets. The free tier covers all our needs.
- Sign up at roboflow.com
- Create a new project: Object Detection, 1 class: `sock`
- Upload all photos (Upload Data)
On each photo you need to draw a bounding box (rectangle) around every sock and assign the class sock.
Roboflow provides a built-in annotation editor:
- Select the Bounding Box tool
- Draw a box around each sock in the photo
- Assign the class `sock`
- Repeat for all photos
Tip: if a sock is partially hidden (e.g., under a chair), still annotate the visible part. The model will learn to recognize partially visible socks.
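For reference, a box drawn in the editor ends up in the exported YOLO label file as one line of normalized numbers. The sketch below shows that conversion for a single box; the helper name `to_yolo_bbox` and the pixel coordinates are made up for illustration.

```python
def to_yolo_bbox(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to normalized (x_center, y_center, width, height)."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return x_center, y_center, width, height


# Hypothetical example: a sock occupying pixels (100, 200)-(300, 400) in a 640x480 photo.
box = to_yolo_bbox(100, 200, 300, 400, img_w=640, img_h=480)
label_line = "0 " + " ".join(f"{v:.6f}" for v in box)  # class index 0 = sock
print(label_line)  # 0 0.312500 0.625000 0.312500 0.416667
```

Because every value is divided by the image size, the same label stays valid if the image is resized.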
Augmentation automatically creates variations of existing images to enlarge the dataset. Roboflow can apply:
- Rotation — random rotation ±15°
- Crop — random crop up to 10% from each side
- Brightness — brightness change ±20%
- Blur — slight blur (up to 2.5 px)
- Noise — adding noise (up to 3%)
When you create a new dataset version (Create New Version), Roboflow applies the selected augmentations and increases the number of training images.
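Geometric augmentations such as rotation move the sock within the frame, so the bounding boxes have to be transformed together with the image. The sketch below shows one common way to do that for rotation: rotate the four corners and take the enclosing axis-aligned box. It illustrates the idea only and is not a description of Roboflow's internal implementation; the function name and the example box are hypothetical.

```python
import math


def rotate_box(x_min, y_min, x_max, y_max, angle_deg, cx, cy):
    """Rotate a pixel-space box around (cx, cy) and return the enclosing axis-aligned box."""
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    xs, ys = [], []
    for x, y in corners:
        dx, dy = x - cx, y - cy
        xs.append(cx + dx * cos_a - dy * sin_a)
        ys.append(cy + dx * sin_a + dy * cos_a)
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical 200x100 px box centred in a 640x480 image, rotated by the maximum of 15 degrees.
x0, y0, x1, y1 = rotate_box(220, 190, 420, 290, angle_deg=15, cx=320, cy=240)
new_width, new_height = x1 - x0, y1 - y0
```

The enclosing box is larger than the original, so labels on rotated images fit the sock more loosely. That is one reason to keep the rotation range small.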
Quick rule of thumb:
- use Roboflow if you want the fastest path from photos to a trainable dataset;
- use `albumentations` if you want a reproducible, code-based augmentation pipeline.
If you want a reproducible local pipeline instead of Roboflow, a common choice is albumentations. It is widely used for computer vision and supports bounding boxes.
Install:
pip install albumentations opencv-python

Example augmentation pipeline for a YOLO-style dataset:
import cv2
import albumentations as A

# Each transform is applied with probability p; boxes are updated together with the image.
transform = A.Compose(
    [
        A.Rotate(limit=15, border_mode=cv2.BORDER_CONSTANT, p=0.5),
        A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
        A.GaussianBlur(blur_limit=(3, 5), p=0.2),
        A.GaussNoise(std_range=(0.02, 0.08), p=0.2),
        A.RandomCropFromBorders(crop_left=0.1, crop_right=0.1, crop_top=0.1, crop_bottom=0.1, p=0.2),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# OpenCV loads images as BGR; convert to RGB before augmenting.
image = cv2.imread("dataset/train/images/example.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

bboxes = [
    [0.52, 0.48, 0.30, 0.22],  # x_center, y_center, width, height (YOLO format)
]
class_labels = ["sock"]

augmented = transform(image=image, bboxes=bboxes, class_labels=class_labels)
augmented_image = augmented["image"]
augmented_bboxes = augmented["bboxes"]

When this approach is useful:
- you want reproducible augmentations in code
- you need to version-control the augmentation pipeline
- you want to run the same transforms locally, in notebooks, or in CI
For this project, Roboflow is still the fastest path for dataset management, but albumentations is a good alternative if you prefer a code-based workflow.
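The albumentations example above stops at `augmented_bboxes`. To train on the result, the boxes need to be written back as a YOLO label file next to the saved image. A minimal sketch, assuming a single class with index 0; the helper names are hypothetical:

```python
from pathlib import Path


def yolo_label_lines(bboxes, class_ids):
    """Format boxes as YOLO label lines: '<class_id> <x_center> <y_center> <width> <height>'."""
    lines = []
    for (xc, yc, w, h), cls in zip(bboxes, class_ids):
        lines.append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


def write_yolo_labels(label_path, bboxes, class_ids):
    """Write one label file; an empty file means no box survived the augmentation."""
    Path(label_path).write_text("\n".join(yolo_label_lines(bboxes, class_ids)))


# Example with the box from the pipeline above (class index 0 = sock).
print(yolo_label_lines([(0.52, 0.48, 0.30, 0.22)], [0]))
```

When saving the augmented image with `cv2.imwrite`, convert it back to BGR first with `cv2.cvtColor(augmented_image, cv2.COLOR_RGB2BGR)`, and give the label file the same base name as the image.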
After creating a version, download the dataset in YOLOv8 format:
- Click Download Dataset
- Select format YOLOv8
- Choose download zip to computer or show download code
Or download via terminal:
curl -L "https://app.roboflow.com/ds/YOUR_DATASET_URL" > roboflow.zip
unzip roboflow.zip
rm roboflow.zip

After extraction, the dataset has the following structure:
dataset/
├── train/
│   ├── images/     # Training images (~70%)
│   └── labels/     # Annotations in YOLO format (txt)
├── valid/
│   ├── images/     # Validation images (~20%)
│   └── labels/
├── test/
│   ├── images/     # Test images (~10%)
│   └── labels/
└── data.yaml       # Dataset config
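After unpacking, it can be worth checking that the layout matches this tree and that every image has a label file. The sketch below does that with the standard library only. The function names and the list of image extensions are assumptions, and an image without a `.txt` file is only reported, not treated as an error.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed extensions


def summarize_splits(root):
    """Return {split: (n_images, n_images_without_label_file)} for train/valid/test."""
    summary = {}
    for split in ("train", "valid", "test"):
        images_dir = Path(root) / split / "images"
        labels_dir = Path(root) / split / "labels"
        images = []
        if images_dir.is_dir():
            images = [p for p in images_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
        missing = sum(1 for p in images if not (labels_dir / f"{p.stem}.txt").exists())
        summary[split] = (len(images), missing)
    return summary


def split_fractions(summary):
    """Share of all images that falls into each split."""
    total = sum(n for n, _ in summary.values())
    return {split: (n / total if total else 0.0) for split, (n, _) in summary.items()}
```

For example, `split_fractions(summarize_splits("dataset"))` should come out close to the 70/20/10 proportions noted in the tree. If augmentation was applied, the training share will be larger, since augmented copies are added to the training set.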
The data.yaml file describes paths and classes:
train: dataset/train/images
val: dataset/valid/images
test: dataset/test/images
nc: 1
names: ['sock']

The current dataset version contains 961 images (Roboflow workspace: socks-axfcs, project: socks1, version 2).
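The workspace, project, and version named above are also what the `roboflow` Python package needs to fetch the dataset, which is roughly what the "show download code" option generates. A hedged sketch: it assumes `pip install roboflow` and a valid API key, and the function name and the `ROBOFLOW_API_KEY` variable in the usage comment are illustrative, not part of this project.

```python
def download_sock_dataset(api_key, workspace="socks-axfcs", project="socks1", version=2):
    """Download the dataset in YOLOv8 format and return the local folder path."""
    # Imported lazily so the module can be loaded without the package installed.
    from roboflow import Roboflow  # requires: pip install roboflow

    rf = Roboflow(api_key=api_key)
    dataset = rf.workspace(workspace).project(project).version(version).download("yolov8")
    return dataset.location  # folder that contains data.yaml


# Usage (needs network access and a Roboflow API key):
#   import os
#   print(download_sock_dataset(os.environ["ROBOFLOW_API_KEY"]))
```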