TinyYolo

If you like tiny code, PyTorch and Yolo, then you'll like TinyYolo.

What this is

This repo uses the new "Tiny-Oriented-Programming" paradigm invented by TinyGrad to implement a set of popular Yolo models and sample assignment algorithms.
No YAML files, just (hopefully) good, modular, readable, minimal yet complete code.
This is a library, not a framework.
This library is for developers more so than for users.

What's provided

models.py: contains all the Yolo models, which automatically calculate the loss when targets are passed to the forward function.
assigner.cpp: contains various sample assignment algorithms written in pure PyTorch C++.
test.py: tests the models with pretrained weights from the darknet, ultralytics, Yolov6 and Yolov7 official repos.
train_coco.py: an example training script that trains on COCO using lightning.

Models

Assigners

Example

import torch

nc  = ... # number of classes, e.g. 80 for COCO
net = Yolov3(nc, spp=True).eval()
# net = Yolov4(nc).eval()
# net = Yolov5('n', nc).eval()
# net = Yolov6('n', nc).eval()
# net = Yolov7(nc).eval()
# net = Yolov8('n', nc).eval()
# net = Yolov10('n', nc).eval()
# net = Yolov11('n', nc).eval()
# net = Yolov12('n', nc).eval()
# net = Yolov26('n', nc).eval()

# Inference only
B = ... # Batch size
H = ... # Height
W = ... # Width
x = torch.randn(B, 3, H, W) # image-like
preds = net(x) # preds.shape == [B, N, nc+5], or [B, N, nc+4] if there's no objectness feature, e.g. V5, V6, V8, V10

# Train
D = ... # max number of target detections per batch
C = 5   # target features [x1,y1,x2,y2,cls] where x1,x2 ∈ [0,W], y1,y2 ∈ [0,H], cls ∈ [-1,nc) and -1 is used for padding
targets = torch.randn(B, D, C)
preds, loss_dict = net(x, targets)
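
In practice the targets come from annotations rather than random numbers. The sketch below shows one way to pack a variable number of boxes per image into the padded [B, D, 5] layout described above. `pad_targets` is a hypothetical helper, not part of this repo, and filling the padded box coordinates with zeros is an assumption: the only documented convention is cls = -1 for padding.

```python
import torch

def pad_targets(boxes_per_image, classes_per_image):
    """Pack per-image boxes [x1,y1,x2,y2] and class ids into one [B, D, 5] tensor.

    Rows beyond the number of real boxes in an image are padding: cls = -1.
    (Zero box coordinates for padded rows are an assumption of this sketch.)
    """
    B = len(boxes_per_image)
    D = max(1, max((len(b) for b in boxes_per_image), default=0))
    targets = torch.zeros(B, D, 5)
    targets[..., 4] = -1  # mark every row as padding first
    for i, (boxes, classes) in enumerate(zip(boxes_per_image, classes_per_image)):
        n = len(boxes)
        if n == 0:
            continue  # image without annotations stays fully padded
        targets[i, :n, :4] = torch.as_tensor(boxes, dtype=torch.float32)
        targets[i, :n, 4] = torch.as_tensor(classes, dtype=torch.float32)
    return targets

# Two images: the first has two boxes, the second has none
example_targets = pad_targets(
    [[[10, 20, 110, 220], [0, 0, 50, 50]], []],
    [[3, 7], []],
)
```

The result can then be passed as `targets` in the training call above.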

Export

ONNX

Export it...

import torch

net = Yolov3(80, spp=True).eval()

x = torch.randn(4, 3, 640, 640)
_ = net(x) # compile all the einops kernels. Required before ONNX export
torch.onnx.export(net, (x,), '/tmp/model.onnx',
                  input_names=['img'], output_names=['preds'],
                  dynamic_axes={'img'   : {0: 'B', 2: 'H', 3: 'W'},
                                'preds' : {0: 'B', 1: 'N'}})

Run it...

Install dependencies:

pip install numpy onnxruntime

import onnxruntime as ort
import numpy as np

net    = ort.InferenceSession('/tmp/model.onnx', providers=['CPUExecutionProvider'])
x      = np.random.randn(1, 3, 576, 768).astype(np.float32) # randn takes separate ints; the exported model expects float32
preds, = net.run(None, {'img': x})

Compile it...

Download onnx-mlir and use its onnx-mlir.py script.

python3 onnx-mlir.py --EmitObj -O3 /tmp/model.onnx -o model

TFLite

Convert it...

Install dependencies:

pip install onnx2tf tensorflow tf_keras onnx_graphsurgeon sng4onnx onnxsim

onnx2tf -i /tmp/model.onnx -ois "img:1,3,640,640" -o /tmp/model

Notes

I advise using lightning or accelerate to write training scripts. They take care of everything including distributed training, FP16, checkpointing, etc.
The sample assignment algorithms are written in PyTorch C++ for simplicity. When I first wrote them in Python, 90% of the complexity was in crazy tensor indexing and masking, which was distracting and annoying. In C++ you can use for-loops and if-statements without loss of performance, so the algorithms are much more readable now. The only drawback is that you have to move the tensors to the CPU, run the algorithm, then move the returned tensors (target boxes, scores and classes) back onto the target device, usually the GPU.
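
The CPU round trip mentioned in the last note can be sketched roughly as follows. This is only an illustration: `assign_on_cpu` and `dummy_assigner` are hypothetical names, not functions from this repo, and the dummy merely stands in for a compiled C++ assigner. Wrapping the call in `torch.no_grad()` assumes that no gradients need to flow through the assignment step.

```python
import torch

def assign_on_cpu(assigner, preds, targets):
    """Run a CPU-only assigner, then move its outputs back to the device of `preds`."""
    device = preds.device
    with torch.no_grad():  # assumption: assignment is not differentiated
        outputs = assigner(preds.detach().cpu(), targets.detach().cpu())
    return tuple(o.to(device) for o in outputs)

def dummy_assigner(preds, targets):
    """Placeholder for a C++ assigner: returns empty target boxes, scores and classes."""
    B, N = preds.shape[:2]
    boxes   = torch.zeros(B, N, 4)
    scores  = torch.zeros(B, N)
    classes = torch.full((B, N), -1, dtype=torch.long)
    return boxes, scores, classes

device  = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
preds   = torch.randn(2, 100, 85, device=device)
targets = torch.zeros(2, 10, 5, device=device)
tboxes, tscores, tclasses = assign_on_cpu(dummy_assigner, preds, targets)
```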

Observations

Pretty much all official models use eps=0.001 and momentum=0.03 in nn.BatchNorm2d. Those aren't the PyTorch defaults. Where do those numbers come from?
From what I can tell the main innovation in yolov6 is the distillation loss in bounding box regression: there are two branches for bounding box, one with DFL and one without. AFAIK, only the DFL one gets used in forward pass. During training, both get CIOU loss-ed.
onnx-mlir is very slow and runs on 1 thread only, so onnxruntime is better for inference.
Yolov12's area attention is a bit misleading. It sounds like tiled attention, but no, more like rows.
Yolov12 doesn't have as much attention as you think. Only 1 extra layer of attention compared to Yolov11. The head has no attention. Look at the args. So what's the big deal?
In my opinion there is little innovation in the new Yolo models. It's just tweaks.
Research should go into training recipes rather than chasing mAP scores. Come up with a way to train a model on COCO in under 5 epochs.
Very little of the model architecture changes between Yolo11 and Yolo26. SPPF has a shortcut and there is a sprinkle more attention.
My understanding of end2end training (YoloV10 and Yolo26) is that you have two sets of normal V8 regression losses, each with its own TAL assigner: one for one2many with topk set to 10 and another for one2one with topk set to 1. You minimise both losses with weights of 0.8 for one2many and 0.2 for one2one. These weights are then decayed and inversely decayed respectively. The idea is that the one2many branch gives lots of positive signal to the gradients, which helps with training and helps refine the one2one branch. I think that's it.
A good implementation of NMS in say C++ can be very fast. Not convinced removing NMS gives that much of a boost in performance as the ultralytics authors claim. Not convinced end2end training is necessary.
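
For reference, the greedy NMS step discussed in the last observation fits in a few lines. The sketch below uses NumPy rather than C++ to stay in the same language as the rest of this README. It is a generic textbook version, not code from this repo, and the 0.5 IoU threshold is just an illustrative default.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS. boxes: [N, 4] as x1,y1,x2,y2; scores: [N]. Returns kept indices."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(-scores)  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the best box with every remaining box
        iw = np.clip(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0, None)
        inter = iw * ih
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        # Drop the boxes that overlap the best box too much
        order = rest[iou <= iou_threshold]
    return keep

boxes  = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
kept   = nms(boxes, scores)  # the second box overlaps the first (IoU ≈ 0.68) and is suppressed
```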

TODO

Train everything (probably going to need some cloud compute (help))
Train with mixed precision

License

This project is licensed under the MIT License. See the LICENSE file for details.

Pretrained weights downloaded by helper scripts are subject to their own licenses. See THIRD_PARTY_NOTICES.md for details.
