48 changes: 33 additions & 15 deletions README.md
@@ -1,4 +1,4 @@
# ByteTrack
# YOLOv5 Implementation of ByteTrack

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/bytetrack-multi-object-tracking-by-1/multi-object-tracking-on-mot17)](https://paperswithcode.com/sota/multi-object-tracking-on-mot17?p=bytetrack-multi-object-tracking-by-1)

@@ -40,10 +40,10 @@ Multi-object tracking (MOT) aims at estimating bounding boxes and identities of
### 1. Installing on the host machine
Step1. Install ByteTrack.
```shell
git clone https://github.com/ifzhang/ByteTrack.git
cd ByteTrack
git clone https://github.com/parthmalpathak/ByteTrack_YOLOv5_v6.0.git
cd ByteTrack_YOLOv5_v6.0
pip3 install -r requirements.txt
python3 setup.py develop
pip3 install -e .
```
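An optional quick check that the editable install is importable (a sketch, assuming the package installs under the `yolox` name):

```shell
python3 -c "import yolox; print(yolox.__file__)"
```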

Step2. Install [pycocotools](https://github.com/cocodataset/cocoapi).
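One common way to install it (a sketch, assuming pip can reach the cocoapi repository):

```shell
pip3 install cython
pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
```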
@@ -105,7 +105,7 @@ datasets
Then, you need to convert the datasets to COCO format and mix the different training data:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/convert_mot17_to_coco.py
python3 tools/convert_mot20_to_coco.py
python3 tools/convert_crowdhuman_to_coco.py
@@ -116,7 +116,7 @@ python3 tools/convert_ethz_to_coco.py
Before mixing the different datasets, you need to follow the operations in [mix_xxx.py](https://github.com/ifzhang/ByteTrack/blob/c116dfc746f9ebe07d419caa8acba9b3acfa79a6/tools/mix_data_ablation.py#L6) to create the data folder and symlinks. Finally, you can mix the training data:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/mix_data_ablation.py
python3 tools/mix_data_test_mot17.py
python3 tools/mix_data_test_mot20.py
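# The mix scripts above expect a data folder with symlinks to the individual datasets,
# created as described in the comments at the top of tools/mix_data_ablation.py.
# Illustrative only -- the exact folder and link names are listed in those comments:
#   cd datasets && mkdir -p mix_mot_ch/annotations
#   ln -s ../mot/train mix_mot_ch/mot_train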
@@ -165,21 +165,23 @@ Train on CrowdHuman and MOT20, evaluate on MOT20 train.
|bytetrack_x_mot20 [[google]](https://drive.google.com/file/d/1HX2_JpMOjOIj1Z9rJjoet9XNy_cCAs5U/view?usp=sharing), [[baidu(code:3apd)]](https://pan.baidu.com/s/1bowJJj0bAnbhEQ3_6_Am0A) | 93.4 | 89.3 | 1057 | 17.5 |


NOTE: ```Training``` and ```Tracking``` are not required for ```Demo```. Users can jump directly to the ```Demo``` section to test results with the YOLOX or YOLOv5 detectors.

## Training

The COCO pretrained YOLOX model can be downloaded from their [model zoo](https://github.com/Megvii-BaseDetection/YOLOX/tree/0.1.0). After downloading the pretrained models, you can put them under <ByteTrack_HOME>/pretrained.
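For example, after downloading `yolox_x.pth`, a minimal way to put it in place (the source path is a placeholder):

```shell
cd ByteTrack_YOLOv5_v6.0
mkdir -p pretrained
mv /path/to/yolox_x.pth pretrained/
```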

* **Train ablation model (MOT17 half train and CrowdHuman)**

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/train.py -f exps/example/mot/yolox_x_ablation.py -d 8 -b 48 --fp16 -o -c pretrained/yolox_x.pth
```

* **Train MOT17 test model (MOT17 train, CrowdHuman, Cityperson and ETHZ)**

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/train.py -f exps/example/mot/yolox_x_mix_det.py -d 8 -b 48 --fp16 -o -c pretrained/yolox_x.pth
```

@@ -190,7 +192,7 @@ For MOT20, you need to clip the bounding boxes inside the image.
Add clip operation in [line 134-135 in data_augment.py](https://github.com/ifzhang/ByteTrack/blob/72cd6dd24083c337a9177e484b12bb2b5b3069a6/yolox/data/data_augment.py#L134), [line 122-125 in mosaicdetection.py](https://github.com/ifzhang/ByteTrack/blob/72cd6dd24083c337a9177e484b12bb2b5b3069a6/yolox/data/datasets/mosaicdetection.py#L122), [line 217-225 in mosaicdetection.py](https://github.com/ifzhang/ByteTrack/blob/72cd6dd24083c337a9177e484b12bb2b5b3069a6/yolox/data/datasets/mosaicdetection.py#L217), [line 115-118 in boxes.py](https://github.com/ifzhang/ByteTrack/blob/72cd6dd24083c337a9177e484b12bb2b5b3069a6/yolox/utils/boxes.py#L115).
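A minimal sketch of the clip operation described above (the helper name is illustrative; the linked lines show exactly where the clipping goes):

```python
import numpy as np

def clip_boxes(boxes, img_w, img_h):
    # boxes: (N, 4) array of [x1, y1, x2, y2] in pixels; constrain them to the image canvas
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, img_w)  # x1, x2
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, img_h)  # y1, y2
    return boxes
```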

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/train.py -f exps/example/mot/yolox_x_mix_mot20_ch.py -d 8 -b 48 --fp16 -o -c pretrained/yolox_x.pth
```

@@ -199,7 +201,7 @@ python3 tools/train.py -f exps/example/mot/yolox_x_mix_mot20_ch.py -d 8 -b 48 --
First, you need to prepare your dataset in COCO format. You can refer to [MOT-to-COCO](https://github.com/ifzhang/ByteTrack/blob/main/tools/convert_mot17_to_coco.py) or [CrowdHuman-to-COCO](https://github.com/ifzhang/ByteTrack/blob/main/tools/convert_crowdhuman_to_coco.py). Then, you need to create an Exp file for your dataset; you can refer to the [CrowdHuman](https://github.com/ifzhang/ByteTrack/blob/main/exps/example/mot/yolox_x_ch.py) training Exp file. Don't forget to modify get_data_loader() and get_eval_loader() in your Exp file. Finally, you can train ByteTrack on your dataset by running:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/train.py -f exps/example/mot/your_exp_file.py -d 8 -b 48 --fp16 -o -c pretrained/yolox_x.pth
```
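A minimal sketch of such an Exp file (the file name, annotation file names, input size, and data directory are placeholders; the repository's existing exps, e.g. yolox_x_ch.py, show the full get_data_loader()/get_eval_loader() overrides):

```python
# exps/example/mot/your_exp_file.py -- illustrative only
import os
from yolox.exp import Exp as MyExp

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
        self.num_classes = 1                  # single "person" class for MOT-style data
        self.train_ann = "your_train.json"    # COCO-format annotations (placeholder names)
        self.val_ann = "your_val.json"
        self.input_size = (800, 1440)
        self.test_size = (800, 1440)
        self.max_epoch = 80

    # Override get_data_loader() and get_eval_loader() so MOTDataset points at your
    # data_dir, following exps/example/mot/yolox_x_ch.py.
```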

@@ -211,7 +213,7 @@ python3 tools/train.py -f exps/example/mot/your_exp_file.py -d 8 -b 48 --fp16 -o
Run ByteTrack:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/track.py -f exps/example/mot/yolox_x_ablation.py -c pretrained/bytetrack_ablation.pth.tar -b 1 -d 1 --fp16 --fuse
```
You can get 76.6 MOTA using our pretrained model.
@@ -228,7 +230,7 @@ python3 tools/track_motdt.py -f exps/example/mot/yolox_x_ablation.py -c pretrain
Run ByteTrack:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/track.py -f exps/example/mot/yolox_x_mix_det.py -c pretrained/bytetrack_x_mot17.pth.tar -b 1 -d 1 --fp16 --fuse
python3 tools/interpolation.py
```
@@ -241,7 +243,7 @@ We use the input size 1600 x 896 for MOT20-04, MOT20-07 and 1920 x 736 for MOT20
Run ByteTrack:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/track.py -f exps/example/mot/yolox_x_mix_mot20_ch.py -c pretrained/bytetrack_x_mot20.pth.tar -b 1 -d 1 --fp16 --fuse --match_thresh 0.7 --mot20
python3 tools/interpolation.py
```
@@ -269,10 +271,22 @@ You can get the tracking results in each frame from 'online_targets'. You can re

<img src="assets/palace_demo.gif" width="600"/>
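For reference, a minimal sketch of consuming `online_targets` per frame (class and attribute names follow the YOLOX-based tracker in this repository; the exact `update()` signature is an assumption, so check the demo script):

```python
from yolox.tracker.byte_tracker import BYTETracker

def print_online_targets(args, dets, img_info, img_size):
    # args carries track_thresh, match_thresh, track_buffer, mot20, ...
    tracker = BYTETracker(args, frame_rate=30)
    # dets: detector output for one frame; img_info / img_size as prepared in the demo script
    online_targets = tracker.update(dets, img_info, img_size)
    for t in online_targets:
        x, y, w, h = t.tlwh                  # top-left x, top-left y, width, height
        print(f"id={t.track_id} score={t.score:.2f} box=({x:.0f}, {y:.0f}, {w:.0f}, {h:.0f})")
```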

In order to test ByteTrack with YOLOX:

```shell
cd <ByteTrack_HOME>
cd ByteTrack_YOLOv5_v6.0
python3 tools/demo_track.py video -f exps/example/mot/yolox_x_mix_det.py -c pretrained/bytetrack_x_mot17.pth.tar --fp16 --fuse --save_result
```
In order to test ByteTrack with YOLOv5 version 6.0:

```shell
cd ByteTrack_YOLOv5_v6.0
python3 tools_yolov5/demo_track_yolov5_v6.py webcam -f exps/example/mot/yolov5_mix_det.py --save_result --ckpt pretrained/yolov5n6.pt
```
Users can pass ```video``` or ```image``` instead of ```webcam``` and set the corresponding input path inside ```demo_track_yolov5_v6.py```.
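For example (a sketch; the input video path itself is set inside `demo_track_yolov5_v6.py` rather than passed as a flag):

```shell
cd ByteTrack_YOLOv5_v6.0
python3 tools_yolov5/demo_track_yolov5_v6.py video -f exps/example/mot/yolov5_mix_det.py --save_result --ckpt pretrained/yolov5n6.pt
```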

YOLOv5 version 6 pretrained models can be downloaded from the [YOLOv5 Model Zoo](https://github.com/ultralytics/yolov5/releases), under the ```Assets``` section of the ```v6.1``` and ```v6.0``` releases.
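For example, fetching one of the v6.0 checkpoints from the command line (the release-asset URL pattern is an assumption; verify the file name on the releases page):

```shell
mkdir -p pretrained
wget -P pretrained https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5n6.pt
```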


## Deploy

@@ -295,4 +309,8 @@ python3 tools/demo_track.py video -f exps/example/mot/yolox_x_mix_det.py -c pret

## Acknowledgement

A large part of the code is borrowed from [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX), [FairMOT](https://github.com/ifzhang/FairMOT), [TransTrack](https://github.com/PeizeSun/TransTrack) and [JDE-Cpp](https://github.com/samylee/Towards-Realtime-MOT-Cpp). Many thanks for their wonderful works.
1. A large part of the code is borrowed from [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX), [FairMOT](https://github.com/ifzhang/FairMOT), [TransTrack](https://github.com/PeizeSun/TransTrack) and [JDE-Cpp](https://github.com/samylee/Towards-Realtime-MOT-Cpp). Many thanks for their wonderful work.
2. Many thanks to [@ifzhang](https://github.com/ifzhang/ByteTrack) and [@gmt710](https://github.com/gmt710/yolov5ByteTrack) for their contributions to ByteTrack.

## Copyright
The YOLOv5 v6.0 and v6.1 functionality added to ByteTrack was authored by [Parth Malpathak](https://github.com/parthmalpathak). Please include proper citations when using this repository.
138 changes: 138 additions & 0 deletions exps/example/mot/yolov5_mix_det.py
@@ -0,0 +1,138 @@
# encoding: utf-8
import os
import random
import torch
import torch.nn as nn
import torch.distributed as dist

from yolox.exp import Exp as MyExp
from yolox.data import get_yolox_datadir

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.num_classes = 1
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
        self.train_ann = "train.json"
        self.val_ann = "test.json"  # change to "train.json" for training
        self.input_size = (640, 640)  # input size changed compared to the original code
        self.test_size = (640, 640)
        self.random_size = (12, 26)
        self.max_epoch = 80
        self.print_interval = 20
        self.eval_interval = 5
        self.test_conf = 0.001
        self.nmsthre = 0.7
        self.no_aug_epochs = 10
        self.basic_lr_per_img = 0.001 / 64.0
        self.warmup_epochs = 1

    def get_data_loader(self, batch_size, is_distributed, no_aug=False):
        from yolox.data import (
            MOTDataset,
            TrainTransform,
            YoloBatchSampler,
            DataLoader,
            InfiniteSampler,
            MosaicDetection,
        )

        dataset = MOTDataset(
            data_dir=os.path.join(get_yolox_datadir(), "mix_det"),
            json_file=self.train_ann,
            name='',
            img_size=self.input_size,
            preproc=TrainTransform(
                rgb_means=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
                max_labels=500,
            ),
        )

        dataset = MosaicDetection(
            dataset,
            mosaic=not no_aug,
            img_size=self.input_size,
            preproc=TrainTransform(
                rgb_means=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
                max_labels=1000,
            ),
            degrees=self.degrees,
            translate=self.translate,
            scale=self.scale,
            shear=self.shear,
            perspective=self.perspective,
            enable_mixup=self.enable_mixup,
        )

        self.dataset = dataset

        if is_distributed:
            batch_size = batch_size // dist.get_world_size()

        sampler = InfiniteSampler(
            len(self.dataset), seed=self.seed if self.seed else 0
        )

        batch_sampler = YoloBatchSampler(
            sampler=sampler,
            batch_size=batch_size,
            drop_last=False,
            input_dimension=self.input_size,
            mosaic=not no_aug,
        )

        dataloader_kwargs = {"num_workers": self.data_num_workers, "pin_memory": True}
        dataloader_kwargs["batch_sampler"] = batch_sampler
        train_loader = DataLoader(self.dataset, **dataloader_kwargs)

        return train_loader

    def get_eval_loader(self, batch_size, is_distributed, testdev=False):
        from yolox.data import MOTDataset, ValTransform

        valdataset = MOTDataset(
            data_dir=os.path.join(get_yolox_datadir(), "mot"),
            json_file=self.val_ann,
            img_size=self.test_size,
            name='train',
            preproc=ValTransform(
                rgb_means=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
            ),
        )

        if is_distributed:
            batch_size = batch_size // dist.get_world_size()
            sampler = torch.utils.data.distributed.DistributedSampler(
                valdataset, shuffle=False
            )
        else:
            sampler = torch.utils.data.SequentialSampler(valdataset)

        dataloader_kwargs = {
            "num_workers": self.data_num_workers,
            "pin_memory": True,
            "sampler": sampler,
        }
        dataloader_kwargs["batch_size"] = batch_size
        val_loader = torch.utils.data.DataLoader(valdataset, **dataloader_kwargs)

        return val_loader

    def get_evaluator(self, batch_size, is_distributed, testdev=False):
        from yolox.evaluators import COCOEvaluator

        val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev)
        evaluator = COCOEvaluator(
            dataloader=val_loader,
            img_size=self.test_size,
            confthre=self.test_conf,
            nmsthre=self.nmsthre,
            num_classes=self.num_classes,
            testdev=testdev,
        )
        return evaluator