# SSD: Single Shot MultiBox Detector

[Build Status](https://travis-ci.org/weiliu89/caffe)
[License](LICENSE)

Caffe is a deep learning framework made with expression, speed, and modularity in mind.
It is developed by Berkeley AI Research ([BAIR](http://bair.berkeley.edu))/The Berkeley Vision and Learning Center (BVLC) and community contributors.

SSD support by [Wei Liu](http://www.cs.unc.edu/~wliu/), [Dragomir Anguelov](https://www.linkedin.com/in/dragomiranguelov), [Dumitru Erhan](http://research.google.com/pubs/DumitruErhan.html), [Christian Szegedy](http://research.google.com/pubs/ChristianSzegedy.html), [Scott Reed](http://www-personal.umich.edu/~reedscot/), [Cheng-Yang Fu](http://www.cs.unc.edu/~cyfu/), [Alexander C. Berg](http://acberg.com).

### Introduction

- [DIY Deep Learning for Vision with Caffe](https://docs.google.com/presentation/d/1UeKXVgRvvxg9OUdh_UiC5G71UMscNPlvArsWER41PsU/edit#slide=id.p)
- [Tutorial Documentation](http://caffe.berkeleyvision.org/tutorial/)
- [BAIR reference models](http://caffe.berkeleyvision.org/model_zoo.html) and the [community model zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo)
- [Installation instructions](http://caffe.berkeleyvision.org/installation.html)

SSD is a unified framework for object detection with a single network. You can use the code to train and evaluate a network for the object detection task. For more details, please refer to our [arXiv paper](http://arxiv.org/abs/1512.02325) and our [slides](http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf).

<p align="center">
<img src="http://www.cs.unc.edu/~wliu/papers/ssd.png" alt="SSD Framework" width="600px">
</p>

## Custom distributions

- [Intel Caffe](https://github.com/BVLC/caffe/tree/intel) (optimized for CPU, with multi-node support), in particular for Xeon processors (HSW, BDW, SKX, Xeon Phi).
- [OpenCL Caffe](https://github.com/BVLC/caffe/tree/opencl) e.g. for AMD or Intel devices.
- [Windows Caffe](https://github.com/BVLC/caffe/tree/windows)

## Community

[Join the chat at https://gitter.im/BVLC/caffe](https://gitter.im/BVLC/caffe?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

SSD detection results compared with Faster R-CNN and YOLO on the PASCAL VOC2007 test set:

| System | VOC2007 test *mAP* | **FPS** (Titan X) | Number of Boxes | Input resolution |
|:-------|:-----:|:-------:|:-------:|:-------:|
| [Faster R-CNN (VGG16)](https://github.com/ShaoqingRen/faster_rcnn) | 73.2 | 7 | ~6000 | ~1000 x 600 |
| [YOLO (customized)](http://pjreddie.com/darknet/yolo/) | 63.4 | 45 | 98 | 448 x 448 |
| SSD300* (VGG16) | 77.2 | 46 | 8732 | 300 x 300 |
| SSD512* (VGG16) | **79.8** | 19 | 24564 | 512 x 512 |

<p align="left">
<img src="http://www.cs.unc.edu/~wliu/papers/ssd_results.png" alt="SSD results on multiple datasets" width="800px">
</p>

_Note: SSD300* and SSD512* are the latest models. Current code should reproduce these results._

Caffe is released under the [BSD 2-Clause license](https://github.com/BVLC/caffe/blob/master/LICENSE).
The BAIR/BVLC reference models are released for unrestricted use.

### Citing SSD

Please cite SSD in your publications if it helps your research:

    @inproceedings{liu2016ssd,
      title = {{SSD}: Single Shot MultiBox Detector},
      author = {Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
      booktitle = {ECCV},
      year = {2016}
    }

### Contents
1. [Installation](#installation)
2. [Preparation](#preparation)
3. [Train/Eval](#traineval)
4. [Models](#models)

### Installation
1. Get the code. We will call the directory that you cloned Caffe into `$CAFFE_ROOT`.
   ```Shell
   git clone https://github.com/weiliu89/caffe.git
   cd caffe
   git checkout ssd
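   # (Optional) Set CAFFE_ROOT for the commands used later in this README;
   # run this from the cloned directory.
   export CAFFE_ROOT=$(pwd)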
   ```

2. Build the code. Please follow the [Caffe instructions](http://caffe.berkeleyvision.org/installation.html) to install all necessary packages and build it.
   ```Shell
   # Modify Makefile.config according to your Caffe installation.
   cp Makefile.config.example Makefile.config
   make -j8
   # Make sure to include $CAFFE_ROOT/python in your PYTHONPATH.
   make py
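   # For example (assumes $CAFFE_ROOT is set to this checkout):
   export PYTHONPATH=$CAFFE_ROOT/python:$PYTHONPATH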
   make test -j8
   # (Optional)
   make runtest -j8
   ```

### Preparation
1. Download the [fully convolutional reduced (atrous) VGGNet](https://gist.github.com/weiliu89/2ed6e13bfd5b57cf81d6). By default, we assume the model is stored in `$CAFFE_ROOT/models/VGGNet/`.

2. Download the VOC2007 and VOC2012 datasets. By default, we assume the data is stored in `$HOME/data/`.
   ```Shell
   # Download the data.
   cd $HOME/data
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
   # Extract the data.
   tar -xvf VOCtrainval_11-May-2012.tar
   tar -xvf VOCtrainval_06-Nov-2007.tar
   tar -xvf VOCtest_06-Nov-2007.tar
   ```

3. Create the LMDB files (a quick sanity check for the result is sketched after this list).
   ```Shell
   cd $CAFFE_ROOT
   # Create trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/.
   ./data/VOC0712/create_list.sh
   # You can modify the parameters in create_data.sh if needed.
   # It will create lmdb files for trainval and test with the encoded original images:
   #   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
   #   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
   # and make soft links at examples/VOC0712/.
   ./data/VOC0712/create_data.sh
   ```

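If you want to sanity-check the generated LMDBs, a minimal sketch along the following lines can be used. It assumes the default output path written by `create_data.sh` above and that the `lmdb` Python package is installed; adjust the path if you changed it.

```python
import os
import lmdb

# Default location written by create_data.sh (an assumption; adjust to your setup).
lmdb_path = os.path.expanduser('~/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb')

# Open read-only and report how many records were stored.
env = lmdb.open(lmdb_path, readonly=True, lock=False)
with env.begin() as txn:
    print('{} contains {} entries'.format(lmdb_path, txn.stat()['entries']))
```
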
### Train/Eval
1. Train your model and evaluate it on the fly.
   ```Shell
   # It will create model definition files and save snapshot models in:
   #   - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
   # and the job file, log file, and Python script in:
   #   - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
   # and save temporary evaluation results in:
   #   - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
   # It should reach 77.* mAP at 120k iterations.
   python examples/ssd/ssd_pascal.py
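   # (Optional) From another terminal, you can follow training progress via the
   # log file written under the jobs directory listed above, e.g.:
   #   tail -f $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/*.log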
   ```
   If you don't have time to train your model, you can download a pre-trained model [here](https://drive.google.com/open?id=0BzKzrI_SkD1_WVVTSmQxU0dVRzA).

2. Evaluate the most recent snapshot.
   ```Shell
   # If you would like to test a model you trained, you can do:
   python examples/ssd/score_ssd_pascal.py
   ```

3. Test your model using a webcam. Note: press <kbd>esc</kbd> to stop.
   ```Shell
   # If you would like to attach a webcam to a model you trained, you can do:
   python examples/ssd/ssd_pascal_webcam.py
   ```
   [Here](https://drive.google.com/file/d/0BzKzrI_SkD1_R09NcjM1eElLcWc/view) is a demo video of running an SSD500 model trained on the [MSCOCO](http://mscoco.org) dataset.

4. Check out [`examples/ssd_detect.ipynb`](https://github.com/weiliu89/caffe/blob/ssd/examples/ssd_detect.ipynb) or [`examples/ssd/ssd_detect.cpp`](https://github.com/weiliu89/caffe/blob/ssd/examples/ssd/ssd_detect.cpp) for how to detect objects using an SSD model. Check out [`examples/ssd/plot_detections.py`](https://github.com/weiliu89/caffe/blob/ssd/examples/ssd/plot_detections.py) for how to plot detection results output by ssd_detect.cpp. A minimal pycaffe detection sketch is also included after this list.

5. To train on other datasets, please refer to `data/OTHERDATASET` for more details. We currently provide support for COCO and ILSVRC2016. We recommend using [`examples/ssd_detect.ipynb`](https://github.com/weiliu89/caffe/blob/ssd/examples/ssd_detect.ipynb) to check whether the new dataset is prepared correctly.

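For reference, below is a minimal pycaffe detection sketch in the spirit of `examples/ssd_detect.ipynb`. The model paths are examples only (they assume a downloaded SSD300* VOC0712 model); point them at wherever your `deploy.prototxt` and `.caffemodel` actually live.

```python
import numpy as np
import caffe

caffe.set_mode_gpu()  # or caffe.set_mode_cpu()

# Example paths (assumptions) -- adjust to your own model files.
model_def = 'models/VGGNet/VOC0712/SSD_300x300/deploy.prototxt'
model_weights = 'models/VGGNet/VOC0712/SSD_300x300/VGG_VOC0712_SSD_300x300_iter_120000.caffemodel'
net = caffe.Net(model_def, model_weights, caffe.TEST)

# Standard SSD preprocessing: HWC -> CHW, subtract the VGG mean, scale to [0, 255], RGB -> BGR.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([104, 117, 123]))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))

image = caffe.io.load_image('examples/images/fish-bike.jpg')  # any test image
net.blobs['data'].data[...] = transformer.preprocess('data', image)
detections = net.forward()['detection_out']

# detection_out has shape 1 x 1 x num_detections x 7:
# [image_id, label, confidence, xmin, ymin, xmax, ymax], with coordinates normalized to [0, 1].
for det in detections[0, 0]:
    if det[2] >= 0.5:  # keep confident detections only
        print('label {:d}, score {:.2f}, box ({:.2f}, {:.2f}, {:.2f}, {:.2f})'.format(
            int(det[1]), det[2], det[3], det[4], det[5], det[6]))
```
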
### Models
We have provided the latest models trained on different datasets. To help reproduce the results in [Table 6](https://arxiv.org/pdf/1512.02325v4.pdf), most models contain a pretrained `.caffemodel` file, many `.prototxt` files, and Python scripts.

1. PASCAL VOC models:
   * 07+12: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_WVVTSmQxU0dVRzA), [SSD512*](https://drive.google.com/open?id=0BzKzrI_SkD1_ZDIxVHBEcUNBb2s)
   * 07++12: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_WnR2T1BGVWlCZHM), [SSD512*](https://drive.google.com/open?id=0BzKzrI_SkD1_MjFjNTlnempHNWs)
   * COCO<sup>[1]</sup>: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_NDlVeFJDc2tIU1k), [SSD512*](https://drive.google.com/open?id=0BzKzrI_SkD1_TW4wTC14aDdCTDQ)
   * 07+12+COCO: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_UFpoU01yLS1SaG8), [SSD512*](https://drive.google.com/open?id=0BzKzrI_SkD1_X3ZXQUUtM0xNeEk)
   * 07++12+COCO: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_TkFPTEQ1Z091SUE), [SSD512*](https://drive.google.com/open?id=0BzKzrI_SkD1_NVVNdWdYNEh1WTA)

2. COCO models:
   * trainval35k: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_dUY1Ml9GRTFpUWc), [SSD512*](https://drive.google.com/open?id=0BzKzrI_SkD1_dlJpZHJzOXd3MTg)

3. ILSVRC models:
   * trainval1: [SSD300*](https://drive.google.com/open?id=0BzKzrI_SkD1_a2NKQ2d1d043VXM), [SSD500](https://drive.google.com/open?id=0BzKzrI_SkD1_X2ZCLVgwLTgzaTQ)

<sup>[1]</sup>We use [`examples/convert_model.ipynb`](https://github.com/weiliu89/caffe/blob/ssd/examples/convert_model.ipynb) to extract a VOC model from a pretrained COCO model.