you can use darknet2pytorch to convert it yourself, or download my converted model.

# 2. Inference (Evolving)

## 2.1 416 * 416 Performance on the MS COCO dataset (using pretrained Darknet weights from <https://github.com/AlexeyAB/darknet>)

**ONNX and TensorRT models are converted from Pytorch (TianXiaomo): Pytorch->ONNX->TensorRT.**
See the following sections for details of these conversions.

- val2017 dataset

| Model type                       |       AP |     AP50 |     AP75 |      APS |      APM |      APL |
| -------------------------------- | -------: | -------: | -------: | -------: | -------: | -------: |
| Pytorch (TianXiaomo)             |    0.466 |    0.704 |    0.505 |    0.267 |    0.524 |    0.629 |
| ONNX                             | incoming | incoming | incoming | incoming | incoming | incoming |
| TensorRT FP32 + BatchedNMSPlugin |    0.472 |    0.708 |    0.511 |    0.273 |    0.530 |    0.637 |
| TensorRT FP16 + BatchedNMSPlugin |    0.472 |    0.708 |    0.511 |    0.273 |    0.530 |    0.636 |

- testdev2017 dataset

| Model type                       |       AP |     AP50 |     AP75 |      APS |      APM |      APL |
| -------------------------------- | -------: | -------: | -------: | -------: | -------: | -------: |
| DarkNet (YOLOv4 paper)           |    0.412 |    0.628 |    0.443 |    0.204 |    0.444 |    0.560 |
| Pytorch (TianXiaomo)             |    0.404 |    0.615 |    0.436 |    0.196 |    0.438 |    0.552 |
| ONNX                             | incoming | incoming | incoming | incoming | incoming | incoming |
| TensorRT FP32 + BatchedNMSPlugin |    0.412 |    0.625 |    0.445 |    0.200 |    0.446 |    0.564 |

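The numbers above follow the standard COCO metrics. As a reference for reproducing them, here is a minimal evaluation sketch with `pycocotools`, assuming your detections have been exported to COCO-format JSON (the file names below are placeholders):

```py
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: COCO ground-truth annotations and exported detections.
coco_gt = COCO("annotations/instances_val2017.json")
coco_dt = coco_gt.loadRes("yolov4_detections.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP, AP50, AP75, APS, APM, APL (plus AR metrics)
```
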
## 2.2 Image input size for inference

Image input size is NOT restricted to `320 * 320`, `416 * 416`, `512 * 512` or `608 * 608`.
You can adjust the input size for a different ratio, for example: `320 * 608`.
A larger input size can help detect smaller targets, but it is slower and uses more GPU memory.

```py
# Valid input sizes: each dimension is 320 plus a multiple of 96.
height = 320 + 96 * n  # n in {0, 1, 2, 3, ...}
width = 320 + 96 * m   # m in {0, 1, 2, 3, ...}
```

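For instance, a small helper (hypothetical, not part of this repo) can snap an arbitrary dimension to the nearest valid value:

```py
def nearest_valid_size(value: int) -> int:
    """Snap a dimension to the nearest valid size of the form 320 + 96 * k."""
    k = max(0, round((value - 320) / 96))
    return 320 + 96 * k

print(nearest_valid_size(350))  # 320
print(nearest_valid_size(400))  # 416
```
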
## 2.3 Different inference options

- Load the pretrained darknet model and darknet weights to do the inference (the image size is already configured in the cfg file)

    ```sh
    python demo.py -cfgfile <cfgFile> -weightfile <weightFile> -imgfile <imgFile>
    ```

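    For example, with the cfg shipped in this repo and the downloaded darknet weights (illustrative paths): `python demo.py -cfgfile cfg/yolov4.cfg -weightfile yolov4.weights -imgfile data/dog.jpg`
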
- Load PyTorch weights (.pth file) to do the inference

    ```sh
    python models.py <num_classes> <weightfile> <imgfile> <IN_IMAGE_H> <IN_IMAGE_W> <namefile(optional)>
    ```

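    For example, for an 80-class COCO model at `416 * 416` (illustrative paths): `python models.py 80 yolov4.pth data/dog.jpg 416 416 data/coco.names`
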
- Load a converted ONNX file to do inference (see sections 3 and 4; a minimal sketch follows this list)

- Load a converted TensorRT engine file to do inference (see section 5)

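As a quick illustration of the ONNX option above, here is a minimal `onnxruntime` sketch. The file name `yolov4.onnx`, the `416 * 416` input size, and the preprocessing are assumptions for illustration; see sections 3 and 4 for the actual conversion and demo scripts.

```py
import cv2
import numpy as np
import onnxruntime as ort

# "yolov4.onnx" is an assumed output name from the conversion in section 3.
session = ort.InferenceSession("yolov4.onnx")
input_name = session.get_inputs()[0].name

# Assumed preprocessing: resize, BGR->RGB, normalize to [0, 1], NCHW layout.
img = cv2.imread("data/dog.jpg")
img = cv2.cvtColor(cv2.resize(img, (416, 416)), cv2.COLOR_BGR2RGB)
img = np.expand_dims(img.transpose(2, 0, 1).astype(np.float32) / 255.0, axis=0)

# The model has two outputs, matching the format described in section 2.4 below.
boxes, confs = session.run(None, {input_name: img})
print(boxes.shape, confs.shape)
```
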
## 2.4 Inference output

There are 2 inference outputs.
- One is the locations of the bounding boxes; its shape is `[batch, num_boxes, 1, 4]`, which represents x1, y1, x2, y2 of each bounding box.