AI-TOD-v2/index.html at main · Chasel-Tsui/AI-TOD-v2 · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
<!doctype html>
<html>
<head>

	<title>AI-TOD-v2</title>
	<meta name="viewport" content="user-scalable=no, initial-scale=1, maximum-scale=1, minimum-scale=1">
	<link href="css/main.css" media="screen" rel="stylesheet" type="text/css"/>
	<link href="css/index.css" media="screen" rel="stylesheet" type="text/css"/>
	<link href='https://fonts.googleapis.com/css?family=Open+Sans:400,700' rel='stylesheet' type='text/css'>
	<link href='https://fonts.googleapis.com/css?family=Open+Sans+Condensed:400,700' rel='stylesheet' type='text/css'>
	<link href='https://fonts.googleapis.com/css?family=Raleway:400,600,700' rel='stylesheet' type='text/css'>
	<script type="text/x-mathjax-config">
			MathJax.Hub.Config({
			CommonHTML: { linebreaks: { automatic: true } },
			"HTML-CSS": { linebreaks: { automatic: true } },
				 SVG: { linebreaks: { automatic: true } }
			});
	</script>
	<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
	</script>

</head>

<body>

<div class="menu-container noselect">
	<div class="menu">
		<table class="menu-table">
			<tr>
				<td>
					<div class="logo">
						<a href="javascript:void(0)">AI-TOD-v2</a>
					</div>
				</td>
				<td>
					<div class="menu-items">
						<a class="menu-highlight">Overview</a>
						<a href="https://github.com/Chasel-Tsui/Detecting-Tiny-Objects-in-Aerial-Images">GitHub</a>
					</div>
				</td>
			</tr>
		</table>
	</div>
</div>

<div class="content-container">
	<div class="content">
		<table class="content-table">
		      <h1 style="text-align:center; margin-top:60px; font-weight: bold; font-size: 35px;">
				Detecting Tiny Objects in Aerial Images: A Normalized Wasserstein Distance and A New Benchmark </h1>

				<p style="text-align:center; margin-bottom:15px; margin-top:20px; font-size: 18px;">
					<a href="https://github.com/Chasel-Tsui" style="color: #0088CC">Chang Xu<script type="math/tex">^1</script>, </a>
					<a href="https://github.com/jwwangchn" style="color: #0088CC">Jinwang Wang<script type="math/tex">^1</script>, </a>
					<a href="http://www.captain-whu.com/yangwen_En.html" style="color: #0088CC">Wen Yang<script type="math/tex">^{1,*}</script>, </a>
          <a href="https://github.com/levenberg" style="color: #0088CC">Huai Yu<script type="math/tex">^1</script>,</a>
          <a href="http://dvs-whu.cn/" style="color: #0088CC">Lei Yu<script type="math/tex">^1</script>,</a>
					<a href="http://www.captain-whu.com/xia_En.html" style="color: #0088CC">Gui-Song Xia<script type="math/tex">^{2}</script>, </a> <br>
					<!-- <a href="http://www.dsi.unive.it/~pelillo/" style="color: #0088CC">Marcello Pelillo<script type="math/tex">^3</script></a> <br> -->

					<a target="_blank" href="https://dsp.whu.edu.cn/" style="color: #0088CC; font-style: italic"><script type="math/tex">^1</script>EIS SPL, Wuhan University, Wuhan, China</a><br>
					<a target="_blank" href="http://captain.whu.edu.cn/" style="color: #0088CC; font-style: italic"><script type="math/tex">^2</script>CAPTAIN, Wuhan University, Wuhan, China</a>  <br>

				</p>
				<p style="text-align:center; margin-bottom:15px; margin-top:20px; font-size: 18px;">
					<a href="https://www.sciencedirect.com/science/article/abs/pii/S0924271622001599?dgcid=author" style="color: hsl(17, 100%, 40%)"> [Paper]
					</a>
					<a href="https://github.com/Chasel-Tsui/mmdet-aitod" style="color: hsl(17, 100%, 40%)"> [Data and Code]
					</a>
				</p>


			<tr>
				<!-- <td colspan="1"> -->
					<h2 class="add-top-margin">Abstract</h2>
					<hr>
				<!-- </td> -->
			</tr>
			<tr>
					<!-- <td colspan="1"> -->

				<p class="text" style="text-align:justify;">
					Tiny object detection (TOD) in aerial images is a challenging task since a tiny object only contains a
          few pixels. State-of-the-art object detectors do not provide satisfactory results on tiny objects due to
          the lack of supervision from discriminative features. Our key observation is that the Intersection over
          Union (IoU) metric and its extensions are very sensitive to location deviation of the tiny objects, thus
          drastically deteriorating the quality of label assignment when used in anchor-based detectors. To this
          end, we propose a new evaluation metric dubbed Normalized Wasserstein Distance (NWD) and a new
          RanKing based Assigning (RKA) strategy for tiny object detection. The proposed NWD-RKA strategy
          can be easily embedded into all kinds of anchor-based detector to replace the standard IoU threshold
          based one, thereby greatly alleviating problems in label assignment and providing sufficient supervision
          information for network training. Tested on four datasets, NWD-RKA can consistently improve tiny
          object detection performance by a large margin. Besides, observing obvious noisy labels in the dataset
          for Tiny Object Detection in Aerial Images (AI-TOD), we are motivated to meticulously relabel AI-TOD
          and release AI-TOD-v2 and its corresponding benchmark. In AI-TOD-v2, the missing annotation and
          location error problems are considerably alleviated, facilitating more reliable training and validation
          processes. Embedding NWD-RKA into DetectoRS, the detection performance achieves 4.3 AP points
          improvement over state-of-the-art competitors on AI-TOD-v2.
				</p>
					<!-- </td> -->
			</tr>

			<tr>
				<!-- <td colspan="1"> -->
					<h2 class="add-top-margin">Introduction</h2>
					<hr>
				<!-- </td> -->
			</tr>

			<tr>
				<p class="text" style="text-align:justify;">
					Tiny objects are ubiquitous in aerial images, meanwhile detecting tiny objects in aerial images has many application
          scenarios, including vehicle detection, traffic condition monitoring
          and maritime rescue. Even though object detection has
          achieved significant progress thanks to the development of
          deep neural networks, most of them are dedicated to detecting objects
          of normal size. While tiny objects (less than 16*16 pixels
          in the AI-TOD dataset [1]) in aerial images often exhibit with extremely limited appearance information,
          which poses great challenges in learning discriminative features, leading to enormous failure cases when detecting tiny objects.
          <br>
          <br>
          For tiny objects, IoU metric suffers from the problem of sensivity to location deviation, and it fails to reflect the positional relationship of two bounding boxes when they have no overlap or are mutually inclusive, thus leading to three drawbacks in label assignment: severe pos/neg sample imbalance for tiny objects, scale-sample imbalance and failure of sample-compensation. In addition, we observe obvious noisy labels in the dataset for Tiny Object Detection in Aerial Images (AI-TOD), where the problem of objects missed to be annotated is very obvious. And the label noise problem will greatly affect the training and validation of tiny object detectors.
				</p>
			</tr>

			<tr>
				<td>
					<div>
					<img class="center" src="./images/fps2.gif" width="1000" /> <br>
					<p class="image-caption">
						Figure 1. A comparison between the AI-TOD and AI-TOD-v2. Note that the green boxes denote the same bounding boxes in the two versions, the yellow boxes denote the supplemented boxes while the red boxes denote deleted boxes.
					</p>
					</div>
				</td>
			</tr>

			<tr>
				<td>
				<p class="text" style="text-align:justify;">
					To address the aforementioned problems, we attempt to push forward the development of TOD in aerial images both from the aspect of method and data. Specifically, we propose a new metric dubbed Normalized Wasserstein Distance (NWD) and combine it with a customized RanKing-based training sample Assignment (RKA) strategy, simultaneously alleviating the above three drawbacks of IoU threshold based assignment. Moreover, we convene experts to meticulously relabel our preliminary TOD dataset AI-TOD and release AI-TOD-v2 along with its corrseponding benchmarks.
					Our contributions are three-folds:
					<ul style="font-weight:normal; text-align:justify;">
							<li> We propose NWD-RKA as a better training sample assignment strategy for tiny objects, which can simultaneously alleviate the three drawbacks of IoU threshold based strategy (i.e., severe pos/neg sample imbalance for tiny objects, scale-sample imbalance and failure of sample-compensation). </li>
							<li> We carefully relabel AI-TOD and release AI-TOD-v2, where the label noise problem is greatly mitigated. Besides, we establish a comprehensive benchmark by several baseline detectors. The training/validation images and annotations will be made public. </li>
							<li> Our proposed NWD-RKA can be applied to all kinds of anchor-based detectors and boost their performance on tiny objects. On AI-TOD-v2 dataset, we achieve the performance of 24.7% AP and 57.2% AP0.5, which is highly above the state-of-the-art competitors. Moreover, significant improvements can be seen on AI-TOD, VisDrone2019 [2] and DOTA-v2.0 [3] dataset.</li>
					</ul>

				</p>
				</td>
			</tr>
      <tr>
				<td>
						<h2 class="add-top-margin">AI-TOD-v2</h2>
						<hr>
				</td>
			</tr>
      <tr>
        <td>
            <h3 class="add-top-margin">Statistics</h3>

        </td>
    </tr>

    <tr>
      <td>
        <div>
        <img class="center" src="images/statistics.PNG" width="1000" /> <br>
        <p class="image-caption">
          Figure 2. Statistics of classes and instances in AI-TOD-v2. (a) Histogram of the number of instances per class. (b) Histogram of number of instances per image. (c) Histogram of number of instances’ sizes. (d) Boxplot depicting the range of sizes for each object category.
        </p>
        </div>
      </td>
    </tr>

	<tr>
		<td>
		  <div>
		  <img class="center" src="images/mean.PNG" width="500" /> <br>
		  <p class="image-caption">
			Table 1. Mean and standard deviation of object scale on different
			datasets.
		  </p>
		  </div>
		</td>
	  </tr>

	  <tr>
        <td>
            <h3 class="add-top-margin">Download</h3>
		1. AI-TOD [<a href="https://github.com/jwwangchn/AI-TOD" style="color: #0088CC">images & annotations</a>]<br>
		2. AI-TOD-v2
		[<a href="https://drive.google.com/drive/folders/1Er14atDO1cBraBD4DSFODZV1x7NHO_PY?usp=sharing" style="color: #0088CC">annotations</a>]<br>
<!--             1. AI-TOD
			[<a href="https://drive.google.com/drive/folders/1mokzFtLCjyqalSEajYTUmyzXvOHAa4WX" style="color: #0088CC">images & annotations</a>]<br>
			2. AI-TOD-v2
			[<a href="https://drive.google.com/drive/folders/1mokzFtLCjyqalSEajYTUmyzXvOHAa4WX" style="color: #0088CC">images & annotations</a>]<br>
			Note that the AI-TOD and AI-TOD-v2 share the image sets. -->
        </td>
     </tr>

			<tr>
				<td>
						<h2 class="add-top-margin">Experimental Results</h2>
						<hr>
				</td>
			</tr>

			<tr>
					<td>
							<h3 class="add-top-margin">A Comparison of Different Methods on AI-TOD-v2</h3>

					</td>
			</tr>

			<tr>
				<td>
					<div>
					<img class="center" src="images/visu.png" width="1000" /> <br>
					<p class="image-caption">
						Figure 3. Visualization of detection results using baseline detectors (first 4 rows) and NWD-RKA based detector (5th row) of AI-TOD-v2 dataset. The green, blue and red boxes denote true positive (TP), false positive (FP) and false negative (FN) predictions, respectively. Note that we apply NWD+RKA into DetectoRS in the 5th row.
					</p>
					</div>
				</td>
			</tr>

			<tr>
				<td>
						<h2 class="add-top-margin">Related Work</h2>
						<hr>
						<p class="text" style="text-align:justify;"></p>
						Our work is built based on our past works including: <a href="https://ieeexplore.ieee.org/abstract/document/9413340" style="color: #0088CC">AI-TOD</a> [1], <a href="https://arxiv.org/abs/2102.12219" style="color: #0088CC">DOTA-v2.0</a> [3], <a href="https://openaccess.thecvf.com/content/CVPR2021W/EarthVision/html/Xu_Dot_Distance_for_Tiny_Object_Detection_in_Aerial_Images_CVPRW_2021_paper.html" style="color: #0088CC">DotD</a> [4] and <a href="https://arxiv.org/abs/2110.13389" style="color: #0088CC">NWD</a> [5]. If you found our work helpful, consider citing these works as well.
				</td>
			</tr>
			<tr>
					<td>
							<h2 class="add-top-margin">Acknowledgements</h2>
							<hr>
							<p class="text" style="text-align:justify;"></p>
						We would like to thank Lanxin Zeng, Xiangyuan Wang,
						Hanqian Si and Yan Zhang for the voluntary relabeling
						work; Haowen Guo and Ruixiang Zhang for helpful feedback
						and discussions. This project has received funding from the
						National Natural Science Foundation of China (NSFC) under
						Grant 61771351. The numerical calculations were conducted
						on the supercomputing system in the Supercomputing Center,
						Wuhan University.

					</td>
				</tr>
			<tr>
					<td>
							<h2 class="add-top-margin">Citation</h2>
							<hr>
							<ol style="padding-inline-start:20px;">
						<li>
							<b>Detecting Tiny Objects in Aerial Images: A Normalized Wasserstein Distance and A New Benchmark</b>
							[<a href="https://www.sciencedirect.com/science/article/abs/pii/S0924271622001599" style="color: #0088CC">paper</a>]<br>
								C. Xu, J. Wang, W. Yang, H. Yu, L. Yu, G.-S. Xia <br>
								ISPRS Journal of Photogrammetry and Remote Sensing (<b>ISPRS J P & RS</b>), Vol.190, Pages:79-93, 2022.
						</li>

					</td>
				</tr>

			<tr>
					<td colspan="2">
					<h2 class="add-top-margin">Main References</h2>
					<hr>
					<ol style="padding-inline-start:20px;">
						<li>
							<b>Tiny Object Detection in Aerial Images </b>
							[<a href="https://ieeexplore.ieee.org/abstract/document/9413340" style="color: #0088CC">paper</a>]<br>
								J. Wang, W. Yang, H. Guo, R. Zhang, G.-S. Xia <br>
								IEEE Conference on Pattern Recognition (<b>ICPR</b>), 2021.
						</li>
						<li>
							<b>VisDrone-DET2019: The Vision Meets Drone Object Detection in Image Challenge Results </b>
							[<a href="https://arxiv.org/abs/2102.12219" style="color: #0088CC">paper</a>]<br>
							   Du. D, Zhu. P, Wen. L, et al. <br>
							   IEEE Conference on Computer Vision and Pattern Recognition Workshops (<b>CVPRW</b>), 2019.
						</li>
						<li>
							<b>Object Detection in Aerial Images: A Large-scale Benchmark and Challenges </b>
							[<a href="https://arxiv.org/abs/2102.12219" style="color: #0088CC">paper</a>]<br>
							   J. Ding, N. Xue, G.-S. Xia, X. Bai, W. Yang, et al. <br>
							   IEEE Trans. on Pattern Analysis and Machine Intelligence (<b>TPAMI</b>), 2021.
						</li>

						<li>
								<b>xview: Objects in context in overhead imagery</b>
								[<a href="http://xviewdataset.org/" style="color: #0088CC">paper</a>]<br>
								L. Darius, K. Richard, M. Kevin, et al. <br>
								<b>arXiv</b>, 2018.
						</li>

						<li>
								<b>Dot Distance for Tiny Object Detection in Aerial Images</b>
								[<a href="https://openaccess.thecvf.com/content/CVPR2021W/EarthVision/html/Xu_Dot_Distance_for_Tiny_Object_Detection_in_Aerial_Images_CVPRW_2021_paper.html" style="color: #0088CC">paper</a>]<br>
								C. Xu, J. Wang, W. Yang, L, Yu <br>
								IEEE Conference on Computer Vision and Pattern Recognition Workshops (<b>CVPRW</b>), 2021.
						</li>

						<li>
								<b>A Normalized Gaussian Wasserstein Distance for Tiny Object Detection</b>
								[<a href="https://arxiv.org/abs/2110.13389" style="color: #0088CC">paper</a>]<br>
								J. Wang, C. Xu, W. Yang, L. Yu <br>
							    <b>arXiv</b>, 2021.
						</li>


					</ol>
					</td>
				</tr>


			<br><br>
		 </table>


	<div class="footer">
		<p class="block">&copy; 2021 by Chang Xu at SPL</p>
	</div>

	</div>
</div>
</body>
</html>