|
3 | 3 | Transforms on Rotated Bounding Boxes
|
4 | 4 | ===============================================================
|
5 | 5 |
|
6 |
| -This example illustrates how to define and use rotated bounding boxes. We'll |
7 |
| -cover how to define them, demonstrate their usage with some of the existing |
8 |
| -transforms, and finally some of their unique behavior in comparision to |
9 |
| -standard bounding boxes. |
| 6 | +This example illustrates how to define and use rotated bounding boxes. |
| 7 | +
|
| 8 | +.. note:: |
| 9 | + Support for rotated bounding boxes was released in TorchVision 0.23 and is |
| 10 | + currently a BETA feature. We don't expect the API to change, but there may |
| 11 | + be some rare edge-cases. If you find any issues, please report them on |
| 12 | + our bug tracker: https://github.com/pytorch/vision/issues?q=is:open+is:issue |
10 | 13 |
|
11 | 14 | First, a bit of setup code:
|
12 | 15 | """
|
|
18 | 21 |
|
19 | 22 |
|
20 | 23 | import torch
|
21 |
| -from torchvision import tv_tensors |
| 24 | +from torchvision.tv_tensors import BoundingBoxes |
22 | 25 | from torchvision.transforms import v2
|
23 | 26 | from helpers import plot
|
24 | 27 |
|
|
37 | 40 | # Creating a Rotated Bounding Box
|
38 | 41 | # -------------------------------
|
39 | 42 | # Rotated bounding boxes are created by instantiating the
|
40 |
| -# :class:`~torchvision.tv_tensors.BoundingBoxes` class. It's the `format` |
| 43 | +# :class:`~torchvision.tv_tensors.BoundingBoxes` class. It's the ``format`` |
41 | 44 | # parameter of the constructor that determines if a bounding box is rotated or
|
42 |
| -# not. In this instance, we use the |
43 |
| -# :attr:`~torchvision.tv_tensors.BoundingBoxFormat` kind `CXCYWHR`. The first |
44 |
| -# two values are the `x` and `y` coordinates of the center of the bounding box. |
45 |
| -# The next two values are the `width` and `height` of the bounding box, and the |
46 |
| -# last value is the `rotation` of the bounding box. |
| 45 | +# not. In this instance, we use the CXCYWHR |
| 46 | +# :attr:`~torchvision.tv_tensors.BoundingBoxFormat`. The first two values are |
| 47 | +# the X and Y coordinates of the center of the bounding box. The next two |
| 48 | +# values are the width and height of the bounding box, and the last value is the |
| 49 | +# rotation of the bounding box, in degrees. |
47 | 50 |
|
48 | 51 |
|
49 |
| -orig_box = tv_tensors.BoundingBoxes( |
| 52 | +orig_box = BoundingBoxes( |
50 | 53 | [
|
51 | 54 | [860.0, 1100, 570, 1840, -7],
|
52 | 55 | ],
|
|
57 | 60 | plot([(orig_img, orig_box)], bbox_width=10)
|
58 | 61 |
|
59 | 62 | # %%
|
60 |
| -# Rotation |
61 |
| -# -------- |
62 |
| -# Rotated bounding boxes maintain their rotation with respect to the image even |
63 |
| -# when the image itself is rotated through the |
64 |
| -# :class:`~torchvision.transforms.RandomRotation` transform. |
| 63 | +# Transforms illustrations |
| 64 | +# ------------------------ |
| 65 | +# |
| 66 | +# Using :class:`~torchvision.transforms.RandomRotation`: |
65 | 67 | rotater = v2.RandomRotation(degrees=(0, 180), expand=True)
|
66 | 68 | rotated_imgs = [rotater((orig_img, orig_box)) for _ in range(4)]
|
67 | 69 | plot([(orig_img, orig_box)] + rotated_imgs, bbox_width=10)
|
68 | 70 |
|
69 | 71 | # %%
|
70 |
| -# Padding |
71 |
| -# ------- |
72 |
| -# Rotated bounding boxes also maintain their properties when the image is padded using |
73 |
| -# :class:`~torchvision.transforms.Pad`. |
| 72 | +# Using :class:`~torchvision.transforms.Pad`: |
74 | 73 | padded_imgs_and_boxes = [
|
75 | 74 | v2.Pad(padding=padding)(orig_img, orig_box)
|
76 | 75 | for padding in (30, 50, 100, 200)
|
77 | 76 | ]
|
78 | 77 | plot([(orig_img, orig_box)] + padded_imgs_and_boxes, bbox_width=10)
|
79 | 78 |
|
80 | 79 | # %%
|
81 |
| -# Resizing |
82 |
| -# -------- |
83 |
| -# Rotated bounding boxes are also resized along with an image in the |
84 |
| -# :class:`~torchvision.transforms.Resize` transform. |
85 |
| -# |
86 |
| -# Note that the bounding box looking bigger in the images with less pixels is |
87 |
| -# an artifact, not reality. That is merely the rasterised representation of the |
88 |
| -# bounding box's boundaries appearing bigger because we specify a fixed width of |
89 |
| -# that rasterized line. When the image is, say, only 30 pixels wide, a |
90 |
| -# line that is 3 pixels wide is relatively large. |
| 80 | +# Using :class:`~torchvision.transforms.Resize`: |
91 | 81 | resized_imgs = [
|
92 | 82 | v2.Resize(size=size)(orig_img, orig_box)
|
93 | 83 | for size in (30, 50, 100, orig_img.size)
|
94 | 84 | ]
|
95 | 85 | plot([(orig_img, orig_box)] + resized_imgs, bbox_width=5)
|
96 | 86 |
|
97 | 87 | # %%
|
98 |
| -# Perspective |
99 |
| -# ----------- |
100 |
| -# The rotated bounding box is also transformed along with the image when the |
101 |
| -# perspective is transformed with :class:`~torchvision.transforms.RandomPerspective`. |
102 |
| -perspective_transformer = v2.RandomPerspective(distortion_scale=0.6, p=1.0) |
103 |
| -perspective_imgs = [perspective_transformer(orig_img, orig_box) for _ in range(4)] |
104 |
| -plot([(orig_img, orig_box)] + perspective_imgs, bbox_width=10) |
105 |
| - |
106 |
| -# %% |
107 |
| -# Elastic Transform |
108 |
| -# ----------------- |
109 |
| -# The rotated bounding box is appropriately unchanged when going through the |
110 |
| -# :class:`~torchvision.transforms.ElasticTransform`. |
111 |
| -elastic_imgs = [ |
112 |
| - v2.ElasticTransform(alpha=alpha)(orig_img, orig_box) |
113 |
| - for alpha in (100.0, 500.0, 1000.0, 2000.0) |
114 |
| -] |
115 |
| -plot([(orig_img, orig_box)] + elastic_imgs, bbox_width=10) |
116 |
| - |
117 |
| -# %% |
118 |
| -# Crop & Clamping Modes |
119 |
| -# --------------------- |
120 |
| -# The :class:`~torchvision.transforms.CenterCrop` transform selectively crops |
121 |
| -# the image on a center location. The behavior of the rotated bounding box |
122 |
| -# depends on its `clamping_mode`. We can set the `clamping_mode` in the |
123 |
| -# :class:`~torchvision.tv_tensors.BoundingBoxes` constructur, or by directly |
124 |
| -# setting it after construction as we do in the example below. |
| 88 | +# Note that the bounding box looking bigger in the images with fewer pixels is |
| 89 | +# an artifact, not reality. That is merely the rasterized representation of the |
| 90 | +# bounding box's boundaries appearing bigger because we specify a fixed width of |
| 91 | +# that rasterized line. When the image is, say, only 30 pixels wide, a |
| 92 | +# line that is 3 pixels wide is relatively large. |
125 | 93 | #
|
126 |
| -# There are two values for `clamping_mode`: |
| 94 | +# .. _clamping_mode_tuto: |
127 | 95 | #
|
128 |
| -# - `"soft"`: The default when constucting |
129 |
| -# :class:`~torchvision.tv_tensors.BoundingBoxes`. <Insert semantic |
130 |
| -# description for soft mode.> |
131 |
| -# - `"hard"`: <Insert semantic description for hard mode.> |
| 96 | +# Clamping Mode, and its effect on transforms |
| 97 | +# ------------------------------------------- |
132 | 98 | #
|
133 |
| -# For standard bounding boxes, both modes behave the same. We also need to |
134 |
| -# document: |
| 99 | +# Some transforms, such as :class:`~torchvision.transforms.CenterCrop`, may |
| 100 | +# result in having the transformed bounding box partially outside of the |
| 101 | +# transformed (cropped) image. In general, this may happen on most of the |
| 102 | +# :ref:`geometric transforms <v2_api_ref>`. |
135 | 103 | #
|
136 |
| -# - `clamping_mode` for individual kernels. |
137 |
| -# - `clamping_mode` in :class:`~torchvision.transforms.v2.ClampBoundingBoxes`. |
138 |
| -# - the new :class:`~torchvision.transforms.v2.SetClampingMode` transform. |
| 104 | +# In such cases, the bounding box is clamped to the transformed image size based |
| 105 | +# on its ``clamping_mode`` attribute. There are three values for |
| 106 | +# ``clamping_mode``, which determine how the box is clamped after a |
| 107 | +# transformation: |
139 | 108 | #
|
| 109 | +# - ``None``: No clamping is applied, and the bounding box may be partially |
| 110 | +#   outside of the image. |
| 111 | +# - ``"hard"``: The box is clamped to the image size, such that all its corners |
| 112 | +#   are within the image canvas. This potentially results in a loss of |
| 113 | +#   information, and it can lead to unintuitive results, but it may be necessary |
| 114 | +#   for some applications, e.g. if the model doesn't support boxes outside of |
| 115 | +#   their image. |
| 116 | +# - ``"soft"``: This is an intermediate mode between ``None`` and ``"hard"``: the |
| 117 | +#   box is clamped, but not as strictly as in ``"hard"`` mode. Some box dimensions |
| 118 | +#   may still be outside of the image. This is the default when constructing |
| 119 | +#   :class:`~torchvision.tv_tensors.BoundingBoxes`. |
| 120 | +# |
| 121 | +# .. note:: |
| 122 | +# |
| 123 | +#     For axis-aligned bounding boxes, the ``"soft"`` and ``"hard"`` modes behave |
| 124 | +#     the same, as the bounding box is always clamped to the image size. |
| 125 | +# |
| 126 | +# Let's illustrate the clamping modes with the |
| 127 | +# :class:`~torchvision.transforms.CenterCrop` transform: |
| 128 | + |
140 | 129 | assert orig_box.clamping_mode == "soft"
|
141 |
| -hard_box = orig_box.clone() |
142 |
| -hard_box.clamping_mode = "hard" |
143 | 130 |
|
| 131 | +box_hard_clamping = BoundingBoxes(orig_box, format=orig_box.format, canvas_size=orig_box.canvas_size, clamping_mode="hard") |
| 132 | + |
| 133 | +box_no_clamping = BoundingBoxes(orig_box, format=orig_box.format, canvas_size=orig_box.canvas_size, clamping_mode=None) |
| 134 | + |
| 135 | +crop_sizes = (800, 1200, 2000, orig_img.size) |
144 | 136 | soft_center_crops_and_boxes = [
|
145 | 137 | v2.CenterCrop(size=size)(orig_img, orig_box)
|
146 |
| - for size in (800, 1200, 2000, orig_img.size) |
| 138 | + for size in crop_sizes |
147 | 139 | ]
|
148 | 140 |
|
149 | 141 | hard_center_crops_and_boxes = [
|
150 |
| - v2.CenterCrop(size=size)(orig_img, hard_box) |
151 |
| - for size in (800, 1200, 2000, orig_img.size) |
| 142 | + v2.CenterCrop(size=size)(orig_img, box_hard_clamping) |
| 143 | + for size in crop_sizes |
| 144 | +] |
| 145 | + |
| 146 | +no_clamping_center_crops_and_boxes = [ |
| 147 | + v2.CenterCrop(size=size)(orig_img, box_no_clamping) |
| 148 | + for size in crop_sizes |
152 | 149 | ]
|
153 | 150 |
|
154 |
| -plot([[(orig_img, orig_box)] + soft_center_crops_and_boxes, |
155 |
| - [(orig_img, hard_box)] + hard_center_crops_and_boxes], |
| 151 | +plot([[(orig_img, box_hard_clamping)] + hard_center_crops_and_boxes, |
| 152 | + [(orig_img, orig_box)] + soft_center_crops_and_boxes, |
| 153 | + [(orig_img, box_no_clamping)] + no_clamping_center_crops_and_boxes], |
156 | 154 | bbox_width=10)
|
| 155 | + |
| 156 | +# %% |
| 157 | +# The plot above shows the ``"hard"``, ``"soft"`` and ``None`` clamping modes, |
| 158 | +# in this order. While ``"soft"`` and ``None`` result in similar plots, they do not |
| 159 | +# lead to the exact same clamped boxes. The non-clamped boxes extend further outside the image: |
| 160 | +print("boxes with soft clamping:") |
| 161 | +print(soft_center_crops_and_boxes) |
| 162 | +print() |
| 163 | +print("boxes with no clamping:") |
| 164 | +print(no_clamping_center_crops_and_boxes) |
| 165 | + |
| 166 | +# %% |
| 167 | +# |
| 168 | +# Setting the clamping mode |
| 169 | +# -------------------------- |
| 170 | +# |
| 171 | +# The ``clamping_mode`` attribute, which determines the clamping strategy that |
| 172 | +# is applied to a box, can be set in different ways: |
| 173 | +# |
| 174 | +# - When constructing the bounding box with its |
| 175 | +# :class:`~torchvision.tv_tensors.BoundingBoxes` constructor, as done in the example above. |
| 176 | +# - By directly setting the attribute on an existing instance, e.g. ``boxes.clamping_mode = "hard"``. |
| 177 | +# - By calling the :class:`~torchvision.transforms.v2.SetClampingMode` transform. |
| 178 | +# |
| 179 | +# Also, remember that you can always clamp the bounding box manually by |
| 180 | +# calling the :class:`~torchvision.transforms.v2.ClampBoundingBoxes` transform! |
| 181 | +# Here's an example illustrating all of these options: |
| 182 | + |
| 183 | +t = v2.Compose([ |
| 184 | + v2.CenterCrop(size=(800,)), # clamps according to the current clamping_mode |
| 185 | + # attribute, in this case set by the constructor |
| 186 | + v2.SetClampingMode(None), # sets the clamping_mode attribute for future transforms |
| 187 | + v2.Pad(padding=3), # clamps according to the current clamping_mode |
| 188 | + # i.e. ``None`` |
| 189 | + v2.ClampBoundingBoxes(clamping_mode="soft"), # clamps with "soft" mode. |
| 190 | +]) |
| 191 | + |
| 192 | +out_img, out_box = t(orig_img, orig_box) |
| 193 | +plot([(orig_img, orig_box), (out_img, out_box)], bbox_width=10) |
| 194 | + |
| 195 | +# %% |
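
The CXCYWHR description in the diff above (center, width, height, rotation in degrees) and the "all corners within the image canvas" condition used for the hard clamping mode can be made concrete with a small sketch. This is plain Python, not TorchVision code: the helper names `cxcywhr_to_corners` and `corners_inside_canvas` are hypothetical, and the sign convention for the angle (positive means counter-clockwise as seen on screen, with y growing downward) is an assumption that should be checked against the TorchVision documentation.

```python
import math


def cxcywhr_to_corners(cx, cy, w, h, r_degrees):
    """Return the four corners of a rotated box as (x, y) tuples.

    Hypothetical helper for illustration only. Assumes image coordinates
    (y grows downward) and that a positive angle rotates the box
    counter-clockwise on screen; verify this against the library docs.
    """
    theta = math.radians(r_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    # Offsets of the corners from the center before rotation:
    # top-left, top-right, bottom-right, bottom-left.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_t + dy * sin_t
        y = cy - dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


def corners_inside_canvas(corners, canvas_height, canvas_width):
    """Check whether every corner lies within the canvas bounds (inclusive)."""
    return all(0 <= x <= canvas_width and 0 <= y <= canvas_height for x, y in corners)


# The box from the example: center (860, 1100), 570 wide, 1840 tall, rotated by -7 degrees.
example_corners = cxcywhr_to_corners(860.0, 1100, 570, 1840, -7)

# The canvas size below is a made-up value for illustration, not the one used in the example.
fits = corners_inside_canvas(example_corners, canvas_height=1000, canvas_width=1000)
```

A check like `corners_inside_canvas` only tells whether a box would need clamping at all; it does not reproduce how TorchVision adjusts the box in either clamping mode.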