| 
3 | 3 | Transforms on Rotated Bounding Boxes  | 
4 | 4 | ===============================================================  | 
5 | 5 | 
  | 
6 |  | -This example illustrates how to define and use rotated bounding boxes. We'll  | 
7 |  | -cover how to define them, demonstrate their usage with some of the existing  | 
8 |  | -transforms, and finally some of their unique behavior in comparision to  | 
9 |  | -standard bounding boxes.  | 
 | 6 | +This example illustrates how to define and use rotated bounding boxes.  | 
 | 7 | +
  | 
 | 8 | +.. note::  | 
 | 9 | +    Support for rotated bounding boxes was released in TorchVision 0.23 and is  | 
 | 10 | +    currently a BETA feature. We don't expect the API to change, but there may  | 
 | 11 | +    be some rare edge-cases. If you find any issues, please report them on  | 
 | 12 | +    our bug tracker: https://github.com/pytorch/vision/issues?q=is:open+is:issue  | 
10 | 13 | 
  | 
11 | 14 | First, a bit of setup code:  | 
12 | 15 | """  | 
 | 
18 | 21 | 
 
  | 
19 | 22 | 
 
  | 
20 | 23 | import torch  | 
21 |  | -from torchvision import tv_tensors  | 
 | 24 | +from torchvision.tv_tensors import BoundingBoxes  | 
22 | 25 | from torchvision.transforms import v2  | 
23 | 26 | from helpers import plot  | 
24 | 27 | 
 
  | 
 | 
37 | 40 | # Creating a Rotated Bounding Box  | 
38 | 41 | # -------------------------------  | 
39 | 42 | # Rotated bounding boxes are created by instantiating the  | 
40 |  | -# :class:`~torchvision.tv_tensors.BoundingBoxes` class. It's the `format`  | 
 | 43 | +# :class:`~torchvision.tv_tensors.BoundingBoxes` class. It's the ``format``  | 
41 | 44 | # parameter of the constructor that determines if a bounding box is rotated or  | 
42 |  | -# not. In this instance, we use the  | 
43 |  | -# :attr:`~torchvision.tv_tensors.BoundingBoxFormat` kind `CXCYWHR`. The first  | 
44 |  | -# two values are the `x` and `y` coordinates of the center of the bounding box.  | 
45 |  | -# The next two values are the `width` and `height` of the bounding box, and the  | 
46 |  | -# last value is the `rotation` of the bounding box.  | 
 | 45 | +# not. In this instance, we use the CXCYWHR  | 
 | 46 | +# :attr:`~torchvision.tv_tensors.BoundingBoxFormat`. The first two values are  | 
 | 47 | +# the X and Y coordinates of the center of the bounding box.  The next two  | 
 | 48 | +# values are the width and height of the bounding box, and the last value is the  | 
 | 49 | +# rotation of the bounding box, in degrees.  | 
47 | 50 | 
 
  | 
48 | 51 | 
 
  | 
49 |  | -orig_box = tv_tensors.BoundingBoxes(  | 
 | 52 | +orig_box = BoundingBoxes(  | 
50 | 53 |     [  | 
51 | 54 |         [860.0, 1100, 570, 1840, -7],  | 
52 | 55 |     ],  | 
 | 
57 | 60 | plot([(orig_img, orig_box)], bbox_width=10)  | 
58 | 61 | 
 
  | 
59 | 62 | # %%  | 
60 |  | -# Rotation  | 
61 |  | -# --------  | 
62 |  | -# Rotated bounding boxes maintain their rotation with respect to the image even  | 
63 |  | -# when the image itself is rotated through the  | 
64 |  | -# :class:`~torchvision.transforms.RandomRotation` transform.  | 
 | 63 | +# Transforms illustrations  | 
 | 64 | +# ------------------------  | 
 | 65 | +#  | 
 | 66 | +# Using :class:`~torchvision.transforms.RandomRotation`:  | 
65 | 67 | rotater = v2.RandomRotation(degrees=(0, 180), expand=True)  | 
66 | 68 | rotated_imgs = [rotater((orig_img, orig_box)) for _ in range(4)]  | 
67 | 69 | plot([(orig_img, orig_box)] + rotated_imgs, bbox_width=10)  | 
68 | 70 | 
 
  | 
69 | 71 | # %%  | 
70 |  | -# Padding  | 
71 |  | -# -------  | 
72 |  | -# Rotated bounding boxes also maintain their properties when the image is padded using  | 
73 |  | -# :class:`~torchvision.transforms.Pad`.  | 
 | 72 | +# Using :class:`~torchvision.transforms.Pad`:  | 
74 | 73 | padded_imgs_and_boxes = [  | 
75 | 74 |     v2.Pad(padding=padding)(orig_img, orig_box)  | 
76 | 75 |     for padding in (30, 50, 100, 200)  | 
77 | 76 | ]  | 
78 | 77 | plot([(orig_img, orig_box)] + padded_imgs_and_boxes, bbox_width=10)  | 
79 | 78 | 
 
  | 
80 | 79 | # %%  | 
81 |  | -# Resizing  | 
82 |  | -# --------  | 
83 |  | -# Rotated bounding boxes are also resized along with an image in the  | 
84 |  | -# :class:`~torchvision.transforms.Resize` transform.  | 
85 |  | -#  | 
86 |  | -# Note that the bounding box looking bigger in the images with less pixels is  | 
87 |  | -# an artifact, not reality. That is merely the rasterised representation of the  | 
88 |  | -# bounding box's boundaries appearing bigger because we specify a fixed width of  | 
89 |  | -# that rasterized line. When the image is, say, only 30 pixels wide, a  | 
90 |  | -# line that is 3 pixels wide is relatively large.  | 
 | 80 | +# Using :class:`~torchvision.transforms.Resize`:  | 
91 | 81 | resized_imgs = [  | 
92 | 82 |     v2.Resize(size=size)(orig_img, orig_box)  | 
93 | 83 |     for size in (30, 50, 100, orig_img.size)  | 
94 | 84 | ]  | 
95 | 85 | plot([(orig_img, orig_box)] + resized_imgs, bbox_width=5)  | 
96 | 86 | 
 
  | 
97 | 87 | # %%  | 
98 |  | -# Perspective  | 
99 |  | -# -----------  | 
100 |  | -# The rotated bounding box is also transformed along with the image when the  | 
101 |  | -# perspective is transformed with :class:`~torchvision.transforms.RandomPerspective`.  | 
102 |  | -perspective_transformer = v2.RandomPerspective(distortion_scale=0.6, p=1.0)  | 
103 |  | -perspective_imgs = [perspective_transformer(orig_img, orig_box) for _ in range(4)]  | 
104 |  | -plot([(orig_img, orig_box)] + perspective_imgs, bbox_width=10)  | 
105 |  | - | 
106 |  | -# %%  | 
107 |  | -# Elastic Transform  | 
108 |  | -# -----------------  | 
109 |  | -# The rotated bounding box is appropriately unchanged when going through the  | 
110 |  | -# :class:`~torchvision.transforms.ElasticTransform`.  | 
111 |  | -elastic_imgs = [  | 
112 |  | -    v2.ElasticTransform(alpha=alpha)(orig_img, orig_box)  | 
113 |  | -    for alpha in (100.0, 500.0, 1000.0, 2000.0)  | 
114 |  | -]  | 
115 |  | -plot([(orig_img, orig_box)] + elastic_imgs, bbox_width=10)  | 
116 |  | - | 
117 |  | -# %%  | 
118 |  | -# Crop & Clamping Modes  | 
119 |  | -# ---------------------  | 
120 |  | -# The :class:`~torchvision.transforms.CenterCrop` transform selectively crops  | 
121 |  | -# the image on a center location. The behavior of the rotated bounding box  | 
122 |  | -# depends on its `clamping_mode`. We can set the `clamping_mode` in the  | 
123 |  | -# :class:`~torchvision.tv_tensors.BoundingBoxes` constructur, or by directly  | 
124 |  | -# setting it after construction as we do in the example below.  | 
 | 88 | +# Note that the bounding box looking bigger in the images with fewer pixels is  | 
 | 89 | +# an artifact, not reality. That is merely the rasterized representation of the  | 
 | 90 | +# bounding box's boundaries appearing bigger because we specify a fixed width of  | 
 | 91 | +# that rasterized line. When the image is, say, only 30 pixels wide, a  | 
 | 92 | +# line that is 3 pixels wide is relatively large.  | 
125 | 93 | #  | 
126 |  | -# There are two values for `clamping_mode`:  | 
 | 94 | +# .. _clamping_mode_tuto:  | 
127 | 95 | #  | 
128 |  | -#  - `"soft"`: The default when constucting  | 
129 |  | -#    :class:`~torchvision.tv_tensors.BoundingBoxes`. <Insert semantic  | 
130 |  | -#    description for soft mode.>  | 
131 |  | -#  - `"hard"`: <Insert semantic description for hard mode.>  | 
 | 96 | +# Clamping Mode, and its effect on transforms  | 
 | 97 | +# -------------------------------------------  | 
132 | 98 | #  | 
133 |  | -# For standard bounding boxes, both modes behave the same. We also need to  | 
134 |  | -# document:  | 
 | 99 | +# Some transforms, such as :class:`~torchvision.transforms.CenterCrop`, may  | 
 | 100 | +# result in having the transformed bounding box partially outside of the  | 
 | 101 | +# transformed (cropped) image. In general, this may happen on most of the  | 
 | 102 | +# :ref:`geometric transforms <v2_api_ref>`.  | 
135 | 103 | #  | 
136 |  | -#  - `clamping_mode` for individual kernels.  | 
137 |  | -#  - `clamping_mode` in :class:`~torchvision.transforms.v2.ClampBoundingBoxes`.  | 
138 |  | -#  - the new :class:`~torchvision.transforms.v2.SetClampingMode` transform.  | 
 | 104 | +# In such cases, the bounding box is clamped to the transformed image size based  | 
 | 105 | +# on its ``clamping_mode`` attribute. There are three values for  | 
 | 106 | +# ``clamping_mode``, which determine how the box is clamped after a  | 
 | 107 | +# transformation:  | 
139 | 108 | #  | 
 | 109 | +#  - ``None``: No clamping is applied, and the bounding box may be partially  | 
 | 110 | +#    outside of the image.  | 
 | 111 | +#  - ``"hard"``: The box is clamped to the image size, such that all its corners  | 
 | 112 | +#    are within the image canvas. This potentially results in a loss of  | 
 | 113 | +#    information, and it can lead to unintuitive results, but it may be necessary  | 
 | 114 | +#    for some applications, e.g. if the model doesn't support boxes outside of  | 
 | 115 | +#    their image.  | 
 | 116 | +#  - ``"soft"``: This is an intermediate mode between ``None`` and ``"hard"``: the  | 
 | 117 | +#    box is clamped, but not as strictly as in ``"hard"`` mode. Some box dimensions  | 
 | 118 | +#    may still be outside of the image. This is the default when constructing  | 
 | 119 | +#    :class:`~torchvision.tv_tensors.BoundingBoxes`.  | 
 | 120 | +#  | 
 | 121 | +# .. note::  | 
 | 122 | +#  | 
 | 123 | +#       For axis-aligned bounding boxes, the ``"soft"`` and ``"hard"`` modes behave  | 
 | 124 | +#       the same, as the bounding box is always clamped to the image size.  | 
 | 125 | +#  | 
 | 126 | +# Let's illustrate the clamping modes with  | 
 | 127 | +# :class:`~torchvision.transforms.CenterCrop` transform:  | 
 | 128 | + | 
140 | 129 | assert orig_box.clamping_mode == "soft"  | 
141 |  | -hard_box = orig_box.clone()  | 
142 |  | -hard_box.clamping_mode = "hard"  | 
143 | 130 | 
 
  | 
 | 131 | +box_hard_clamping = BoundingBoxes(orig_box, format=orig_box.format, canvas_size=orig_box.canvas_size, clamping_mode="hard")  | 
 | 132 | + | 
 | 133 | +box_no_clamping = BoundingBoxes(orig_box, format=orig_box.format, canvas_size=orig_box.canvas_size, clamping_mode=None)  | 
 | 134 | + | 
 | 135 | +crop_sizes = (800, 1200, 2000, orig_img.size)  | 
144 | 136 | soft_center_crops_and_boxes = [  | 
145 | 137 |     v2.CenterCrop(size=size)(orig_img, orig_box)  | 
146 |  | -    for size in (800, 1200, 2000, orig_img.size)  | 
 | 138 | +    for size in crop_sizes  | 
147 | 139 | ]  | 
148 | 140 | 
 
  | 
149 | 141 | hard_center_crops_and_boxes = [  | 
150 |  | -    v2.CenterCrop(size=size)(orig_img, hard_box)  | 
151 |  | -    for size in (800, 1200, 2000, orig_img.size)  | 
 | 142 | +    v2.CenterCrop(size=size)(orig_img, box_hard_clamping)  | 
 | 143 | +    for size in crop_sizes  | 
 | 144 | +]  | 
 | 145 | + | 
 | 146 | +no_clamping_center_crops_and_boxes = [  | 
 | 147 | +    v2.CenterCrop(size=size)(orig_img, box_no_clamping)  | 
 | 148 | +    for size in crop_sizes  | 
152 | 149 | ]  | 
153 | 150 | 
 
  | 
154 |  | -plot([[(orig_img, orig_box)] + soft_center_crops_and_boxes,  | 
155 |  | -      [(orig_img, hard_box)] + hard_center_crops_and_boxes],  | 
 | 151 | +plot([[(orig_img, box_hard_clamping)] + hard_center_crops_and_boxes,  | 
 | 152 | +      [(orig_img, orig_box)] + soft_center_crops_and_boxes,  | 
 | 153 | +      [(orig_img, box_no_clamping)] + no_clamping_center_crops_and_boxes],  | 
156 | 154 |      bbox_width=10)  | 
 | 155 | + | 
 | 156 | +# %%  | 
 | 157 | +# The plot above shows the "hard" clamping mode, "soft" and ``None``, in this  | 
 | 158 | +# order. While "soft" and ``None`` result in similar plots, they do not lead to  | 
 | 159 | +# the exact same clamped boxes. The non-clamped boxes will show dimensions that are further away from the image:  | 
 | 160 | +print("boxes with soft clamping:")  | 
 | 161 | +print(soft_center_crops_and_boxes)  | 
 | 162 | +print()  | 
 | 163 | +print("boxes with no clamping:")  | 
 | 164 | +print(no_clamping_center_crops_and_boxes)  | 
 | 165 | + | 
 | 166 | +# %%  | 
 | 167 | +#  | 
 | 168 | +# Setting the clamping mode  | 
 | 169 | +# --------------------------  | 
 | 170 | +#  | 
 | 171 | +# The ``clamping_mode`` attribute, which determines the clamping strategy that  | 
 | 172 | +# is applied to a box, can be set in different ways:  | 
 | 173 | +#  | 
 | 174 | +# - When constructing the bounding box with its  | 
 | 175 | +#   :class:`~torchvision.tv_tensors.BoundingBoxes` constructor, as done in the example above.  | 
 | 176 | +# - By directly setting the attribute on an existing instance, e.g. ``boxes.clamping_mode = "hard"``.  | 
 | 177 | +# - By calling the :class:`~torchvision.transforms.v2.SetClampingMode` transform.  | 
 | 178 | +#  | 
 | 179 | +# Also, remember that you can always clamp the bounding box manually by  | 
 | 180 | +# calling the :class:`~torchvision.transforms.v2.ClampBoundingBoxes` transform!  | 
 | 181 | +# Here's an example illustrating all of these options:  | 
 | 182 | + | 
 | 183 | +t = v2.Compose([  | 
 | 184 | +    v2.CenterCrop(size=(800,)),  # clamps according to the current clamping_mode  | 
 | 185 | +                                 # attribute, in this case set by the constructor  | 
 | 186 | +    v2.SetClampingMode(None),  # sets the clamping_mode attribute for future transforms  | 
 | 187 | +    v2.Pad(padding=3),  # clamps according to the current clamping_mode  | 
 | 188 | +                        # i.e. ``None``  | 
 | 189 | +    v2.ClampBoundingBoxes(clamping_mode="soft"),  # clamps with "soft" mode.  | 
 | 190 | +])  | 
 | 191 | + | 
 | 192 | +out_img, out_box = t(orig_img, orig_box)  | 
 | 193 | +plot([(orig_img, orig_box), (out_img, out_box)], bbox_width=10)  | 
 | 194 | + | 
 | 195 | +# %%  | 
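
As a side note to the CXCYWHR description and the clamping discussion in this diff, here is a minimal sketch in plain Python, independent of TorchVision, of how a box given as center, width, height and rotation in degrees maps to four corner points, and of what "partially outside of the image" means. The helper names `cxcywhr_to_corners` and `corners_inside`, the two canvas sizes, and the sign convention of the angle are assumptions made for illustration; they are not TorchVision's API, and the library's own conversion should be treated as the reference.

```python
import math


def cxcywhr_to_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumption (hypothetical convention for this sketch): a positive angle
    rotates the box counter-clockwise on screen, in image coordinates where
    the y axis points down. Flip the sign of ``sin_t`` for the opposite one.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    # Offsets of the corners from the center before rotation:
    # top-left, top-right, bottom-right, bottom-left.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_t + dy * sin_t
        y = cy - dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


def corners_inside(corners, canvas_height, canvas_width):
    """True if every corner lies within the canvas rectangle."""
    return all(
        0 <= x <= canvas_width and 0 <= y <= canvas_height
        for x, y in corners
    )


# The box values used in the example: center (860, 1100), size 570 x 1840,
# rotated by -7 degrees. The canvas sizes below are made up for illustration.
box_corners = cxcywhr_to_corners(860.0, 1100, 570, 1840, -7)
for x, y in box_corners:
    print(f"({x:.1f}, {y:.1f})")

print(corners_inside(box_corners, canvas_height=2100, canvas_width=1300))
print(corners_inside(box_corners, canvas_height=2000, canvas_width=1300))
```

A check like `corners_inside` only detects that a box sticks out of the canvas; how the box is then adjusted is what the ``clamping_mode`` values described in the diff decide.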