Commit 784bd76

Merge branch 'main' into arm-adjust-hue-fix

2 parents: b6d16e6 + ed55b03

23 files changed: +327 −183 lines

.github/workflows/build-wheels-windows.yml

Lines changed: 1 addition & 0 deletions

@@ -25,6 +25,7 @@ jobs:
       os: windows
       test-infra-repository: pytorch/test-infra
       test-infra-ref: main
+      with-xpu: enable
   build:
     needs: generate-matrix
     strategy:

.github/workflows/update-viablestrict.yml

Lines changed: 0 additions & 24 deletions
This file was deleted.

README.md

Lines changed: 1 addition & 0 deletions

@@ -21,6 +21,7 @@ versions.
 | `torch`            | `torchvision`      | Python              |
 | ------------------ | ------------------ | ------------------- |
 | `main` / `nightly` | `main` / `nightly` | `>=3.9`, `<=3.12`   |
+| `2.5`              | `0.20`             | `>=3.9`, `<=3.12`   |
 | `2.4`              | `0.19`             | `>=3.8`, `<=3.12`   |
 | `2.3`              | `0.18`             | `>=3.8`, `<=3.12`   |
 | `2.2`              | `0.17`             | `>=3.8`, `<=3.11`   |

docs/source/io.rst

Lines changed: 67 additions & 46 deletions

@@ -3,46 +3,98 @@ Decoding / Encoding images and videos
 
 .. currentmodule:: torchvision.io
 
-The :mod:`torchvision.io` package provides functions for performing IO
-operations. They are currently specific to reading and writing images and
-videos.
+The :mod:`torchvision.io` module provides utilities for decoding and encoding
+images and videos.
 
-Images
-------
+Image Decoding
+--------------
 
 Torchvision currently supports decoding JPEG, PNG, WEBP and GIF images. JPEG
 decoding can also be done on CUDA GPUs.
 
-For encoding, JPEG (cpu and CUDA) and PNG are supported.
+The main entry point is the :func:`~torchvision.io.decode_image` function, which
+you can use as an alternative to ``PIL.Image.open()``. It will decode images
+straight into image Tensors, thus saving you the conversion and allowing you to
+run transforms/preproc natively on tensors.
+
+.. code::
+
+    from torchvision.io import decode_image
+
+    img = decode_image("path_to_image", mode="RGB")
+    img.dtype  # torch.uint8
+
+    # Or
+    raw_encoded_bytes = ...  # read encoded bytes from your file system
+    img = decode_image(raw_encoded_bytes, mode="RGB")
+
+
+:func:`~torchvision.io.decode_image` will automatically detect the image format,
+and call the corresponding decoder. You can also use the lower-level
+format-specific decoders which can be more powerful, e.g. if you want to
+encode/decode JPEGs on CUDA.
 
 .. autosummary::
     :toctree: generated/
     :template: function.rst
 
-    read_image
     decode_image
-    encode_jpeg
     decode_jpeg
-    write_jpeg
+    encode_png
     decode_gif
     decode_webp
-    encode_png
-    decode_png
-    write_png
-    read_file
-    write_file
 
 .. autosummary::
     :toctree: generated/
     :template: class.rst
 
     ImageReadMode
 
+Obsolete decoding function:
 
+.. autosummary::
+    :toctree: generated/
+    :template: function.rst
+
+    read_image
+
+Image Encoding
+--------------
+
+For encoding, JPEG (cpu and CUDA) and PNG are supported.
+
+
+.. autosummary::
+    :toctree: generated/
+    :template: function.rst
+
+    encode_jpeg
+    write_jpeg
+    encode_png
+    write_png
+
+IO operations
+-------------
+
+.. autosummary::
+    :toctree: generated/
+    :template: function.rst
+
+    read_file
+    write_file
 
 Video
 -----
 
+.. warning::
+
+    Torchvision supports video decoding through different APIs listed below,
+    some of which are still in BETA stage. In the near future, we intend to
+    centralize PyTorch's video decoding capabilities within the `torchcodec
+    <https://github.com/pytorch/torchcodec>`_ project. We encourage you to try
+    it out and share your feedback, as the torchvision video decoders will
+    eventually be deprecated.
+
 .. autosummary::
     :toctree: generated/
     :template: function.rst
@@ -52,45 +104,14 @@ Video
     write_video
 
 
-Fine-grained video API
-^^^^^^^^^^^^^^^^^^^^^^
+**Fine-grained video API**
 
 In addition to the :mod:`read_video` function, we provide a high-performance
 lower-level API for more fine-grained control compared to the :mod:`read_video` function.
 It does all this whilst fully supporting torchscript.
 
-.. betastatus:: fine-grained video API
-
 .. autosummary::
     :toctree: generated/
     :template: class.rst
 
     VideoReader
-
-
-Example of inspecting a video:
-
-.. code:: python
-
-    import torchvision
-    video_path = "path to a test video"
-    # Constructor allocates memory and a threaded decoder
-    # instance per video. At the moment it takes two arguments:
-    # path to the video file, and a wanted stream.
-    reader = torchvision.io.VideoReader(video_path, "video")
-
-    # The information about the video can be retrieved using the
-    # `get_metadata()` method. It returns a dictionary for every stream, with
-    # duration and other relevant metadata (often frame rate)
-    reader_md = reader.get_metadata()
-
-    # metadata is structured as a dict of dicts with following structure
-    # {"stream_type": {"attribute": [attribute per stream]}}
-    #
-    # following would print out the list of frame rates for every present video stream
-    print(reader_md["video"]["fps"])
-
-    # we explicitly select the stream we would like to operate on. In
-    # the constructor we select a default video stream, but
-    # in practice, we can set whichever stream we would like
-    video.set_current_stream("video:0")
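
For context, a minimal sketch of the decoding API this file now documents,
assuming torchvision >= 0.20 and a CUDA-enabled build; the file path is a
placeholder:

    import torch
    from torchvision.io import decode_image, decode_jpeg, read_file

    # decode_image() detects the format (JPEG/PNG/WEBP/GIF) and returns a
    # uint8 CHW tensor, with no PIL round-trip.
    img = decode_image("frame.jpg", mode="RGB")  # placeholder path
    print(img.dtype, img.shape)  # torch.uint8, (3, H, W)

    # Lower-level, format-specific route: decode raw JPEG bytes on the GPU.
    encoded = read_file("frame.jpg")  # 1-D uint8 tensor of encoded bytes
    if torch.cuda.is_available():
        img_cuda = decode_jpeg(encoded, device="cuda")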

docs/source/models.rst

Lines changed: 8 additions & 8 deletions

@@ -226,10 +226,10 @@ Here is an example of how to use the pre-trained image classification models:
 
 .. code:: python
 
-    from torchvision.io import read_image
+    from torchvision.io import decode_image
     from torchvision.models import resnet50, ResNet50_Weights
 
-    img = read_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")
+    img = decode_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")
 
     # Step 1: Initialize model with the best available weights
     weights = ResNet50_Weights.DEFAULT
@@ -283,10 +283,10 @@ Here is an example of how to use the pre-trained quantized image classification
 
 .. code:: python
 
-    from torchvision.io import read_image
+    from torchvision.io import decode_image
     from torchvision.models.quantization import resnet50, ResNet50_QuantizedWeights
 
-    img = read_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")
+    img = decode_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")
 
     # Step 1: Initialize model with the best available weights
     weights = ResNet50_QuantizedWeights.DEFAULT
@@ -339,11 +339,11 @@ Here is an example of how to use the pre-trained semantic segmentation models:
 
 .. code:: python
 
-    from torchvision.io.image import read_image
+    from torchvision.io.image import decode_image
     from torchvision.models.segmentation import fcn_resnet50, FCN_ResNet50_Weights
     from torchvision.transforms.functional import to_pil_image
 
-    img = read_image("gallery/assets/dog1.jpg")
+    img = decode_image("gallery/assets/dog1.jpg")
 
     # Step 1: Initialize model with the best available weights
     weights = FCN_ResNet50_Weights.DEFAULT
@@ -411,12 +411,12 @@ Here is an example of how to use the pre-trained object detection models:
 .. code:: python
 
 
-    from torchvision.io.image import read_image
+    from torchvision.io.image import decode_image
     from torchvision.models.detection import fasterrcnn_resnet50_fpn_v2, FasterRCNN_ResNet50_FPN_V2_Weights
     from torchvision.utils import draw_bounding_boxes
     from torchvision.transforms.functional import to_pil_image
 
-    img = read_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")
+    img = decode_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")
 
     # Step 1: Initialize model with the best available weights
     weights = FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT
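
The four hunks above only swap `read_image` for `decode_image`; for context,
the full inference flow surrounding the classification snippet looks roughly
like this (a sketch assuming a local checkout that contains the grace_hopper
test asset):

    from torchvision.io import decode_image
    from torchvision.models import resnet50, ResNet50_Weights

    img = decode_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")

    # Step 1: Initialize model with the best available weights
    weights = ResNet50_Weights.DEFAULT
    model = resnet50(weights=weights)
    model.eval()

    # Step 2: Initialize the inference transforms bundled with the weights
    preprocess = weights.transforms()

    # Step 3: Apply the transforms and add a batch dimension
    batch = preprocess(img).unsqueeze(0)

    # Step 4: Run the model and look up the predicted category
    prediction = model(batch).squeeze(0).softmax(0)
    class_id = prediction.argmax().item()
    score = prediction[class_id].item()
    print(f"{weights.meta['categories'][class_id]}: {100 * score:.1f}%")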

gallery/others/plot_repurposing_annotations.py

Lines changed: 5 additions & 5 deletions

@@ -66,12 +66,12 @@ def show(imgs):
 # We will take images and masks from the `PenFudan Dataset <https://www.cis.upenn.edu/~jshi/ped_html/>`_.
 
 
-from torchvision.io import read_image
+from torchvision.io import decode_image
 
 img_path = os.path.join(ASSETS_DIRECTORY, "FudanPed00054.png")
 mask_path = os.path.join(ASSETS_DIRECTORY, "FudanPed00054_mask.png")
-img = read_image(img_path)
-mask = read_image(mask_path)
+img = decode_image(img_path)
+mask = decode_image(mask_path)
 
 
 # %%
@@ -181,8 +181,8 @@ def __getitem__(self, idx):
         img_path = os.path.join(self.root, "PNGImages", self.imgs[idx])
         mask_path = os.path.join(self.root, "PedMasks", self.masks[idx])
 
-        img = read_image(img_path)
-        mask = read_image(mask_path)
+        img = decode_image(img_path)
+        mask = decode_image(mask_path)
 
         img = F.convert_image_dtype(img, dtype=torch.float)
         mask = F.convert_image_dtype(mask, dtype=torch.float)
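
Since this gallery script exists to turn segmentation masks into detection
boxes, here is a condensed sketch of that conversion using the new
`decode_image` call (the mask path is a placeholder for a local copy of the
PennFudan asset):

    import torch
    from torchvision.io import decode_image
    from torchvision.ops import masks_to_boxes

    mask = decode_image("FudanPed00054_mask.png")  # (1, H, W) uint8, placeholder path
    obj_ids = torch.unique(mask)[1:]               # instance ids; id 0 is background
    masks = mask == obj_ids[:, None, None]         # (N, H, W) boolean mask per pedestrian
    boxes = masks_to_boxes(masks)                  # (N, 4) xyxy boxes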

gallery/others/plot_scripted_tensor_transforms.py

Lines changed: 3 additions & 3 deletions

@@ -21,7 +21,7 @@
 import torch.nn as nn
 
 import torchvision.transforms as v1
-from torchvision.io import read_image
+from torchvision.io import decode_image
 
 plt.rcParams["savefig.bbox"] = 'tight'
 torch.manual_seed(1)
@@ -39,8 +39,8 @@
 # :class:`torch.nn.Sequential` instead of
 # :class:`~torchvision.transforms.v2.Compose`:
 
-dog1 = read_image(str(ASSETS_PATH / 'dog1.jpg'))
-dog2 = read_image(str(ASSETS_PATH / 'dog2.jpg'))
+dog1 = decode_image(str(ASSETS_PATH / 'dog1.jpg'))
+dog2 = decode_image(str(ASSETS_PATH / 'dog2.jpg'))
 
 transforms = torch.nn.Sequential(
     v1.RandomCrop(224),
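
The point of this gallery script is that a `torch.nn.Sequential` of v1
transforms can be torchscripted as a whole; a minimal sketch of that with the
updated import (the image path is a placeholder):

    import torch
    import torchvision.transforms as v1
    from torchvision.io import decode_image

    transforms = torch.nn.Sequential(
        v1.RandomCrop(224),
        v1.RandomHorizontalFlip(p=0.3),
    )
    scripted_transforms = torch.jit.script(transforms)  # scripts the whole pipeline

    dog1 = decode_image("dog1.jpg")  # placeholder path; uint8 CHW tensor
    out = scripted_transforms(dog1)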

gallery/others/plot_visualization_utils.py

Lines changed: 5 additions & 5 deletions

@@ -42,11 +42,11 @@ def show(imgs):
 # image of dtype ``uint8`` as input.
 
 from torchvision.utils import make_grid
-from torchvision.io import read_image
+from torchvision.io import decode_image
 from pathlib import Path
 
-dog1_int = read_image(str(Path('../assets') / 'dog1.jpg'))
-dog2_int = read_image(str(Path('../assets') / 'dog2.jpg'))
+dog1_int = decode_image(str(Path('../assets') / 'dog1.jpg'))
+dog2_int = decode_image(str(Path('../assets') / 'dog2.jpg'))
 dog_list = [dog1_int, dog2_int]
 
 grid = make_grid(dog_list)
@@ -362,9 +362,9 @@ def show(imgs):
 #
 
 from torchvision.models.detection import keypointrcnn_resnet50_fpn, KeypointRCNN_ResNet50_FPN_Weights
-from torchvision.io import read_image
+from torchvision.io import decode_image
 
-person_int = read_image(str(Path("../assets") / "person1.jpg"))
+person_int = decode_image(str(Path("../assets") / "person1.jpg"))
 
 weights = KeypointRCNN_ResNet50_FPN_Weights.DEFAULT
 transforms = weights.transforms()
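
For context, the decoded uint8 tensors feed directly into the visualization
utilities; a minimal sketch with hypothetical box coordinates:

    import torch
    from torchvision.io import decode_image
    from torchvision.utils import draw_bounding_boxes

    dog1_int = decode_image("dog1.jpg")  # placeholder path; uint8 CHW as the utils expect
    boxes = torch.tensor([[50, 50, 100, 200], [210, 150, 350, 430]], dtype=torch.float)
    result = draw_bounding_boxes(dog1_int, boxes, colors=["blue", "yellow"], width=5)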

gallery/transforms/plot_transforms_getting_started.py

Lines changed: 2 additions & 2 deletions

@@ -21,14 +21,14 @@
 plt.rcParams["savefig.bbox"] = 'tight'
 
 from torchvision.transforms import v2
-from torchvision.io import read_image
+from torchvision.io import decode_image
 
 torch.manual_seed(1)
 
 # If you're trying to run that on Colab, you can download the assets and the
 # helpers from https://github.com/pytorch/vision/tree/main/gallery/
 from helpers import plot
-img = read_image(str(Path('../assets') / 'astronaut.jpg'))
+img = decode_image(str(Path('../assets') / 'astronaut.jpg'))
 print(f"{type(img) = }, {img.dtype = }, {img.shape = }")
 
 # %%
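
A decoded image drops straight into a v2 pipeline; a minimal sketch under the
same assumptions (placeholder path, torchvision with the v2 API):

    import torch
    from torchvision.io import decode_image
    from torchvision.transforms import v2

    img = decode_image("astronaut.jpg")  # placeholder path; uint8 CHW tensor

    transforms = v2.Compose([
        v2.RandomResizedCrop(size=(224, 224), antialias=True),
        v2.RandomHorizontalFlip(p=0.5),
        v2.ToDtype(torch.float32, scale=True),  # uint8 [0, 255] -> float32 [0.0, 1.0]
    ])
    out = transforms(img)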

packaging/windows/internal/vc_env_helper.bat

Lines changed: 2 additions & 0 deletions

@@ -28,6 +28,8 @@ if "%VSDEVCMD_ARGS%" == "" (
 
 @echo on
 
+if "%CU_VERSION%" == "xpu" call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
+
 set DISTUTILS_USE_SDK=1
 
 set args=%1
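
Presumably this mirrors the `with-xpu: enable` flag added to the Windows wheels
workflow above: when `CU_VERSION` is `xpu`, the helper now sources Intel's
oneAPI `setvars.bat` so the oneAPI toolchain environment is available to the
build.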
