RFC: Support for Intel XPU Devices in torchvision
Summary
This RFC proposes modifications to torchvision to support Intel XPU devices. The primary focus is on updating the test suite to accommodate XPU. Since existing implementations of torchvision ops exist in torch-xpu-ops, some XPU support already exists in torchvision. This proposal is to expand torchvision so XPU devices have feature parity with CUDA.
Detailed Proposal
- 0. Implement torchvision ops for XPU
As mentioned above, this has already been done in torch-xpu-ops (see #1290), so XPU builds currently support nms, deform_conv2d, roi_align, roi_pool, ps_roi_align, ps_roi_pool, and the corresponding backward ops. Autocasting and autograd support require minor changes in torchvision.
- 1. Modify Test Suite
Where existing code tests CUDA devices, we would add equivalent XPU tests where supported.
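One low-duplication way to do this is to parametrize tests over whichever devices are present at runtime instead of hard-coding "cuda". A sketch (the helper name available_devices is hypothetical, not an existing torchvision test utility):

```python
import pytest
import torch


def available_devices():
    """Hypothetical helper: list the test devices present at runtime."""
    devices = ["cpu"]
    if torch.cuda.is_available():
        devices.append("cuda")
    # torch.xpu exists only in XPU-enabled PyTorch builds.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        devices.append("xpu")
    return devices


@pytest.mark.parametrize("device", available_devices())
def test_hflip_roundtrip(device):
    # Flipping twice along the last dim must reproduce the input on any device.
    img = torch.arange(12, dtype=torch.float32, device=device).reshape(1, 3, 4)
    flipped = torch.flip(img, dims=[-1])
    assert flipped.device.type == device
    assert torch.equal(torch.flip(flipped, dims=[-1]), img)
```

With this pattern, adding a backend means extending one helper rather than touching every test, which is why a shared device list is preferable to separate XPU test files (see the alternatives below).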
The primary change involves updating the test suite to support XPU devices. This includes tests for operators (test/test_ops.py), transforms (test/test_transforms_tensor.py, test/test_transforms_v2.py), and models.
- 2. Video/Image Features
Based on the H1 2025 TorchCodec Roadmap, we do not plan to support image/video encoding/decoding within torchvision, and would instead add XPU functionality to TorchCodec.
- 3. Update Documentation, Scripts, and Benchmarks
Efforts will focus on supporting XPU in benchmarking scripts, with a few minor updates to documentation to specify XPU availability. We will not focus on updates to gallery and references scripts in most cases.
- 4. Continuous Integration
Update the CI configuration to include tests for XPU devices, ensuring that changes are validated across both CUDA and XPU environments. Based on feedback, XPU tests could be part of a specific workflow label (like ciflow/xpu in pytorch/pytorch).
Alternatives and Open Questions
- Separate Test Files - Instead of modifying existing tests, create separate test files specifically for XPU. This approach, however, may lead to code duplication and maintenance challenges.
- Should operators be written in Triton? - As proposed in #8746, ops could be implemented in Triton, providing a single common codebase for GPUs from different vendors and reducing duplication. This would require additional engineering effort, but the scope is limited, with only six CUDA/SYCL ops currently in torchvision.
- Should basic XPU image/video encoding/decoding be provided in Torchvision? - To provide API consistency, encoding/decoding could be supported in torchvision. This may depend on if/when functionality will be deprecated/removed and moved to torchcodec.