Key Updates
- Add torch.compile and cuDNN support for GPUs supported by AMD ROCm7, with 1.5x to 2x performance
- Add CoreML support for Apple Silicon (MPS) devices, with 2x performance
- Improve TensorRT performance and compatibility on NVIDIA GPUs (supports more models, including DA3MONO and DA3METRIC)
- Add XPU PyTorch backend with OpenVINO optimizer for Intel GPUs (better than DirectML; requires 11th Gen or newer iGPUs, or dGPUs)
- Enable automatic inference optimizer options for supported hardware
- Enable GPU OpenCL acceleration for the screen capture process, lowering CPU usage
- Replace the RTMP server backend with the SRT protocol on Windows, giving lower latency on supported RTSP and WebRTC clients
- Correct depth calculation and normalization for DA3 and metric models
- Update the ROCm7 hardware installation script with official ROCm 7.2 (9700/9070/9060/7900/7800/7700/840M/860M/880M/890M/8060s) and TheRock 7.1.2 (almost all RDNA2/3/4 GPUs/APUs)
- Fix Video Depth Anything torch.compile failure and enable TensorRT support for DA3-Small/Base/Large/Giant
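
To illustrate what automatic optimizer selection could look like, here is a minimal sketch. It is not the project's actual implementation: the `Hardware` class and the `pick_optimizer` and `detect_hardware` functions are hypothetical names, and the priority order is an assumption. The hardware-to-optimizer pairing follows the list above; the probing uses standard PyTorch availability checks and falls back to "no accelerator" when PyTorch is not installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hardware:
    """Hypothetical capability flags; not part of the project's API."""
    cuda: bool = False  # NVIDIA CUDA build of PyTorch with a usable GPU
    rocm: bool = False  # AMD ROCm (HIP) build of PyTorch with a usable GPU
    mps: bool = False   # Apple Silicon Metal backend
    xpu: bool = False   # Intel XPU backend


def pick_optimizer(hw: Hardware) -> str:
    """Map hardware flags to the optimizer named in the notes above."""
    if hw.rocm:
        return "torch.compile"
    if hw.cuda:
        return "TensorRT"
    if hw.mps:
        return "CoreML"
    if hw.xpu:
        return "OpenVINO"
    return "none"


def detect_hardware() -> Hardware:
    """Probe PyTorch if it is installed; otherwise report no accelerators."""
    try:
        import torch
    except ImportError:
        return Hardware()
    gpu = torch.cuda.is_available()
    # torch.version.hip is None on non-ROCm builds of PyTorch.
    is_rocm = getattr(torch.version, "hip", None) is not None
    mps_backend = getattr(torch.backends, "mps", None)
    xpu_module = getattr(torch, "xpu", None)  # absent in older PyTorch
    return Hardware(
        cuda=gpu and not is_rocm,
        rocm=gpu and is_rocm,
        mps=bool(mps_backend and mps_backend.is_available()),
        xpu=bool(xpu_module and xpu_module.is_available()),
    )


if __name__ == "__main__":
    print(pick_optimizer(detect_hardware()))
```

Keeping the decision in a pure function (`pick_optimizer`) separate from the probing makes the mapping easy to check without any GPU present.
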
Performance Benchmark
Desktop2Stereo Performance Win/Mac
Alternative Download Link
- Quark NetDrive (Access code: 1vcn)
- Baidu Netdisk (Access code: mr64)