- Image or Video
The script performs monocular depth estimation on the input media.
- Depth image
Estimated relative depth, rendered with the inferno colormap (without the -g option)
or as a single-channel grayscale image (with the -g option).
The result is saved to ./output.png by default; a different path can be specified with the -s option.
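Because the estimated depth is relative, its values have no fixed unit or range, so they need to be rescaled before they can be written as an 8-bit image. The following is a minimal sketch of one way to do that, assuming simple min-max scaling. The function name `depth_to_gray` is hypothetical and is not taken from the script. The inferno colormap step is omitted here.

```python
def depth_to_gray(depth):
    """Min-max normalize a 2D grid of relative depth values to integers in 0..255."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no relative information; render it as all zeros.
        return [[0 for _ in row] for row in depth]
    return [[round((v - lo) / span * 255) for v in row] for row in depth]


# Toy 2x2 "depth map" with arbitrary relative values (illustrative data only).
depth = [[0.0, 1.0], [2.0, 4.0]]
gray = depth_to_gray(depth)
print(gray)
```

The smallest depth value maps to 0 and the largest to 255, regardless of the original scale.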
Internet connection is required when running the script for the first time, as it will download the necessary model files.
Running this script estimates the relative depth of the input image or video. The result is shown in a separate window (for image and video input) or saved as an image (for image input).
$ python3 depth_anything_v2.py
The result will be saved to output.png by default.
$ python3 depth_anything_v2.py -i input.png -s output.png -ec vitl
The -i, -s, and -ec options specify the input path, save path, and encoder type, respectively.
Available encoder types: vits, vitb, vitl
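The options described above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the script's actual parser: only the short flags (-i, -s, -ec, -v, -g), the default save path, and the three encoder names come from the text, while the `dest` names, help strings, and `None` defaults are illustrative.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented short flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Depth estimation CLI sketch")
    parser.add_argument("-i", dest="input", default=None, help="input image path")
    parser.add_argument("-s", dest="savepath", default="output.png", help="save path")
    # The script's real default encoder is not stated in the text, so None is used here.
    parser.add_argument("-ec", dest="encoder", default=None,
                        choices=["vits", "vitb", "vitl"], help="encoder type")
    parser.add_argument("-v", dest="video", default=None, help="webcam id or video path")
    parser.add_argument("-g", dest="gray", action="store_true", help="grayscale output")
    return parser


args = build_parser().parse_args(["-i", "input.png", "-s", "output.png", "-ec", "vitl"])
print(args.input, args.savepath, args.encoder)
```

Restricting -ec with `choices` makes an unsupported encoder name fail at parse time rather than later during model loading.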
$ python3 depth_anything_v2.py -v 0
The argument after the -v option can be either the device ID of a webcam
or the path to an input video.
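One way to tell a webcam ID apart from a video path is to treat an all-digit argument as a device ID and anything else as a path. The helper below is a hypothetical sketch of that rule; the name `parse_video_source` is not from the script, and the script may distinguish the two cases differently.

```python
def parse_video_source(value):
    """Return an int device ID for all-digit strings, otherwise the string unchanged."""
    if value.isdecimal():
        return int(value)
    return value


print(parse_video_source("0"))
print(parse_video_source("clip.mp4"))
```

The returned value could then be handed to a capture API such as OpenCV's `cv2.VideoCapture`, which accepts either an integer device index or a file name. Note that under this rule a video file whose name consists only of digits would be treated as a device ID.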
This model requires ailia SDK 1.2.16 or later.
PyTorch
ONNX opset=17
- depth_anything_v2_depth_anything_v2_vits.onnx.prototxt
- depth_anything_v2_depth_anything_v2_vitb.onnx.prototxt
- depth_anything_v2_depth_anything_v2_vitl.onnx.prototxt

