Commit b9237a2
Add DINOv2 to Docs with Example (#1849)
* add docs
* update lightning examples
* update pytorch examples
1 parent 92e53dc commit b9237a2

File tree

8 files changed, +2199 −0 lines changed

docs/source/examples/dinov2.rst

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
.. _dinov2:

DINOv2
======

DINOv2 (DIstillation with NO labels v2) [0]_ is an advanced self-supervised learning framework developed by Meta AI for learning robust visual representations without labeled data. Extending the original DINO [1]_ approach, DINOv2 trains a student network to match the outputs of a momentum-averaged teacher network. By applying self-distillation objectives at both the image and the patch level, it strengthens both global and local feature learning. Combined with various other innovations in the training recipe and an efficient training implementation, DINOv2 achieves state-of-the-art performance across various computer vision tasks, including classification, segmentation, and depth estimation, without requiring task-specific fine-tuning.
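The teacher network is a momentum (exponential moving average, EMA) copy of the student. A minimal framework-free sketch, using plain Python lists of floats in place of model tensors (the function name and the 0.996 default are illustrative, not lightly's API):

```python
# Minimal sketch of the momentum (EMA) teacher update used in
# self-distillation methods such as DINO/DINOv2. Real implementations
# update torch parameter tensors in-place; plain floats are used here
# purely for illustration.

def update_teacher(student_params, teacher_params, momentum=0.996):
    """Return new teacher parameters as an EMA of the student's."""
    return [
        momentum * t + (1.0 - momentum) * s
        for s, t in zip(student_params, teacher_params)
    ]

student = [1.0, 2.0]
teacher = [0.0, 0.0]
teacher = update_teacher(student, teacher, momentum=0.9)
# teacher is now approximately [0.1, 0.2]
```

With a momentum close to 1, the teacher changes slowly, which stabilizes the targets the student distills from.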

Key Components
--------------

- **Multi-level Objectives**: DINOv2 employs the DINO loss for the image-level objective and the iBOT [2]_ loss for the patch-level objective. This multi-level approach enhances both global and local feature representations, significantly improving performance on dense prediction tasks such as segmentation and depth estimation.
- **KoLeo Regularizer**: DINOv2 introduces the KoLeo regularizer [3]_, which encourages features within a batch to spread uniformly, significantly improving nearest-neighbor retrieval without hurting performance on dense downstream tasks.

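The KoLeo regularizer penalizes features that clump together: it maximizes the log-distance from each (L2-normalized) feature to its nearest neighbour in the batch. A pure-Python sketch of the loss, for illustration only (the function name is hypothetical; production code uses batched torch operations):

```python
import math

def koleo_loss(features, eps=1e-8):
    """Illustrative KoLeo loss: -mean(log(nearest-neighbour distance))
    over L2-normalized feature vectors (lists of floats)."""
    # L2-normalize each feature vector
    normed = []
    for v in features:
        norm = math.sqrt(sum(x * x for x in v)) + eps
        normed.append([x / norm for x in v])
    # Accumulate -log of each vector's distance to its nearest neighbour
    n = len(normed)
    loss = 0.0
    for i in range(n):
        d_min = min(
            math.dist(normed[i], normed[j]) for j in range(n) if j != i
        )
        loss += -math.log(d_min + eps)
    return loss / n

# Well-spread features yield a lower loss than near-duplicates:
spread = koleo_loss([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
clustered = koleo_loss([[1.0, 0.0], [1.0, 0.01], [1.0, 0.02]])
# spread < clustered
```

Minimizing this term pushes each feature away from its closest neighbour, spreading the batch over the unit sphere.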
Good to Know
------------

- **SOTA out-of-the-box**: DINOv2 is currently state-of-the-art (SOTA) among self-supervised learning (SSL) methods in computer vision, outperforming existing frameworks on a wide range of benchmarks.
- **Relation to other SSL methods**: DINOv2 can be seen as a combination of the DINO and iBOT losses with the centering of SwAV [4]_.

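The SwAV-style centering balances the teacher's prototype assignments via Sinkhorn-Knopp iteration, alternately normalizing the columns (prototypes) and rows (samples) of the score matrix. A simplified pure-Python sketch under stated assumptions (real implementations operate on exponentiated logits as torch tensors and handle distributed reductions):

```python
# Simplified Sinkhorn-Knopp normalization: alternate column and row
# normalization so assignments spread evenly over prototypes while each
# sample's row remains a probability distribution. Illustration only.

def sinkhorn(scores, n_iters=3):
    """scores: list of rows (one per sample) of positive values."""
    n_rows, n_cols = len(scores), len(scores[0])
    q = [row[:] for row in scores]
    for _ in range(n_iters):
        # Normalize columns: give each prototype equal total mass
        col_sums = [sum(q[i][j] for i in range(n_rows)) for j in range(n_cols)]
        q = [[q[i][j] / col_sums[j] for j in range(n_cols)] for i in range(n_rows)]
        # Normalize rows: each sample becomes a distribution over prototypes
        for i in range(n_rows):
            s = sum(q[i])
            q[i] = [x / s for x in q[i]]
    return q
```

Because the last step normalizes rows, each sample's assignment sums to one, while the column normalization discourages all samples collapsing onto a single prototype.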

References:

.. [0] `DINOv2: Learning Robust Visual Features without Supervision, 2023 <https://arxiv.org/abs/2304.07193>`_
.. [1] `Emerging Properties in Self-Supervised Vision Transformers, 2021 <https://arxiv.org/abs/2104.14294>`_
.. [2] `iBOT: Image BERT Pre-Training with Online Tokenizer, 2021 <https://arxiv.org/abs/2111.07832>`_
.. [3] `Spreading vectors for similarity search, 2018 <https://arxiv.org/abs/1806.03198>`_
.. [4] `Unsupervised Learning of Visual Features by Contrasting Cluster Assignments, 2020 <https://arxiv.org/abs/2006.09882>`_

.. tabs::

   .. tab:: PyTorch

      .. image:: https://img.shields.io/badge/Open%20in%20Colab-blue?logo=googlecolab&label=%20&labelColor=5c5c5c
         :target: https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/dinov2.ipynb

      This example can be run from the command line with::

         python lightly/examples/pytorch/dinov2.py

      .. literalinclude:: ../../../examples/pytorch/dinov2.py

   .. tab:: Lightning

      .. image:: https://img.shields.io/badge/Open%20in%20Colab-blue?logo=googlecolab&label=%20&labelColor=5c5c5c
         :target: https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch_lightning/dinov2.ipynb

      This example can be run from the command line with::

         python lightly/examples/pytorch_lightning/dinov2.py

      .. literalinclude:: ../../../examples/pytorch_lightning/dinov2.py

   .. tab:: Lightning Distributed

      .. image:: https://img.shields.io/badge/Open%20in%20Colab-blue?logo=googlecolab&label=%20&labelColor=5c5c5c
         :target: https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch_lightning_distributed/dinov2.ipynb

      This example runs on multiple GPUs using Distributed Data Parallel (DDP)
      training with PyTorch Lightning. At least one GPU must be available on
      the system. The example can be run from the command line with::

         python lightly/examples/pytorch_lightning_distributed/dinov2.py

      The model differs from the non-distributed implementation in the
      following ways:

      - Distributed Data Parallel is enabled
      - Synchronized Batch Norm is used in place of standard Batch Norm
      - Distributed Sampling is used in the dataloader

      Note that Synchronized Batch Norm is optional and the model can also be
      trained without it. Without Synchronized Batch Norm, the batch norm
      statistics for each GPU are computed only from the features on that
      specific GPU. Distributed Sampling ensures that each distributed
      process sees only a subset of the data.

      .. literalinclude:: ../../../examples/pytorch_lightning_distributed/dinov2.py
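The distributed sampling mentioned above can be illustrated with a tiny sketch of how a sampler shards indices across processes. This mirrors the round-robin strategy of ``torch.utils.data.distributed.DistributedSampler`` but is not its actual implementation:

```python
# Illustrative sketch of distributed sampling: each of `world_size`
# processes takes every world_size-th index, starting at its rank, so
# every sample is seen by exactly one process per epoch.

def shard_indices(num_samples, world_size, rank):
    return list(range(rank, num_samples, world_size))

# With 8 samples and 2 processes, the two ranks see disjoint halves:
print(shard_indices(8, 2, 0))  # [0, 2, 4, 6]
print(shard_indices(8, 2, 1))  # [1, 3, 5, 7]
```

Together the shards cover the full dataset with no overlap, which is what keeps gradient contributions from the different DDP processes independent.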

docs/source/examples/models.rst

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ for PyTorch and PyTorch Lightning to give you a headstart when implementing your
    dcl.rst
    densecl.rst
    dino.rst
+   dinov2.rst
    fastsiam.rst
    mae.rst
    mmcr.rst