Skip to content

Commit a9d1e7d

Browse files
authored
Add support for DINOv3 (#1390)
1 parent 8d6c400 commit a9d1e7d

File tree

5 files changed

+23
-0
lines changed

5 files changed

+23
-0
lines changed

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -318,6 +318,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
318318
1. **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://huggingface.co/papers/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
319319
1. **[DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2)** (from Meta AI) released with the paper [DINOv2: Learning Robust Visual Features without Supervision](https://huggingface.co/papers/2304.07193) by Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski.
320320
1. **[DINOv2 with Registers](https://huggingface.co/docs/transformers/model_doc/dinov2_with_registers)** (from Meta AI) released with the paper [DINOv2 with Registers](https://huggingface.co/papers/2309.16588) by Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski.
321+
1. **[DINOv3](https://huggingface.co/docs/transformers/model_doc/dinov3)** (from Meta AI) released with the paper [DINOv3](https://huggingface.co/papers/2508.10104) by Oriane Siméoni, Huy V. Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timothée Darcet, Théo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie, Julien Mairal, Hervé Jégou, Patrick Labatut, Piotr Bojanowski.
321322
1. **[DistilBERT](https://huggingface.co/docs/transformers/model_doc/distilbert)** (from HuggingFace), released together with the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://huggingface.co/papers/1910.01108) by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into [DistilGPT2](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), RoBERTa into [DistilRoBERTa](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), Multilingual BERT into [DistilmBERT](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) and a German version of DistilBERT.
322323
1. **[DiT](https://huggingface.co/docs/transformers/model_doc/dit)** (from Microsoft Research) released with the paper [DiT: Self-supervised Pre-training for Document Image Transformer](https://huggingface.co/papers/2203.02378) by Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei.
323324
1. **[Donut](https://huggingface.co/docs/transformers/model_doc/donut)** (from NAVER), released together with the paper [OCR-free Document Understanding Transformer](https://huggingface.co/papers/2111.15664) by Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park.

docs/snippets/6_supported-models.snippet

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@
3232
1. **[DETR](https://huggingface.co/docs/transformers/model_doc/detr)** (from Facebook) released with the paper [End-to-End Object Detection with Transformers](https://huggingface.co/papers/2005.12872) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
3333
1. **[DINOv2](https://huggingface.co/docs/transformers/model_doc/dinov2)** (from Meta AI) released with the paper [DINOv2: Learning Robust Visual Features without Supervision](https://huggingface.co/papers/2304.07193) by Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski.
3434
1. **[DINOv2 with Registers](https://huggingface.co/docs/transformers/model_doc/dinov2_with_registers)** (from Meta AI) released with the paper [DINOv2 with Registers](https://huggingface.co/papers/2309.16588) by Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski.
35+
1. **[DINOv3](https://huggingface.co/docs/transformers/model_doc/dinov3)** (from Meta AI) released with the paper [DINOv3](https://huggingface.co/papers/2508.10104) by Oriane Siméoni, Huy V. Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timothée Darcet, Théo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie, Julien Mairal, Hervé Jégou, Patrick Labatut, Piotr Bojanowski.
3536
1. **[DistilBERT](https://huggingface.co/docs/transformers/model_doc/distilbert)** (from HuggingFace), released together with the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://huggingface.co/papers/1910.01108) by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into [DistilGPT2](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), RoBERTa into [DistilRoBERTa](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation), Multilingual BERT into [DistilmBERT](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) and a German version of DistilBERT.
3637
1. **[DiT](https://huggingface.co/docs/transformers/model_doc/dit)** (from Microsoft Research) released with the paper [DiT: Self-supervised Pre-training for Document Image Transformer](https://huggingface.co/papers/2203.02378) by Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei.
3738
1. **[Donut](https://huggingface.co/docs/transformers/model_doc/donut)** (from NAVER), released together with the paper [OCR-free Document Understanding Transformer](https://huggingface.co/papers/2111.15664) by Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park.

src/models.js

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5867,6 +5867,18 @@ export class Dinov2WithRegistersForImageClassification extends Dinov2WithRegiste
58675867
return new SequenceClassifierOutput(await super._call(model_inputs));
58685868
}
58695869
}
5870+
//////////////////////////////////////////////////
5871+
5872+
//////////////////////////////////////////////////
5873+
export class DINOv3ViTPreTrainedModel extends PreTrainedModel { }
5874+
export class DINOv3ViTModel extends DINOv3ViTPreTrainedModel { }
5875+
//////////////////////////////////////////////////
5876+
5877+
//////////////////////////////////////////////////
5878+
export class DINOv3ConvNextPreTrainedModel extends PreTrainedModel { }
5879+
export class DINOv3ConvNextModel extends DINOv3ConvNextPreTrainedModel { }
5880+
//////////////////////////////////////////////////
5881+
58705882
//////////////////////////////////////////////////
58715883
export class GroundingDinoPreTrainedModel extends PreTrainedModel { }
58725884
export class GroundingDinoForObjectDetection extends GroundingDinoPreTrainedModel { }
@@ -7772,6 +7784,8 @@ const MODEL_MAPPING_NAMES_ENCODER_ONLY = new Map([
77727784
['convnextv2', ['ConvNextV2Model', ConvNextV2Model]],
77737785
['dinov2', ['Dinov2Model', Dinov2Model]],
77747786
['dinov2_with_registers', ['Dinov2WithRegistersModel', Dinov2WithRegistersModel]],
7787+
['dinov3_vit', ['DINOv3ViTModel', DINOv3ViTModel]],
7788+
['dinov3_convnext', ['DINOv3ConvNextModel', DINOv3ConvNextModel]],
77757789
['resnet', ['ResNetModel', ResNetModel]],
77767790
['swin', ['SwinModel', SwinModel]],
77777791
['swin2sr', ['Swin2SRModel', Swin2SRModel]],
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
2+
import {
3+
ImageProcessor,
4+
} from "../../base/image_processors_utils.js";
5+
6+
export class DINOv3ViTImageProcessor extends ImageProcessor { }

src/models/image_processors.js

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ export * from './clip/image_processing_clip.js'
66
export * from './convnext/image_processing_convnext.js'
77
export * from './deit/image_processing_deit.js'
88
export * from './detr/image_processing_detr.js'
9+
export * from './dinov3_vit/image_processing_dinov3_vit.js'
910
export * from './donut/image_processing_donut.js'
1011
export * from './dpt/image_processing_dpt.js'
1112
export * from './efficientnet/image_processing_efficientnet.js'

0 commit comments

Comments
 (0)