Discussion about: Improve OpenVINO training extensions classification component by introducing DinoV2 architecture with DoRA #29204
Replies: 14 comments 34 replies
-
@kprokofi and @sovrasov, dear mentors, I would like to know the initial step for this project.
-
Hi mentors @sovrasov, @kprokofi - Aakash here, keenly interested in contributing to this project as part of GSoC 2025. I wasn't able to find a good first issue that is either related to this project or unassigned and requiring Python. Could you suggest an initial contribution for me too, or would you like to suggest something new?
-
Hi @sovrasov,
-
@sovrasov, @kprokofi, could you please assign me a task for my initial contribution, as it is a prerequisite for GSoC 2025 with OpenVINO?
-
Hello mentors @sovrasov and @kprokofi, I'm writing to express my interest in contributing to the OpenVINO Training Extensions (OTX) project. I'm a sophomore at IIT (BHU) Varanasi with experience in deep learning, Python, and computer vision architectures. I've worked with transformer models previously and am familiar with fine-tuning approaches.
Based on the project description and your guidance, I understand the goals include:
Some questions that come to mind right now are:
Looking forward to your response.
-
Hello, my name is Gyu Il Lim, and I am currently enrolled in the AI Convergence master's program at Soongsil University in Korea. I'm conducting research on VLM lightweighting and fine-tuning, and I found an interesting project in OpenVINO that I would like to participate in, so I am leaving a comment! I am reaching out regarding the GSoC 2025 project: Improve OpenVINO training extensions classification component by introducing DinoV2 architecture with DoRA. I apologize for the late contact. Having read through this Discussion page, I have summarized the following points.
In conclusion, I think the project will focus on optimizing fine-tuning of a backbone (e.g., DinoV2) with DoRA in PyTorch, and on maintaining performance (FPS, accuracy) after conversion to OpenVINO, while conducting ablation studies to measure performance. I have previous experience fine-tuning a VLM with LoRA for a project, and I am currently researching PEFT methods such as LoRA, QLoRA, and DoRA to improve the performance of lightweight Vision Language Models, so I believe I can contribute to this project. First of all, I will work on the first recommended issue. Since the target datasets are very small (20-1,000 images), tuning only part of the model will likely be more beneficial than full fine-tuning. Lastly, I have a few questions:
Thank you for your time and consideration. I look forward to your response and the possibility of contributing to this exciting project. Best regards!!
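For readers new to DoRA, the weight decomposition behind the "tune only part of the model" idea can be sketched in a few lines. This is a minimal illustration in plain Python, not OTX or PEFT code: the helper names (`matmul`, `column_norms`, `dora_weight`) and the toy 2x2 matrices are assumptions made up for the example. It assumes the usual DoRA formulation, where the frozen weight `W0` plus a low-rank update `B @ A` gives a direction that is normalized column-wise and then rescaled by a trainable magnitude vector `m`.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def column_norms(w):
    """Euclidean norm of every column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def dora_weight(w0, b, a, m):
    """Merged weight: m * (W0 + B @ A) / ||W0 + B @ A||, normalized per column.

    w0 is frozen; only b, a (low-rank factors) and m (magnitudes) would be trained.
    """
    delta = matmul(b, a)
    v = [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))] for i in range(len(w0))]
    norms = column_norms(v)
    return [[m[j] * v[i][j] / norms[j] for j in range(len(v[0]))] for i in range(len(v))]


# Hypothetical toy layer: d = 2 outputs, k = 2 inputs, rank r = 1.
w0 = [[3.0, 0.0], [4.0, 2.0]]   # frozen pretrained weight
a = [[0.5, -0.5]]               # r x k factor, arbitrary values
b_init = [[0.0], [0.0]]         # d x r factor, zero at initialization
m = column_norms(w0)            # magnitudes start at the column norms of W0

merged_init = dora_weight(w0, b_init, a, m)    # equals w0 before any training

b_step = [[1.0], [0.0]]                        # pretend B moved after a training step
merged_step = dora_weight(w0, b_step, a, m)    # direction changed, column norms still m
```

With `B` initialized to zero the merged weight reproduces the pretrained one, and after an update to `B` the column norms of the merged weight are still given by `m`, so the low-rank factors only change direction while `m` controls scale. In this toy case the trainable values are `a`, `b`, and `m` rather than all of `w0`, which is the property that makes the approach attractive for very small datasets.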
-
Hi @gyuilLim, welcome to the discussion! Great summary. More first issues will be posted today, so you can choose among them.
-
A few more good first issues: open-edge-platform/training_extensions#4288
-
Dear @sovrasov and @kprokofi,
-
Dear mentors, I'd like to contribute to this project as part of GSoC 2025.
-
I hope you are doing well. My name is Saad Ather Ali, and I am excited about the Gesture Control with OpenVINO project for Google Summer of Code. With my experience in computer vision and gesture-based interaction, I believe I can contribute effectively to this project.
I have been working on a project called Air Tracker ([GitHub Repository](https://github.com/saadkhi/AIR-TRACKER)), which is a hand gesture control system for project presentations. It enables users to interact with slides, multimedia, and on-screen elements using gesture recognition, eliminating the need for physical remotes or touch-based interfaces. My work on Air Tracker has given me hands-on experience with gesture-based navigation, media control, and custom gesture mapping, which aligns well with the objectives of this project.
Interest in This Project
Potential Contributions & Customization
Efficient Model Optimization: Improve the performance of gesture recognition using OpenVINO optimizations.
Looking forward to your guidance and the opportunity to work on this project!
Best regards,
-
Hey @sovrasov, how much improvement do we need in dino_v2 for classification? Is there a specific accuracy target or percentage increase we should aim for?
-
Hello @sovrasov and @kprokofi, my name is Kunal Tiwari. I am writing to express my interest in contributing to the OpenVINO Training Extensions project, specifically focusing on integrating the DinoV2 architecture with DoRA for image classification tasks. I am a student at the University of Texas at Austin, where I am currently studying Computer Science. I have experience in machine learning, computer vision, and transformer-based architectures! I am currently part of the Living with Robots lab on campus, where my team and I have been developing a SOTA transformer for human motion prediction. We are also writing a research paper presenting our results. Outside of school, I've worked on projects like rebuilding the GPT-2 model from scratch, using CNNs for music genre classification, and deploying full-stack ML apps using PyTorch, TensorFlow, and Hugging Face tools. I'm especially excited about the DinoV2 + DoRA integration project, as it aligns closely with my background in both machine learning theory and hands-on development. I've built full-stack ML applications using vector databases, RAG pipelines, and NLP systems, and I'm eager to contribute to optimizing OTX for small to medium-sized datasets, empowering developers with more efficient and flexible tools. In addition, I've designed and trained custom models for a variety of tasks, including GANs for image generation and VAEs for motion synthesis, and have hands-on experience fine-tuning models to improve performance on specific objectives. Having looked through all the previous discussion and done my own research, I have some clarifying questions I wanted to ask:
Looking forward to learning from the mentors and collaborating with the community! Thank you for your time, |
-
Subject: Interest in OpenVINO GSoC Project: DinoV2 + DoRA for OTX
Dear Vladislav Sovrasov & Kirill Prokofiev,
I am Faizan, a final-year Computer Science student with a strong background in Machine Learning, Python, and AI. I am deeply interested in contributing to OpenVINO's Training Extensions project for GSoC 2025, particularly integrating DinoV2 with DoRA fine-tuning.
To better understand the project, I have:
Explored DinoV2 and its potential for self-supervised learning.
Read about DoRA, which optimizes fine-tuning efficiency.
Reviewed OpenVINO's OTX repository and how models are currently integrated.
I would love to discuss:
Which specific use cases or datasets would be ideal for testing DinoV2 with DoRA?
How OpenVINO handles model fine-tuning (e.g., will DinoV2 need custom preprocessing in OTX)?
What initial contributions I can make before GSoC starts?
Would it be possible to schedule a quick call or discuss this on OpenVINO’s community forum? I want to ensure I fully understand the expectations before preparing my proposal.
Looking forward to your guidance.
Best regards,
Faizan
@adrianboguszewski, could you please connect me with the mentors?