@Raghavyadav17 Raghavyadav17 commented Jun 23, 2025

Hi @dkalinowski, @dtrawins,
Please review this PR.

What’s Included

  • convert_clip.py: Converts HuggingFace CLIP vision model to OpenVINO IR (.xml/.bin)
  • pre.py: Preprocessing handler for converting images into model-ready tensors
  • post.py: Postprocessing handler to normalize embeddings
  • config.json: OVMS configuration for model + mediapipe graph
  • grph_pipeline.pbtxt: Mediapipe graph defining the inference pipeline
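
The contents of post.py are not shown in this conversation. As a minimal sketch, assuming "normalize embeddings" means scaling each vector to unit L2 length (the usual choice before cosine-similarity search), the step could look like this. The function name `l2_normalize` and the `eps` guard are illustrative, not taken from the PR.

```python
import math
from typing import List, Sequence


def l2_normalize(embedding: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit Euclidean length.

    eps keeps the division safe when the input is an all-zero vector.
    """
    norm = math.sqrt(sum(x * x for x in embedding))
    scale = 1.0 / max(norm, eps)
    return [x * scale for x in embedding]


# A 3-4-5 triangle makes the result easy to check by hand.
unit = l2_normalize([3.0, 4.0])
```

With unit-length vectors, a plain dot product between two embeddings equals their cosine similarity.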

@mzegla mzegla added the GSoC Contributions that are part of Google Summer of Code projects label Jun 25, 2025
@@ -396,8 +396,8 @@ RUN apt-get update ; \
curl -L -O https://github.com/intel/linux-npu-driver/releases/download/v1.16.0/intel-driver-compiler-npu_1.16.0.20250328-14132024782_ubuntu24.04_amd64.deb ; \
curl -L -O https://github.com/intel/linux-npu-driver/releases/download/v1.16.0/intel-fw-npu_1.16.0.20250328-14132024782_ubuntu24.04_amd64.deb ; \
curl -L -O https://github.com/intel/linux-npu-driver/releases/download/v1.16.0/intel-level-zero-npu_1.16.0.20250328-14132024782_ubuntu24.04_amd64.deb ; \
curl -L -O https://github.com/oneapi-src/level-zero/releases/download/v1.20.2/level-zero_1.20.2+u24.04_amd64.deb ; \
fi ; \
curl -L -O https://github.com/oneapi-src/level-zero/releases/download/v1.20.2/level-zero_1.20.2+u24.04_amd64.deb ; \
Collaborator

Please don't edit Dockerfile.ubuntu in this pull request (it causes conflicts and really isn't required).

end_time = datetime.datetime.now()
duration = (end_time - start_time).total_seconds() * 1000
processing_times.append(int(duration))
print(f"Detection:\n{results.as_numpy('embedding')}\n")
Collaborator

Suggested change
- print(f"Detection:\n{results.as_numpy('embedding')}\n")
+ print(f"Embeddings:\n{results.as_numpy('embedding')}\n")


processing_times = []
start_time = datetime.datetime.now()
results = client.infer("python_model", [image_input])
Collaborator

Suggested change
- results = client.infer("python_model", [image_input])
+ results = client.infer(args['model'], [image_input])

Please add a --model CLI parameter. With it, users won't need to modify the client code; they can simply select a model with --model dino_graph, for example. You can keep a default for ease of use. I'm sure the current python_model will not work, since you serve:

  • clip_graph
  • dino_graph
  • laion_graph
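
The request above can be sketched with the standard `argparse` module. This is an illustration, not the code that was eventually pushed: the helper name `parse_args`, the choice of `clip_graph` as the default, and restricting `--model` to the three served graph names are assumptions. Returning a dict via `vars()` matches the `args['model']` access in the suggested change.

```python
import argparse

# Graph names listed in the review comment above.
SERVED_GRAPHS = ("clip_graph", "dino_graph", "laion_graph")


def parse_args(argv=None) -> dict:
    """Parse client options; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Image embeddings client")
    parser.add_argument(
        "--model",
        choices=SERVED_GRAPHS,
        default="clip_graph",  # assumed default, chosen only for illustration
        help="Name of the served graph to query",
    )
    # vars() turns the Namespace into a dict so args['model'] works.
    return vars(parser.parse_args(argv))


# In the client script this would be used roughly as:
#   args = parse_args()
#   results = client.infer(args["model"], [image_input])
```

Using `choices` makes a typo in the graph name fail at the command line rather than as a server-side error.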

Author

@dkalinowski Sure, I will do this.

def initialize(self, kwargs: dict):
    self.node_name = kwargs.get("node_name", "")
    if "clip" in self.node_name.lower():
        self.mode = "clip"
Collaborator

is self.mode really needed?


if "clip" in self.node_name.lower():
    self.processor=CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    self.mode="clip"
Collaborator

is self.mode really needed?
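
Whether `self.mode` is read anywhere else is not visible in this conversation. If it only mirrors the branch taken in `initialize`, one possible simplification is a lookup table keyed by the substring matched in the node name, so no separate mode attribute has to be stored. The sketch below is hypothetical: `PROCESSOR_IDS` and `resolve_processor_id` are illustrative names, and only the model IDs come from the code under review.

```python
# Substring expected in the node name -> Hugging Face model ID (IDs from the PR).
PROCESSOR_IDS = {
    "clip": "openai/clip-vit-base-patch32",
    "dino": "facebook/dinov2-base",
    "laion": "laion/CLIP-ViT-B-32-laion2B-s34B-b79K",
}


def resolve_processor_id(node_name: str) -> str:
    """Return the model ID whose key appears in node_name (case-insensitive).

    Keys are checked in insertion order, which matches the if/elif order
    of the original handler.
    """
    lowered = node_name.lower()
    for key, model_id in PROCESSOR_IDS.items():
        if key in lowered:
            return model_id
    raise ValueError(f"Unsupported node name: {node_name!r}")
```

Raising on an unknown node name also surfaces a misconfigured graph at initialization time instead of leaving the processor unset.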

@dkalinowski dkalinowski requested a review from Copilot August 29, 2025 12:12
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Adds a complete image embeddings demo system for OpenVINO Model Server (OVMS) with support for multiple AI models (CLIP, DINO, LAION) and vector similarity search capabilities.

  • Implementation of MediaPipe graph pipelines for CLIP, DINO, and LAION image embedding models
  • Python preprocessing and postprocessing handlers for image transformation and embedding normalization
  • Command-line tools for database building and image search functionality
  • Interactive Streamlit web application for visual similarity search
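
The search code itself is not shown in this conversation. As a rough sketch of what vector similarity search over image embeddings involves, the example below ranks stored vectors by cosine similarity to a query vector. The function names, the in-memory dict used as a "database", and the file names are all illustrative assumptions; the PR may use a different metric or a dedicated vector store.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k entries most similar to the query, best match first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy two-dimensional "embeddings" keyed by hypothetical file names.
toy_db = {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0], "c.jpg": [1.0, 1.0]}
matches = top_k([1.0, 0.0], toy_db, k=2)
```

A linear scan like this is fine for a demo-sized image set; larger collections usually call for an approximate nearest-neighbour index.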

Reviewed Changes

Copilot reviewed 14 out of 122 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| streamlit_app.py | Complete Streamlit web interface for image similarity search with model selection |
| pre.py | Image preprocessing handler supporting multiple model types via transformers |
| post.py | Embedding normalization postprocessing handler |
| graph_*.pbtxt | MediaPipe pipeline configurations for CLIP, DINO, and LAION models |
| config_model.json | OVMS configuration defining model paths and graph endpoints |
| search_images.py | Command-line image search utility with performance metrics |
| requirements.txt | Python dependencies with version constraints |
| model_conversion/*.py | Scripts for converting HuggingFace models to OpenVINO format |
| grpc_cli.py | Database building tool with interactive model selection |

@@ -0,0 +1,33 @@
from pyovms import Tensor
from transformers import CLIPProcessor,AutoImageProcessor
Copilot AI Aug 29, 2025

Missing space after comma in import statement. Should be CLIPProcessor, AutoImageProcessor.

Suggested change
- from transformers import CLIPProcessor,AutoImageProcessor
+ from transformers import CLIPProcessor, AutoImageProcessor


class OvmsPythonModel:
    def initialize(self, kwargs:dict):
        self.node_name=kwargs.get("node_name","")
Copilot AI Aug 29, 2025

Missing spaces around assignment operator and after comma. Should be self.node_name = kwargs.get("node_name", "").

Suggested change
- self.node_name=kwargs.get("node_name","")
+ self.node_name = kwargs.get("node_name", "")

Comment on lines +10 to +22
self.node_name=kwargs.get("node_name","")

if "clip" in self.node_name.lower():
    self.processor=CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    self.mode="clip"

elif "dino" in self.node_name.lower():
    self.processor=AutoImageProcessor.from_pretrained("facebook/dinov2-base")
    self.mode="dino"

elif "laion" in self.node_name.lower():
    self.processor=CLIPProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
    self.mode="laion"
Copilot AI Aug 29, 2025

Missing spaces around assignment operators throughout the method. Should be self.processor = ... and self.mode = ....

Suggested change
- self.node_name=kwargs.get("node_name","")
- if "clip" in self.node_name.lower():
-     self.processor=CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
-     self.mode="clip"
- elif "dino" in self.node_name.lower():
-     self.processor=AutoImageProcessor.from_pretrained("facebook/dinov2-base")
-     self.mode="dino"
- elif "laion" in self.node_name.lower():
-     self.processor=CLIPProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
-     self.mode="laion"
+ self.node_name = kwargs.get("node_name", "")
+ if "clip" in self.node_name.lower():
+     self.processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
+     self.mode = "clip"
+ elif "dino" in self.node_name.lower():
+     self.processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
+     self.mode = "dino"
+ elif "laion" in self.node_name.lower():
+     self.processor = CLIPProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")
+     self.mode = "laion"

Comment on lines 6 to 21
model_id="facebook/dinov2-base"
print(f"Downloading pretrained model {model_id}...")

model=AutoModel.from_pretrained(model_id)
processor=AutoImageProcessor.from_pretrained(model_id)

image=Image.new("RGB",(224,224))
inputs=processor(images=image,return_tensors="pt")["pixel_values"]

print("Converting models...")
ov_model=ov.convert_model(model,example_input=inputs)
ov.save_model(ov_model,"dino_image_encoder.xml")
print("Model saved!")

mod_path="saved_mod/dino/1"
os.makedirs(mod_path,exist_ok=True)
Copilot AI Aug 29, 2025

Missing spaces around assignment operators and after commas throughout the file. Should follow PEP 8 spacing conventions.

Suggested change
- model_id="facebook/dinov2-base"
- print(f"Downloading pretrained model {model_id}...")
- model=AutoModel.from_pretrained(model_id)
- processor=AutoImageProcessor.from_pretrained(model_id)
- image=Image.new("RGB",(224,224))
- inputs=processor(images=image,return_tensors="pt")["pixel_values"]
- print("Converting models...")
- ov_model=ov.convert_model(model,example_input=inputs)
- ov.save_model(ov_model,"dino_image_encoder.xml")
- print("Model saved!")
- mod_path="saved_mod/dino/1"
- os.makedirs(mod_path,exist_ok=True)
+ model_id = "facebook/dinov2-base"
+ print(f"Downloading pretrained model {model_id}...")
+ model = AutoModel.from_pretrained(model_id)
+ processor = AutoImageProcessor.from_pretrained(model_id)
+ image = Image.new("RGB", (224, 224))
+ inputs = processor(images=image, return_tensors="pt")["pixel_values"]
+ print("Converting models...")
+ ov_model = ov.convert_model(model, example_input=inputs)
+ ov.save_model(ov_model, "dino_image_encoder.xml")
+ print("Model saved!")
+ mod_path = "saved_mod/dino/1"
+ os.makedirs(mod_path, exist_ok=True)
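
The excerpt stops after creating `saved_mod/dino/1`, so the remainder of the script is not visible here. OVMS loads models from a `<model_name>/<version>/` directory, so presumably the saved `.xml` file and its companion `.bin` end up in that folder. A minimal sketch of that step, assuming the two files share a stem as `ov.save_model` produces them; the helper name `place_in_version_dir` is hypothetical and not part of the PR.

```python
import shutil
from pathlib import Path


def place_in_version_dir(
    xml_path: str, repo_root: str, model_name: str, version: int = 1
) -> Path:
    """Move an IR .xml file and its sibling .bin into <repo_root>/<model_name>/<version>/."""
    src_xml = Path(xml_path)
    src_bin = src_xml.with_suffix(".bin")  # weights file saved next to the .xml
    target = Path(repo_root) / model_name / str(version)
    target.mkdir(parents=True, exist_ok=True)
    for src in (src_xml, src_bin):
        shutil.move(str(src), str(target / src.name))
    return target


# Hypothetical usage mirroring the paths in the script above:
#   place_in_version_dir("dino_image_encoder.xml", "saved_mod", "dino", 1)
```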

@Raghavyadav17
Author

@dkalinowski please review the final pushed code

@Raghavyadav17
Author

@dkalinowski both the changes are applied
