Results of Nemotron Embed VL 1B on Vidore V3.v2 tasks (page image + text) #432

Open
gabrielspmoreira wants to merge 4 commits into embeddings-benchmark:main from gabrielspmoreira:nemotron_embed_vl_1b

gabrielspmoreira commented Mar 6, 2026

This PR adds results of the llama-nemotron-embed-vl-1b-v2 embedding model on the MTEB Vidore V3.1 tasks. These 3.1 tasks provide both the page image and its OCR'ed text to the model.

Checklist

  • My model has a model sheet, report, or similar
  • My model has a reference implementation in mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    • No, but there is an existing PR
  • The results submitted are obtained using the reference implementation
  • My model is available, either as a publicly accessible API or publicly on e.g., Huggingface
  • I solemnly swear that for all results submitted I have not trained on the evaluation dataset including training splits. If I have, I have disclosed it clearly.

github-actions bot commented Mar 6, 2026

Model Results Comparison

Reference models: intfloat/multilingual-e5-large, google/gemini-embedding-001
New models evaluated: nvidia/llama-nemotron-embed-vl-1b-v2
Tasks: Vidore3ComputerScienceRetrieval.v2, Vidore3EnergyRetrieval.v2, Vidore3FinanceEnRetrieval.v2, Vidore3FinanceFrRetrieval.v2, Vidore3HrRetrieval.v2, Vidore3IndustrialRetrieval.v2, Vidore3PharmaceuticalsRetrieval.v2, Vidore3PhysicsRetrieval.v2

Results for nvidia/llama-nemotron-embed-vl-1b-v2

| task_name | nvidia/llama-nemotron-embed-vl-1b-v2 | Max result | Model with max result | In Training Data |
|---|---|---|---|---|
| Vidore3ComputerScienceRetrieval.v2 | 0.6979 | | | False |
| Vidore3EnergyRetrieval.v2 | 0.5867 | | | False |
| Vidore3FinanceEnRetrieval.v2 | 0.5621 | | | False |
| Vidore3FinanceFrRetrieval.v2 | 0.3731 | | | False |
| Vidore3HrRetrieval.v2 | 0.5283 | | | False |
| Vidore3IndustrialRetrieval.v2 | 0.3792 | | | False |
| Vidore3PharmaceuticalsRetrieval.v2 | 0.6082 | | | False |
| Vidore3PhysicsRetrieval.v2 | 0.4578 | | | False |
| Average | 0.5242 | nan | - | |
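
As a quick sanity check, the Average row can be reproduced from the eight per-task scores in the table above. This is a minimal sketch, assuming the average is an unweighted mean of the reported scores; the bot's actual aggregation code is not shown in this PR, and the `scores` dictionary is only an illustrative container for the values copied from the table.

```python
# Per-task scores for nvidia/llama-nemotron-embed-vl-1b-v2, copied from the table above.
scores = {
    "Vidore3ComputerScienceRetrieval.v2": 0.6979,
    "Vidore3EnergyRetrieval.v2": 0.5867,
    "Vidore3FinanceEnRetrieval.v2": 0.5621,
    "Vidore3FinanceFrRetrieval.v2": 0.3731,
    "Vidore3HrRetrieval.v2": 0.5283,
    "Vidore3IndustrialRetrieval.v2": 0.3792,
    "Vidore3PharmaceuticalsRetrieval.v2": 0.6082,
    "Vidore3PhysicsRetrieval.v2": 0.4578,
}

# Assumption: the reported average is a plain (unweighted) mean over tasks.
average = sum(scores.values()) / len(scores)

print(f"Average over {len(scores)} tasks: {average:.4f}")
```

Rounded to four decimals, the mean agrees with the 0.5242 reported in the Average row.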


Note: Content truncated due to GitHub API limits. See the full report in the workflow artifacts.

KennethEnevoldsen left a comment

Results look good. The test fails because the model has not been merged yet.

…age_modality=True, use_text_modality=True) and experiment results with text-only (use_image_modality=False)