Implement MetaCLIP 2 model#2527

Open
sineeli wants to merge 18 commits into keras-team:master from
sineeli:metaclip_2

Conversation

Collaborator

@sineeli sineeli commented Jan 19, 2026

Description of the change

This update introduces the MetaCLIP 2 model components, including the tokenizer, vision encoder, and image converter, along with necessary conversion utilities.

Key Points:

  • Architecture: MetaCLIP 2 uses the same vanilla CLIP architecture as OpenAI CLIP; there are no architectural changes, and the only component swap is the multilingual tokenizer (see below).
  • Data scale: Training data is expanded from English-only to worldwide data covering 300+ languages.
  • Curation: Language-aware curation using human-curated metadata with language-specific thresholds (t_lang), where each language preserves the same ~6% tail mass as English (t_en = 170k).
  • Seen pairs scaling: Global training exposure is scaled from 13B (1.0×) to 29B (2.3×) seen pairs to prevent English downsampling (English ≈ 44% of data).
  • Model capacity: Increasing capacity from ViT-L/14 to ViT-H/14 yields consistent gains and is necessary to learn worldwide-scale data.
  • Comparison: MetaCLIP2 outperforms prior multilingual CLIP models (mSigLIP, SigLIP2) using open data and 2.3× seen pairs, versus private WebLI data and 3.0× seen pairs.
  • Tokenizer: For tokenizing multilingual data, MetaCLIP 2 uses the XLM-V tokenizer (facebook/xlm-v-base), a multilingual SentencePiece tokenizer with a ~901K-token vocabulary covering 100+ languages.
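The language-aware curation point above can be sketched numerically. The snippet below is a toy illustration of the tail-mass idea only; the counts, the 6% target, and the linear search are simplified placeholders, not the paper's exact algorithm:

```python
# Toy sketch of language-aware curation thresholds (illustration only).
# Given per-metadata-entry match counts for one language, pick the smallest
# threshold t_lang whose "tail" (entries with count <= t) contributes the
# target fraction of total matches (~6% per the MetaCLIP 2 description).

def tail_mass(counts, t):
    """Fraction of all matches contributed by entries with count <= t."""
    total = sum(counts)
    return sum(c for c in counts if c <= t) / total

def find_threshold(counts, target=0.06):
    """Smallest threshold whose tail mass reaches the target fraction."""
    for t in sorted(set(counts)):
        if tail_mass(counts, t) >= target:
            return t
    return max(counts)

# Made-up counts: head entries (100, 200) dominate, so the threshold lands
# on the first value that captures at least 6% tail mass.
counts = [1, 1, 2, 100, 200]
t_lang = find_threshold(counts)
```

In the actual pipeline, head entries (count > t_lang) would then be subsampled toward the threshold, balancing head and tail per language; t_en = 170k is the English reference point.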

Reference

Paper: https://arxiv.org/pdf/2507.22062
HF: https://github.com/huggingface/transformers/tree/main/src/transformers/models/metaclip_2

Colab Notebook

No Colab notebook is provided.

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

structure it better

@gemini-code-assist
Contributor

Summary of Changes

Hello @sineeli, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly.

This pull request integrates the MetaCLIP 2 model into the Keras Hub, providing a comprehensive set of components for multimodal understanding. It includes the model's backbone, vision and text encoders, a multilingual tokenizer, and an image preprocessor. The integration allows Keras users to leverage MetaCLIP 2's advanced capabilities, which stem from its innovative training data curation and robust architecture, offering enhanced performance, especially in multilingual contexts, compared to earlier CLIP-like models.

Highlights

  • MetaCLIP 2 Model Integration: Introduces the MetaCLIP 2 model, including its tokenizer, vision encoder, image converter, and necessary conversion utilities, into the Keras Hub.
  • Enhanced Training Data Strategy: While sharing architecture with CLIP, MetaCLIP 2 distinguishes itself through a significantly larger and more diverse training dataset, featuring meticulously curated, handcrafted language thresholds for over 300 languages, rather than relying on LLMs for language identification.
  • Architectural Upgrade for Performance: The model incorporates a ViT-H/14 (Huge) vision transformer, a direct upgrade from the ViT-L/14 (Large) used in previous versions, leading to improved accuracy.
  • Multilingual Robustness: MetaCLIP 2 is designed to overcome the degradation issues observed in prior models like mSigLIP and SigLIP2 when trained on multilingual data, offering more robust performance.


@sineeli
Collaborator Author

sineeli commented Jan 19, 2026

Will attach a Colab example in a few days.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the MetaCLIP 2 model, including its backbone, encoders, tokenizer, preprocessor, and associated utilities like conversion scripts and tests. The implementation is well-structured and largely adheres to the repository's style guide regarding file structure, naming conventions, and component design. My review focuses on improving the documentation for clarity, correcting examples to use valid presets, and completing an unfinished test case to ensure correctness. A key point from the contribution guidelines is the requirement for a Colab notebook to validate numerical equivalence with the original model; this appears to be missing and should be addressed.

@sachinprasadhs sachinprasadhs self-requested a review January 20, 2026 21:28
@sineeli sineeli mentioned this pull request Jan 22, 2026
@sineeli
Collaborator Author

sineeli commented Jan 29, 2026

Basic Tutorial: https://www.kaggle.com/code/sravanneeli/metaclip2-inference-tutorial
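As context for what the inference tutorial covers, CLIP-style zero-shot classification reduces to: embed the image and the candidate texts, L2-normalize, scale the cosine similarities by a learned logit scale, and softmax over the texts. A minimal NumPy sketch with toy embeddings (not real model outputs; the logit scale value of 100 is illustrative):

```python
import numpy as np

def clip_scores(image_emb, text_embs, logit_scale=100.0):
    # L2-normalize the image embedding and each text embedding.
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    # Scaled cosine similarity of the image against each candidate text.
    logits = logit_scale * (txt @ img)
    # Softmax over candidate texts gives per-text match probabilities.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Toy 2-D embeddings: the image aligns with the first text, so the first
# probability dominates.
probs = clip_scores(np.array([1.0, 0.0]),
                    np.array([[1.0, 0.0], [0.0, 1.0]]))
```

The real model produces the embeddings with the ViT-H/14 vision encoder and the XLM-V-tokenized text encoder; only the scoring step is shown here.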

@sachinprasadhs sachinprasadhs added the new model For PRs that contribute a new model to the Keras Hub registry. label Feb 9, 2026
Collaborator

@sachinprasadhs sachinprasadhs left a comment


Thanks for the contribution. The code structure looks good.
I made some comments and suggestions; please take a look.

…with causal LM variant

convert metaclip2 checkpoint using on the fly hf preset way
Collaborator

@sachinprasadhs sachinprasadhs left a comment


Thanks, looks good; only two small comments to address.
Could you also attach a numerics-validation Colab gist for the preset? Thanks.

Examples:
```python
# Load the preprocessor from a preset.
preprocessor = keras_hub.models.MetaCLIP2Preprocessor.from_preset(
```

MetaCLIP2Preprocessor --> MetaCLIP2CausalLMPreprocessor

```python
self.assertEqual(outputs["vision_logits"].shape, (1, 1))
self.assertEqual(outputs["text_logits"].shape, (1, 1))

@pytest.mark.large
```

Mark this as extra_large; the GPU tests are failing due to OOM.
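The suggested fix is a one-line marker change on the failing test. A sketch, assuming keras-hub's test suite uses an `extra_large` pytest marker as described in the comment (the test name below is hypothetical):

```python
import pytest

# Hypothetical test name for illustration; only the marker line matters.
# Replacing `@pytest.mark.large` with `@pytest.mark.extra_large` moves the
# OOM-prone test out of the default GPU "large" suite.
@pytest.mark.extra_large
def test_saved_model():
    pass
```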
