**`BGE-Code-v1 <https://huggingface.co/BAAI/bge-code-v1>`_** is an LLM-based code embedding model that supports code retrieval, text retrieval, and multilingual retrieval. It primarily demonstrates the following capabilities:
- Superior Code Retrieval Performance: The model demonstrates exceptional code retrieval capabilities, supporting natural language queries in both English and Chinese, as well as 20 programming languages.
- Robust Text Retrieval Capabilities: The model maintains strong text retrieval capabilities comparable to text embedding models of similar scale.
- Extensive Multilingual Support: BGE-Code-v1 offers comprehensive multilingual retrieval capabilities, excelling in languages such as English, Chinese, Japanese, French, and more.
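The capabilities above all reduce to the same retrieval step at inference time: embed the query and the candidates, then rank by similarity. The following is a toy sketch of that step with random vectors standing in for real BGE-Code-v1 outputs; every name in it is illustrative, not part of the model's API.

```python
import torch
import torch.nn.functional as F

# Toy retrieval sketch: rank candidate code snippets against a natural-language
# query by cosine similarity of unit-normalized embeddings. The random vectors
# below merely stand in for real BGE-Code-v1 outputs.
torch.manual_seed(0)
query_emb = F.normalize(torch.randn(1, 16), dim=-1)   # one query embedding
code_embs = F.normalize(torch.randn(3, 16), dim=-1)   # three candidate snippets

scores = query_emb @ code_embs.T                      # cosine similarities
best = scores.argmax(dim=-1)                          # index of the top candidate
print(scores.shape, best.shape)
```

Because both sides are L2-normalized, the dot product equals cosine similarity, so scores are bounded in [-1, 1] and directly comparable across candidates.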
| `BAAI/bge-code-v1 <https://huggingface.co/BAAI/bge-code-v1>`_ | Multilingual | 1.5B | 6.18 GB | SOTA code retrieval model, with exceptional multilingual text retrieval performance as well |
| `BAAI/bge-vl-MLLM-S2 <https://huggingface.co/BAAI/BGE-VL-MLLM-S2>`_ | English | 7.57B | 15.14 GB | BGE-VL-MLLM-S1 fine-tuned for one epoch on the MMEB training set |
| `BAAI/BGE-VL-v1.5-zs <https://huggingface.co/BAAI/BGE-VL-v1.5-zs>`_ | English | 7.57B | 15.14 GB | Improved multi-modal retrieval model that performs well across a wide range of tasks |
| `BAAI/BGE-VL-v1.5-mmeb <https://huggingface.co/BAAI/BGE-VL-v1.5-mmeb>`_ | English | 7.57B | 15.14 GB | Better multi-modal retrieval model, additionally fine-tuned on MMEB training set |
BGE-VL-CLIP
-----------
The normalized last hidden state of the [EOS] token in the MLLM is used as the embedding.
print(scores)
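The [EOS]-pooling described above can be sketched in a few lines: pick the hidden state at the last non-padding position of each sequence, then L2-normalize it. Shapes are assumed for illustration, and `eos_pool` is a hypothetical helper, not part of the released model's API.

```python
import torch
import torch.nn.functional as F

# Minimal sketch (shapes assumed): pool an embedding from a decoder's hidden
# states by taking the last non-padding token ([EOS]) per sequence, then
# L2-normalizing it. `eos_pool` is an illustrative helper, not a model method.
def eos_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # last_hidden: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    eos_idx = attention_mask.sum(dim=1) - 1            # [EOS] position per row
    batch_idx = torch.arange(last_hidden.size(0))
    emb = last_hidden[batch_idx, eos_idx]              # (batch, dim)
    return F.normalize(emb, p=2, dim=-1)

hidden = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
emb = eos_pool(hidden, mask)
print(emb.shape)  # torch.Size([2, 8])
```

Indexing with the per-row [EOS] position (rather than always taking the last column) matters whenever sequences in a batch are right-padded to different lengths.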
BGE-VL-v1.5
-----------
The BGE-VL-v1.5 series is the updated version of BGE-VL, bringing better performance on both retrieval and multi-modal understanding. The models were trained on 30M MegaPairs samples plus an additional 10M natural and synthetic samples.
``bge-vl-v1.5-zs`` is a zero-shot model, trained only on the data mentioned above; ``bge-vl-v1.5-mmeb`` is additionally fine-tuned on the MMEB training set.
.. code-block:: python

    import torch
    from transformers import AutoModel

    MODEL_NAME = "BAAI/BGE-VL-v1.5-mmeb"  # or "BAAI/BGE-VL-v1.5-zs"

    model = AutoModel.from_pretrained(MODEL_NAME, trust_remote_code=True)
    model.eval()
    model.cuda()

    with torch.no_grad():
        model.set_processor(MODEL_NAME)

        query_inputs = model.data_process(
            text="Make the background dark, as if the camera has taken the photo at night",
            images="../../imgs/cir_query.png",
            q_or_c="q",
            task_instruction="Retrieve the target image that best meets the combined criteria by using both the provided image and the image retrieval instructions: "
        )