Commit 17b2a33 (1 parent 5d47eb8)

feat(inference): newest embedding

1 file changed: 66 additions, 0 deletions
---
meta:
  title: Understanding the BGE-Multilingual-Gemma2 embedding model
  description: Deploy your own secure BGE-Multilingual-Gemma2 embedding model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the BGE-Multilingual-Gemma2 embedding model
  paragraph: This page provides information on the BGE-Multilingual-Gemma2 embedding model
tags: embedding
categories:
  - ai-data
---
## Model overview

| Attribute            | Details                             |
|----------------------|-------------------------------------|
| Provider             | [BAAI](https://huggingface.co/BAAI) |
| Compatible Instances | L4 (FP32)                           |
| Context size         | 4096 tokens                         |
## Model name

```bash
baai/bge-multilingual-gemma2:fp32
```
## Compatible Instances

| Instance type | Max context length |
|---------------|--------------------|
| L4            | 4096 (FP32)        |
## Model introduction

BGE is short for BAAI General Embedding. This particular model is an LLM-based embedding model, trained on a diverse range of languages and tasks, and built from the lightweight [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b).
As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/gemma/terms).
## Why is it useful?

- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard), scoring #1 in French, #1 in Polish, and #7 in English as of writing (Q4 2024).
- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more.
- It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
- In its L4/FP32 configuration, BGE-Multilingual-Gemma2 boasts a high context length of 4096 tokens, particularly useful for ingesting data and building RAG applications.
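Downstream tasks such as semantic search compare these vectors, typically with cosine similarity. A minimal sketch of that scoring function, using toy 4-dimensional vectors standing in for the model's 3584-dimensional output:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes;
    # values close to 1.0 mean the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding output.
v1 = [0.1, 0.3, -0.2, 0.4]
v2 = [0.1, 0.3, -0.2, 0.4]
v3 = [-0.4, 0.2, 0.3, -0.1]

print(cosine_similarity(v1, v2))  # identical vectors score ~1.0
print(cosine_similarity(v1, v3))  # dissimilar vectors score lower
```

The same function works unchanged on the 3584-float vectors returned by the model.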
## How to use it

### Sending Managed Inference requests

To perform inference tasks with your embedding model deployed at Scaleway, use the following command:
```bash
curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Embeddings can represent text in a numerical format.",
    "model": "baai/bge-multilingual-gemma2:fp32"
  }'
```
Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
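The same request can be built from Python. A minimal sketch that assembles the identical headers and payload (the `<Deployment UUID>` and `<IAM API key>` placeholders are kept as-is and must be substituted; sending the request is shown in comments since it requires a live deployment and the third-party `requests` library):

```python
import json

# Placeholders: substitute your real Deployment UUID and IAM API key.
DEPLOYMENT_URL = "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings"
API_KEY = "<IAM API key>"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
payload = {
    "input": "Embeddings can represent text in a numerical format.",
    "model": "baai/bge-multilingual-gemma2:fp32",
}

# With `requests` installed and a live deployment, the call would be:
#   import requests
#   r = requests.post(DEPLOYMENT_URL, headers=headers, data=json.dumps(payload))
#   print(r.json())
print(json.dumps(payload))
```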
### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the embedding model based on the input provided in the request.
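The `/v1/embeddings` path suggests an OpenAI-compatible response schema; assuming that shape (the field names and sample values below are illustrative, not captured output), a minimal parsing sketch:

```python
import json

# Hypothetical response body in the OpenAI-compatible embeddings shape;
# the vector is truncated to 3 floats here (the model returns 3584).
raw = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0,
     "embedding": [0.0123, -0.0456, 0.0789]}
  ],
  "model": "baai/bge-multilingual-gemma2:fp32",
  "usage": {"prompt_tokens": 9, "total_tokens": 9}
}
"""

response = json.loads(raw)
# Each input string yields one entry in "data"; the vector itself
# lives under the "embedding" key of that entry.
vector = response["data"][0]["embedding"]
print(len(vector), vector[0])
```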