---
meta:
  title: Understanding the BGE-Multilingual-Gemma2 embedding model
  description: Deploy your own secure BGE-Multilingual-Gemma2 embedding model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the BGE-Multilingual-Gemma2 embedding model
  paragraph: This page provides information on the BGE-Multilingual-Gemma2 embedding model
tags: embedding
categories:
  - ai-data
---

## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [baai](https://huggingface.co/BAAI) |
| Compatible Instances | L4 (FP32) |
| Context size | 4096 tokens |

## Model name

```bash
baai/bge-multilingual-gemma2:fp32
```

## Compatible Instances

| Instance type | Max context length |
| ------------- |-------------|
| L4 | 4096 (FP32) |

## Model introduction

BGE stands for BAAI General Embedding. This model is an LLM-based embedding model, trained on a diverse range of languages and tasks and built on the lightweight [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b).
As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/gemma/terms).

## Why is it useful?

- BGE-Multilingual-Gemma2 ranks highly on the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard), scoring #1 in French, #1 in Polish, and #7 in English at the time of writing (Q4 2024).
- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more.
- It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
- In its L4/FP32 configuration, BGE-Multilingual-Gemma2 boasts a context length of 4096 tokens, which is particularly useful for ingesting data and building RAG applications.
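Because the model maps each text to a fixed-length vector, semantic closeness between two texts can be estimated with cosine similarity between their embeddings. A minimal sketch using the standard library only (the toy three-dimensional vectors stand in for real 3584-dimensional embeddings returned by the model):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real 3584-dimensional embeddings
v1 = [0.1, 0.3, -0.2]
v2 = [0.1, 0.25, -0.15]
print(round(cosine_similarity(v1, v2), 4))
```

Values close to 1 indicate semantically similar texts; values near 0 indicate unrelated ones.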

## How to use it

### Sending Managed Inference requests

To perform inference tasks with your embedding model deployed at Scaleway, use the following command:

```bash
curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Embeddings can represent text in a numerical format.",
    "model": "baai/bge-multilingual-gemma2:fp32"
  }'
```

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
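The same request can be assembled from Python using only the standard library. A sketch that builds (but does not send) the POST request shown above; the helper name and the placeholder deployment UUID and API key are illustrative, not part of any Scaleway SDK:

```python
import json
import urllib.request

def build_embeddings_request(deployment_uuid: str, api_key: str,
                             text: str) -> urllib.request.Request:
    """Build (but do not send) the /v1/embeddings POST request."""
    url = f"https://{deployment_uuid}.ifr.fr-par.scaleway.com/v1/embeddings"
    payload = {
        "input": text,
        "model": "baai/bge-multilingual-gemma2:fp32",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder credentials for illustration only
req = build_embeddings_request(
    "my-deployment-uuid", "my-iam-key",
    "Embeddings can represent text in a numerical format.",
)
# Send with: urllib.request.urlopen(req)
print(req.full_url)
```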

### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the embedding model based on the input provided in the request.
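A sketch of extracting the vector from such a response, assuming the endpoint returns the OpenAI-style embeddings payload implied by the `/v1/embeddings` route (the sample values and token counts below are made up, and the vector is truncated to three dimensions for readability):

```python
import json

# Hypothetical response body in the OpenAI-compatible embeddings format
raw = '''
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0,
     "embedding": [0.0123, -0.0456, 0.0789]}
  ],
  "model": "baai/bge-multilingual-gemma2:fp32",
  "usage": {"prompt_tokens": 9, "total_tokens": 9}
}
'''

payload = json.loads(raw)
# Each input string produces one entry in "data"; the vector is under "embedding"
vector = payload["data"][0]["embedding"]
print(len(vector), vector[0])
```

With this model, a real response vector has 3584 dimensions rather than the three shown here.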