diff --git a/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
new file mode 100644
index 0000000000..b7cd211ea4
--- /dev/null
+++ b/ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
@@ -0,0 +1,69 @@
+---
+meta:
+  title: Understanding the BGE-Multilingual-Gemma2 embedding model
+  description: Deploy your own secure BGE-Multilingual-Gemma2 embedding model with Scaleway Managed Inference. Privacy-focused, fully managed.
+content:
+  h1: Understanding the BGE-Multilingual-Gemma2 embedding model
+  paragraph: This page provides information on the BGE-Multilingual-Gemma2 embedding model
+tags: embedding
+categories:
+  - ai-data
+dates:
+  validation: 2024-10-30
+  posted: 2024-10-30
+---
+
+## Model overview
+
+| Attribute            | Details                             |
+|----------------------|-------------------------------------|
+| Provider             | [baai](https://huggingface.co/BAAI) |
+| Compatible Instances | L4 (FP32)                           |
+| Context size         | 4096 tokens                         |
+
+## Model name
+
+```bash
+baai/bge-multilingual-gemma2:fp32
+```
+
+## Compatible Instances
+
+| Instance type | Max context length |
+|---------------|--------------------|
+| L4            | 4096 (FP32)        |
+
+## Model introduction
+
+BGE is short for BAAI General Embedding. This particular model is an LLM-based embedding model built on the lightweight [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b) and trained on a diverse range of languages and tasks.
+As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/gemma/terms).
+
+## Why is it useful?
+
+- BGE-Multilingual-Gemma2 tops the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard), ranking first in French and Polish and seventh in English at the time of writing (Q4 2024).
+- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more.
+- It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
+- BGE-Multilingual-Gemma2 in its L4/FP32 configuration boasts a high context length of 4096 tokens, which is particularly useful for ingesting data and building RAG applications.
+
+## How to use it
+
+### Sending Managed Inference requests
+
+To perform inference tasks with your embedding model deployed at Scaleway, use the following command:
+
+```bash
+curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
+  -H "Authorization: Bearer <IAM API key>" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "input": "Embeddings can represent text in a numerical format.",
+    "model": "baai/bge-multilingual-gemma2:fp32"
+  }'
+```
+
+Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the UUID of the deployment you are targeting.
+
+### Receiving Inference responses
+
+Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
+Process the output data according to your application's needs. The response will contain the output generated by the embedding model based on the input provided in the request.
diff --git a/menu/navigation.json b/menu/navigation.json
index 6236883e1f..ed80f8b786 100644
--- a/menu/navigation.json
+++ b/menu/navigation.json
@@ -623,14 +623,14 @@
         "label": "Mixtral-8x7b-instruct-v0.1 model",
         "slug": "mixtral-8x7b-instruct-v0.1"
       },
-      {
-        "label": "WizardLM-70b-v1.0 model",
-        "slug": "wizardlm-70b-v1.0"
-      },
       {
         "label": "Sentence-t5-xxl model",
         "slug": "sentence-t5-xxl"
       },
+      {
+        "label": "BGE-Multilingual-Gemma2 model",
+        "slug": "bge-multilingual-gemma2"
+      },
       {
         "label": "Pixtral-12b-2409 model",
         "slug": "pixtral-12b-2409"
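The `curl` request in the page added by this diff can also be issued from Python using only the standard library. The following is a minimal sketch, not part of the official documentation: it assumes the endpoint follows the OpenAI-compatible embeddings response schema (vector at `data[0].embedding`), and `DEPLOYMENT_UUID` / `API_KEY` are placeholder values you must substitute with your own deployment UUID and IAM API key.

```python
import json
import urllib.request

# Placeholders: substitute your own deployment UUID and IAM API key.
DEPLOYMENT_UUID = "your-deployment-uuid"
API_KEY = "your-iam-api-key"
ENDPOINT = f"https://{DEPLOYMENT_UUID}.ifr.fr-par.scaleway.com/v1/embeddings"


def parse_embedding(response: dict) -> list[float]:
    """Extract the vector from an OpenAI-style embeddings response."""
    return response["data"][0]["embedding"]


def embed(text: str) -> list[float]:
    """POST one input string and return its embedding vector."""
    payload = json.dumps({
        "input": text,
        "model": "baai/bge-multilingual-gemma2:fp32",
    }).encode()
    req = urllib.request.Request(
        ENDPOINT,
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return parse_embedding(json.load(resp))
```

`parse_embedding` is split out from the network call so that response handling can be unit-tested without a live deployment; with the real model, the returned vector should have 3584 dimensions.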