Commit ad4e3a7

tgenaitay, bene2k1, nerda-codes, and RoRoJ authored
feat(inference): newest embedding model (#3917)
* feat(inference): newest embedding
* feat(inference): edited menu
* Apply suggestions from code review
  Co-authored-by: nerda-codes <[email protected]>
* Update ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
  Co-authored-by: Rowena Jones <[email protected]>
* Update ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
  Co-authored-by: Rowena Jones <[email protected]>
* Update ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx
  Co-authored-by: Rowena Jones <[email protected]>

---------

Co-authored-by: Benedikt Rollik <[email protected]>
Co-authored-by: nerda-codes <[email protected]>
Co-authored-by: Rowena Jones <[email protected]>
1 parent 2871aa9 commit ad4e3a7

2 files changed

+73
-4
lines changed

ai-data/managed-inference/reference-content/bge-multilingual-gemma2.mdx

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
---
meta:
  title: Understanding the BGE-Multilingual-Gemma2 embedding model
  description: Deploy your own secure BGE-Multilingual-Gemma2 embedding model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the BGE-Multilingual-Gemma2 embedding model
  paragraph: This page provides information on the BGE-Multilingual-Gemma2 embedding model
tags: embedding
categories:
  - ai-data
dates:
  validation: 2024-10-30
  posted: 2024-10-30
---

## Model overview

| Attribute            | Details                             |
|----------------------|-------------------------------------|
| Provider             | [baai](https://huggingface.co/BAAI) |
| Compatible Instances | L4 (FP32)                           |
| Context size         | 4096 tokens                         |

## Model name

```bash
baai/bge-multilingual-gemma2:fp32
```

## Compatible Instances

| Instance type | Max context length |
| ------------- | ------------------ |
| L4            | 4096 (FP32)        |

## Model introduction

BGE is short for BAAI General Embedding. This particular model is an LLM-based embedding model, trained on a diverse range of languages and tasks and built on the lightweight [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b).
As such, it is distributed under the [Gemma terms of use](https://ai.google.dev/gemma/terms).

## Why is it useful?

- At the time of writing this page (Q4 2024), BGE-Multilingual-Gemma2 ranks first on the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for French and Polish, and seventh for English.
- As its name suggests, the model's training data spans a broad range of languages, including English, Chinese, Polish, French, and more.
- It encodes text into 3584-dimensional vectors, providing a very detailed representation of sentence semantics.
- In its L4/FP32 configuration, BGE-Multilingual-Gemma2 boasts a high context length of 4096 tokens, which is particularly useful for ingesting data and building RAG applications.
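
Because each input is encoded as a single vector, the semantic closeness of two texts can be estimated by comparing their vectors. The following is a minimal sketch that assumes cosine similarity as the comparison metric, which is a common choice but not one prescribed by this page. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative only; vectors returned by this model would have 3584 entries.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real 3584-dimensional embeddings.
query = [1.0, 0.0, 1.0]
close = [2.0, 0.0, 2.0]      # same direction as `query`
unrelated = [0.0, 1.0, 0.0]  # orthogonal to `query`

print(cosine_similarity(query, close))      # close to 1.0
print(cosine_similarity(query, unrelated))  # 0.0
```

A score near 1 suggests the two texts point in the same semantic direction, while a score near 0 suggests they are unrelated.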

## How to use it

### Sending Managed Inference requests

To perform inference tasks with your embedding model deployed at Scaleway, use the following command:

```bash
curl https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/embeddings \
  -H "Authorization: Bearer <IAM API key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Embeddings can represent text in a numerical format.",
    "model": "baai/bge-multilingual-gemma2:fp32"
  }'
```

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/identity-and-access-management/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
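
The same request can be issued from application code. The sketch below mirrors the curl command using only the Python standard library; the helper names `build_embedding_request` and `send` are hypothetical and not part of any Scaleway SDK, and the URL, headers, and payload are copied from the command above.

```python
import json
import urllib.request


def build_embedding_request(deployment_uuid: str, api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl command."""
    url = f"https://{deployment_uuid}.ifr.fr-par.scaleway.com/v1/embeddings"
    payload = {
        "input": text,
        "model": "baai/bge-multilingual-gemma2:fp32",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholders: substitute real values before calling send().
request = build_embedding_request(
    "<Deployment UUID>",
    "<IAM API key>",
    "Embeddings can represent text in a numerical format.",
)
print(request.full_url)
# response_body = send(request)  # uncomment once the placeholders are replaced
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload before any network call is made.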

### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the embedding model based on the input provided in the request.
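
This page does not specify the response schema. As a hedged illustration, the sketch below assumes an OpenAI-style embeddings body, in which each entry of `data` carries an `index` and an `embedding` list; this is an assumption suggested by the `/v1/embeddings` path, not something confirmed here. The helper name `extract_embeddings` and the sample body are hypothetical, and the sample vector is shortened to three values for readability.

```python
import json


def extract_embeddings(response_body: dict) -> list[list[float]]:
    """Return the embedding vectors from a parsed response, ordered by input index.

    Assumes an OpenAI-style body: {"data": [{"index": int, "embedding": [...]}, ...]}.
    """
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hand-written sample body with a shortened vector, for illustration only.
sample = json.loads("""
{
  "object": "list",
  "model": "baai/bge-multilingual-gemma2:fp32",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03]}
  ]
}
""")

vectors = extract_embeddings(sample)
print(len(vectors), len(vectors[0]))  # prints: 1 3
```

If the actual response differs from this assumed shape, only the key lookups in `extract_embeddings` would need to change.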

menu/navigation.json

Lines changed: 4 additions & 4 deletions
@@ -623,14 +623,14 @@
   "label": "Mixtral-8x7b-instruct-v0.1 model",
   "slug": "mixtral-8x7b-instruct-v0.1"
 },
-{
-  "label": "WizardLM-70b-v1.0 model",
-  "slug": "wizardlm-70b-v1.0"
-},
 {
   "label": "Sentence-t5-xxl model",
   "slug": "sentence-t5-xxl"
 },
+{
+  "label": "BGE-Multilingual-Gemma2 model",
+  "slug": "bge-multilingual-gemma2"
+},
 {
   "label": "Pixtral-12b-2409 model",
   "slug": "pixtral-12b-2409"
