# Tutorial: Generating Embeddings Using a Local Asymmetric Embedding Model in OpenSearch
This tutorial demonstrates how to generate text embeddings using an asymmetric embedding model in OpenSearch, implemented within a Docker container. The example model used in this tutorial is the multilingual `intfloat/multilingual-e5-small` model from Hugging Face. You will learn how to prepare the model, register it in OpenSearch, and run inference to generate embeddings.
> **Note**: Make sure to replace all placeholders beginning with `your_` with your own values.
---
## Prerequisites
- Docker Desktop installed and running on your local machine.
- Basic familiarity with Docker and OpenSearch.
- Access to the Hugging Face model `intfloat/multilingual-e5-small` (or another model of your choice).
---
## Step 1: Spin Up a Docker OpenSearch Cluster
To run OpenSearch in a local development environment, you can use Docker and a pre-configured `docker-compose` file.
### a. Update Cluster Settings
Before proceeding, ensure your cluster is configured to allow registering models. You can do this by updating the cluster settings via the following request:
```
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_url": "true",
    "plugins.ml_commons.only_run_on_ml_node": "false"
  }
}
```
This configuration ensures that OpenSearch can accept machine learning models from external URLs and can run models across non-ML nodes.
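If you want to verify that the settings were applied, you can read them back with a standard cluster settings request (an optional check):

```
GET _cluster/settings
```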
### b. Use a Docker Compose File
You can use the sample [docker-compose file for development](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#sample-docker-compose-file-for-development) from the OpenSearch documentation as a starting point.
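If you prefer a minimal local setup, a single-node compose file along the following lines should also work for this tutorial (a sketch, not the official sample; the image tag and the admin password are placeholders to adjust):

```yaml
services:
  opensearch-node1:
    image: opensearchproject/opensearch:latest   # pin a specific version in real use
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.type=single-node               # one node is enough for local experiments
      - bootstrap.memory_lock=true
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=your_admin_password   # required by recent OpenSearch images
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - "9200:9200"   # REST API
      - "9600:9600"   # Performance Analyzer
```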
Once your `docker-compose.yml` file is ready, run the following command to start OpenSearch in the background:
```
docker-compose up -d
```
---
## Step 2: Prepare the Model for OpenSearch
In this tutorial, we’ll use the Hugging Face model [`intfloat/multilingual-e5-small`](https://huggingface.co/intfloat/multilingual-e5-small), an asymmetric text embedding model capable of generating multilingual embeddings. Follow these steps to prepare and zip the model for use in OpenSearch.
### a. Clone the Model from Hugging Face
To download the model, use the following steps:
1. Install Git Large File Storage (LFS) if you haven’t already: `git lfs install`
2. Clone the model repository: `git clone https://huggingface.co/intfloat/multilingual-e5-small`

This will download the model files into a directory on your local machine.
### b. Zip the Model Files
In order to upload the model to OpenSearch, you must zip the necessary model files (`model.onnx`, `sentencepiece.bpe.model`, and `tokenizer.json`). The `model.onnx` file is located in the `onnx` directory of the cloned repository.
Run the following command in the directory containing these files:
```
zip -r intfloat-multilingual-e5-small-onnx.zip model.onnx tokenizer.json sentencepiece.bpe.model
```
This command will create a zip file named `intfloat-multilingual-e5-small-onnx.zip`.
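You can optionally list the archive contents to confirm that all three files were included at the top level of the zip:

```
unzip -l intfloat-multilingual-e5-small-onnx.zip
```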
### c. Calculate the Model File Hash
Before registering the model, you need to calculate the SHA-256 hash of the zip file. Run this command to generate the hash:
```
shasum -a 256 intfloat-multilingual-e5-small-onnx.zip
```
Make a note of the hash value, as you will need it during the model registration process.
### d. Serve the Model File Using a Python HTTP Server
To allow OpenSearch to access the model file, you need to serve it via HTTP. Since this is a local development environment, you can use Python's built-in HTTP server.
Navigate to the directory containing the zip file and run the following command:
```
python3 -m http.server 8080 --bind 0.0.0.0
```
This will serve the zip file at `http://0.0.0.0:8080/intfloat-multilingual-e5-small-onnx.zip`. After registering the model, you can stop the server by pressing `Ctrl + C`.
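As an optional check, you can confirm that the file is reachable over HTTP before registering the model (this assumes `curl` is available):

```
curl -I http://localhost:8080/intfloat-multilingual-e5-small-onnx.zip
```

An `HTTP 200` response indicates that the zip file is being served correctly.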
---
## Step 3: Register a Model Group
Before registering the model itself, you need to create a model group. This helps organize models in OpenSearch. Run the following request to create a new model group:
```
POST /_plugins/_ml/model_groups/_register
{
  "name": "Asymmetric Model Group",
  "description": "A model group for local asymmetric models"
}
```
Take note of the `model_group_id` returned in the response, as it will be required when registering the model.
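The response should look roughly like the following (a sketch; OpenSearch generates the actual ID):

```json
{
  "model_group_id": "your_group_id",
  "status": "CREATED"
}
```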
---
## Step 4: Register the Model
Now that you have the model zip file and the model group ID, you can register the model in OpenSearch. Because OpenSearch is running inside a Docker container, the model URL must use `http://host.docker.internal:8080/intfloat-multilingual-e5-small-onnx.zip` so that the container can reach the Python server on your host machine. Run the following request:
```
POST /_plugins/_ml/models/_register
{
  "name": "intfloat/multilingual-e5-small",
  "version": "1.0.0",
  "description": "Asymmetric multilingual text embedding model",
  "model_group_id": "your_group_id",
  "model_format": "ONNX",
  "model_content_hash_value": "your_model_zip_content_hash_value",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "query_prefix": "query: ",
    "passage_prefix": "passage: "
  },
  "url": "http://host.docker.internal:8080/intfloat-multilingual-e5-small-onnx.zip"
}
```

The `model_config` above matches `intfloat/multilingual-e5-small` (384-dimensional embeddings and the `query: `/`passage: ` prefixes used by asymmetric E5 models); adjust these values if you register a different model.
Replace `your_group_id` and `your_model_zip_content_hash_value` with the actual values from earlier. This will initiate the model registration process, and you’ll receive a task ID in the response.
To check the status of the registration, run:
```bash
GET /_plugins/_ml/tasks/your_task_id
```
Once successful, note the `model_id` returned, as you'll need it for deployment and inference.
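A completed registration task returns a response roughly like this (a sketch of the typical fields; the actual IDs will differ):

```json
{
  "model_id": "your_model_id",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED"
}
```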
---
## Step 5: Deploy the Model
After the model is registered, you can deploy it by running:
```
POST /_plugins/_ml/models/your_model_id/_deploy
```
Check the status of the deployment using the task ID:
```
GET /_plugins/_ml/tasks/your_task_id
```
When the model is successfully deployed, it will be in the **DEPLOYED** state, and you can use it for inference.
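You can also query the model directly to confirm its state; the response includes a `model_state` field that should read `DEPLOYED`:

```
GET /_plugins/_ml/models/your_model_id
```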
---
## Step 6: Run Inference
Now that your model is deployed, you can use it to generate text embeddings for both queries and passages.
### a. Generating Passage Embeddings
To generate embeddings for a passage, use the following request:
```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": {
    "content_type": "passage"
  },
  "text_docs": [
    "Today is Friday, tomorrow will be my break day. After that, I will go to the library. When is lunch?"
  ],
  "target_response": ["sentence_embedding"]
}
```
The response will include a sentence embedding of size 384:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [384],
          "data": [...]
        }
      ]
    }
  ]
}
```
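### b. Generating Query Embeddings

To embed a query instead of a passage, call the same predict API with `content_type` set to `query` (a sketch based on the passage example above; the query text is illustrative):

```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": {
    "content_type": "query"
  },
  "text_docs": ["When is lunch?"],
  "target_response": ["sentence_embedding"]
}
```

Asymmetric models such as `intfloat/multilingual-e5-small` apply different prefixes to queries and passages internally, so passing the correct `content_type` matters for retrieval quality.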