
Commit 8cd8359

expand more details on tutorial

Expands on the context of the previous commit and improves grammar and structure.

Signed-off-by: Brian Flores <[email protected]>

1 parent bcdd180 commit 8cd8359

File tree

1 file changed

docs/tutorials/semantic_search/asymmetric_embedding_model.md

Lines changed: 168 additions & 97 deletions
# Tutorial: Generating Embeddings Using a Local Asymmetric Embedding Model in OpenSearch

This tutorial demonstrates how to generate text embeddings using an asymmetric embedding model in OpenSearch, implemented within a Docker container. The example model used in this tutorial is the multilingual `intfloat/multilingual-e5-small` model from Hugging Face. You will learn how to prepare the model, register it in OpenSearch, and run inference to generate embeddings.

> **Note**: Make sure to replace all placeholders that begin with `your_` with your specific values.
---

## Prerequisites

- Docker Desktop installed and running on your local machine.
- Basic familiarity with Docker and OpenSearch.
- Access to the Hugging Face model [`intfloat/multilingual-e5-small`](https://huggingface.co/intfloat/multilingual-e5-small) (or another model of your choice).

---

## Step 1: Spin Up a Docker OpenSearch Cluster

To run OpenSearch in a local development environment, you can use Docker and a pre-configured `docker-compose` file.

### a. Update Cluster Settings

Before proceeding, ensure your cluster is configured to allow registering models. You can do this by updating the cluster settings via the following request:
```
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_url": true,
    "plugins.ml_commons.only_run_on_ml_node": false
  }
}
```

This configuration ensures that OpenSearch can accept machine learning models from external URLs and can run models on non-ML nodes.
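
Once the cluster from the next step is running, you can apply these same settings from a terminal. The following is a minimal sketch assuming the sample compose file's defaults (security enabled, HTTPS on `localhost:9200`, and the admin password exported as `OPENSEARCH_INITIAL_ADMIN_PASSWORD`); drop `-k -u ...` and use `http://` if you run with the security plugin disabled:

```bash
# Apply the ML Commons settings with curl instead of Dev Tools.
curl -k -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
  -X PUT "https://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{
    "persistent": {
      "plugins.ml_commons.allow_registering_model_via_url": true,
      "plugins.ml_commons.only_run_on_ml_node": false
    }
  }'
```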

### b. Use a Docker Compose File

You can use this sample [docker-compose file](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#sample-docker-compose-file-for-development) as a starting point.
Once your `docker-compose.yml` file is ready, run the following command to start OpenSearch in the background:

```
docker-compose up -d
```
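
To confirm the cluster came up, you can hit the health endpoint (a sketch under the same security assumptions as the curl example above):

```bash
# A "green" or "yellow" status means the cluster is ready for requests.
curl -k -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
  "https://localhost:9200/_cluster/health?pretty"
```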

---

## Step 2: Prepare the Model for OpenSearch

In this tutorial, we’ll use the Hugging Face model `intfloat/multilingual-e5-small`, which is capable of generating multilingual embeddings. Follow these steps to prepare and zip the model for use in OpenSearch.

### a. Clone the Model from Hugging Face

To download the model, use the following steps:

1. Install Git Large File Storage (LFS) if you haven’t already:

   ```
   git lfs install
   ```

2. Clone the model repository:

   ```
   git clone https://huggingface.co/intfloat/multilingual-e5-small
   ```

This will download the model files into a directory on your local machine.
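
Git LFS occasionally leaves small pointer files in place of the real weights, so it is worth checking file sizes before zipping (the paths below assume the default clone directory name):

```bash
# model.onnx should be large (hundreds of MB), not a few bytes.
ls -lh multilingual-e5-small/onnx/model.onnx \
       multilingual-e5-small/tokenizer.json \
       multilingual-e5-small/sentencepiece.bpe.model
```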

### b. Zip the Model Files

In order to upload the model to OpenSearch, you must zip the necessary model files (`model.onnx`, `sentencepiece.bpe.model`, and `tokenizer.json`). The `model.onnx` file is located in the `onnx` directory of the cloned repository.

Run the following command in the directory containing these files:

```
zip -r intfloat-multilingual-e5-small-onnx.zip model.onnx tokenizer.json sentencepiece.bpe.model
```

This command will create a zip file named `intfloat-multilingual-e5-small-onnx.zip`.
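
You can verify that the archive contains exactly the three files at its root:

```bash
# Each of the three files should appear with no directory prefix.
unzip -l intfloat-multilingual-e5-small-onnx.zip
```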
82+
83+
### c. Calculate the Model File Hash
84+
85+
Before registering the model, you need to calculate the SHA-256 hash of the zip file. Run this command to generate the hash:
86+
87+
```
88+
shasum -a 256 intfloat-multilingual-e5-small-onnx.zip
89+
```
90+
91+
Make a note of the hash value, as you will need it during the model registration process.
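
If you are scripting these steps, you can capture just the hash (the first field of the output) in a shell variable; `MODEL_HASH` here is an illustrative name:

```bash
# shasum prints "<hash>  <filename>"; awk keeps only the hash.
MODEL_HASH=$(shasum -a 256 intfloat-multilingual-e5-small-onnx.zip | awk '{print $1}')
echo "$MODEL_HASH"
```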
92+
93+
### d. Serve the Model File Using a Python HTTP Server
94+
95+
To allow OpenSearch to access the model file, you need to serve it via HTTP. Since this is a local development environment, you can use Python's built-in HTTP server:
96+
97+
Navigate to the directory containing the zip file and run the following command:
98+
99+
```
100+
python3 -m http.server 8080 --bind 0.0.0.0
101+
```
102+
103+
This will serve the zip file at `http://0.0.0.0:8080/intfloat-multilingual-e5-small-onnx.zip`. After registering the model, you can stop the server by pressing `Ctrl + C`.
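
Before moving on, confirm the file is actually reachable:

```bash
# A "200 OK" status line means the zip is being served.
curl -sI http://localhost:8080/intfloat-multilingual-e5-small-onnx.zip | head -n 1
```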

---

## Step 3: Register a Model Group

Before registering the model itself, you need to create a model group. This helps organize models in OpenSearch. Run the following request to create a new model group:

```
POST /_plugins/_ml/model_groups/_register
{
  "name": "Asymmetric Model Group",
  "description": "A model group for local asymmetric models"
}
```

Take note of the `model_group_id` returned in the response, as it will be required when registering the model.
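
The response has the following shape (the ID here is illustrative; yours will differ):

```json
{
  "model_group_id": "lN4AP40BKolAM987dYw6",
  "status": "CREATED"
}
```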

---

## Step 4: Register the Model

Now that you have the model zip file and the model group ID, you can register the model in OpenSearch. Run the following request (the `name`, `version`, and `description` values shown are only suggestions; choose your own):

```
POST /_plugins/_ml/models/_register
{
  "name": "intfloat/multilingual-e5-small-onnx",
  "version": "1.0.0",
  "description": "Asymmetric multilingual text embedding model",
  "model_format": "ONNX",
  "model_group_id": "your_group_id",
  "model_content_hash_value": "your_model_zip_content_hash_value",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "query_prefix": "query: ",
    "passage_prefix": "passage: ",
    "all_config": "{ \"_name_or_path\": \"intfloat/multilingual-e5-small\", \"architectures\": [ \"BertModel\" ], \"attention_probs_dropout_prob\": 0.1, \"hidden_size\": 384, \"num_attention_heads\": 12, \"num_hidden_layers\": 12, \"tokenizer_class\": \"XLMRobertaTokenizer\" }"
  },
  "url": "http://host.docker.internal:8080/intfloat-multilingual-e5-small-onnx.zip"
}
```

Replace `your_group_id` and `your_model_zip_content_hash_value` with the actual values from earlier. This will initiate the model registration process, and you’ll receive a task ID in the response.

To check the status of the registration, run:

```
GET /_plugins/_ml/tasks/your_task_id
```

Once successful, note the `model_id` returned, as you'll need it for deployment and inference.
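
Registration can take a while for a model of this size. A small polling loop (a sketch under the same curl assumptions as earlier; substitute `your_task_id`) saves repeated manual checks:

```bash
# Poll the task API until the state reaches COMPLETED.
# If the state becomes FAILED instead, inspect the task's "error" field.
until curl -sk -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
    "https://localhost:9200/_plugins/_ml/tasks/your_task_id" \
    | grep -q '"state":"COMPLETED"'; do
  echo "still registering..."
  sleep 5
done
echo "registration complete"
```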

---

## Step 5: Deploy the Model

After the model is registered, you can deploy it by running:

```
POST /_plugins/_ml/models/your_model_id/_deploy
```

Check the status of the deployment using the task ID:

```
GET /_plugins/_ml/tasks/your_task_id
```

When the model is successfully deployed, it will be in the **DEPLOYED** state, and you can use it for inference.
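
You can also confirm the state on the model itself; the response to the following request includes a `model_state` field, which should read `DEPLOYED`:

```
GET /_plugins/_ml/models/your_model_id
```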

---

## Step 6: Run Inference

Now that your model is deployed, you can use it to generate text embeddings for both queries and passages. The `content_type` parameter tells the model which prefix (`query: ` or `passage: `) to prepend to your text before encoding, which is what makes the model asymmetric.

### a. Generating Passage Embeddings

To generate embeddings for a passage, use the following request:

```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": {
    "content_type": "passage"
  },
  "text_docs": [
    "Today is Friday, tomorrow will be my break day. After that, I will go to the library. When is lunch?"
  ],
  "target_response": ["sentence_embedding"]
}
```

The response will include a sentence embedding of size 384:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [384],
          "data": [0.0419328, 0.047480892, ..., 0.31158513, 0.21784715]
        }
      ]
    }
  ]
}
```

### b. Generating Query Embeddings

Similarly, you can generate embeddings for a query:
```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": {
    "content_type": "query"
  },
  "text_docs": ["What day is it today?"],
  "target_response": ["sentence_embedding"]
}
```

The response will look like this:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [384],
          "data": [0.2338349, -0.13603798, ..., 0.37335885, 0.10653384]
        }
      ]
    }
  ]
}
```
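
If you are calling the cluster from a terminal rather than Dev Tools, the same prediction can be issued with curl (a sketch under the same security assumptions as earlier):

```bash
curl -sk -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
  -X POST "https://localhost:9200/_plugins/_ml/_predict/text_embedding/your_model_id" \
  -H 'Content-Type: application/json' \
  -d '{
    "parameters": { "content_type": "query" },
    "text_docs": ["What day is it today?"],
    "target_response": ["sentence_embedding"]
  }'
```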

---

## Next Steps

- Create an ingest pipeline for processing documents using asymmetric embeddings.
- Run a query using KNN (k-nearest neighbors) to search with your asymmetric model.

---

## References

- Wang, Liang, et al. (2024). *Multilingual E5 Text Embeddings: A Technical Report*. arXiv preprint arXiv:2402.05672. [Link](https://arxiv.org/abs/2402.05672)
