
Commit bcdd180

Add: initial thoughts to high-level steps

Provides more context to each step

Signed-off-by: Brian Flores <[email protected]>

1 parent f605636

File tree

1 file changed

docs/tutorials/semantic_search/asymmetric_embedding_model.md

Lines changed: 166 additions & 4 deletions
@@ -5,23 +5,185 @@ This tutorial shows how to generate embeddings using a local asymmetric embedding
Note: Replace the placeholders that start with `your_` with your own values.

# Steps

## 1. Spin up a Docker OpenSearch cluster

With Docker you can create a multi-node cluster. Follow this docker-compose file as an example: https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#sample-docker-compose-file-for-development. Make sure you have Docker Desktop installed.
### a. Update cluster settings

The docker-compose file used in this step runs two non-ML OpenSearch nodes, so you must update the cluster settings before you can run a model:
```
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_url": "true",
    "plugins.ml_commons.only_run_on_ml_node": "false",
    "plugins.ml_commons.model_access_control_enabled": "true",
    "plugins.ml_commons.native_memory_threshold": "99"
  }
}
```

### b. Use a docker compose file

Now that you have the docker-compose file, make sure Docker Desktop is running in the background, then run `docker-compose up -d` from the directory that contains the compose file. This starts OpenSearch in the background.
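
To confirm the cluster is up before moving on, you can hit the root endpoint. This is a minimal sketch, assuming the security plugin's self-signed development certificate and a `your_admin_password` placeholder for the admin password you configured:

```bash
# -k accepts the self-signed development certificate;
# a JSON response with the cluster name means OpenSearch is running.
curl -k -u admin:your_admin_password https://localhost:9200
```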

## 2. Prepare the model for OpenSearch

In this tutorial you will use the Hugging Face intfloat/multilingual-e5-small model (https://huggingface.co/intfloat/multilingual-e5-small), an asymmetric text embedding model capable of handling different languages.

### a. Clone the model

You can find the steps on the model's homepage: click the three dots just left of the **Train** button, then click **Clone repository**. For this tutorial you will execute the following, making sure you pick an appropriate place to host the model:

1. `git lfs install`
2. `git clone https://huggingface.co/intfloat/multilingual-e5-small`

### b. Zip the contents

To send the embedding model to OpenSearch, you must zip the model contents. Specifically, you need to zip the following files: `model.onnx`, `sentencepiece.bpe.model`, and `tokenizer.json`. The **model.onnx** file is found within the `onnx` directory of the repository you cloned. Once the files are in one place, run the following in the directory that contains them: `zip -r intfloat-multilingual-e5-small-onnx.zip model.onnx tokenizer.json sentencepiece.bpe.model`. This creates a zip file named **intfloat-multilingual-e5-small-onnx.zip**.
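
Putting the pieces together, here is a minimal sketch of this step; the `multilingual-e5-small` directory comes from the clone above, and the `model-artifacts` staging directory is just an illustrative name:

```bash
# Stage the three required files in one directory, then zip them.
mkdir model-artifacts
cp multilingual-e5-small/onnx/model.onnx model-artifacts/
cp multilingual-e5-small/sentencepiece.bpe.model model-artifacts/
cp multilingual-e5-small/tokenizer.json model-artifacts/
cd model-artifacts
zip -r intfloat-multilingual-e5-small-onnx.zip model.onnx tokenizer.json sentencepiece.bpe.model
```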

### c. Calculate the hash

Now that you have the zip file, calculate its hash so that you can provide it during model registration. Run the following within the directory that contains the zip: `shasum -a 256 intfloat-multilingual-e5-small-onnx.zip`.
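
If you want to keep the hash handy for the registration call in step 4, a small sketch like this works; the `MODEL_HASH` variable name is just an illustrative choice:

```bash
# shasum prints "<hash>  <filename>"; keep only the first field.
MODEL_HASH=$(shasum -a 256 intfloat-multilingual-e5-small-onnx.zip | awk '{print $1}')
echo "$MODEL_HASH"
```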

### d. Serve the zip file using a Python server

With the zip file and its hash ready, serve the file so that OpenSearch can find and download it. Since this is for local development, you can simply host it locally using Python. Navigate to the directory that contains the zip file and run `python3 -m http.server 8080 --bind 0.0.0.0`. After step 4 you can stop this server by pressing Ctrl+C.
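
Before registering the model, it is worth confirming the file is actually reachable. A quick check from another terminal:

```bash
# An HTTP 200 response with a Content-Length header means the zip is being served.
curl -I http://localhost:8080/intfloat-multilingual-e5-small-onnx.zip
```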

## 3. Register a model group

Create a model group to associate with the model. Run the following and take note of the model group ID:
```
POST /_plugins/_ml/model_groups/_register
{
  "name": "Asymmetric Model Group",
  "description": "A model group for local asymmetric models"
}
```
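
If you are scripting the setup, you can issue the same call with curl and capture the ID directly. A sketch, assuming `jq` is installed and reusing the `your_admin_password` placeholder:

```bash
# Register the model group and extract model_group_id from the response.
GROUP_ID=$(curl -sk -u admin:your_admin_password \
  -X POST "https://localhost:9200/_plugins/_ml/model_groups/_register" \
  -H 'Content-Type: application/json' \
  -d '{"name": "Asymmetric Model Group", "description": "A model group for local asymmetric models"}' \
  | jq -r '.model_group_id')
echo "$GROUP_ID"
```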

## 4. Register the model

Now you can register the model, which retrieves the zip from the Python server. Since OpenSearch is running inside a Docker container, you must use the URL `http://host.docker.internal:8080/intfloat-multilingual-e5-small-onnx.zip`. When running the command below, make sure to take note of the model ID that OpenSearch returns once you call the task API.
```
POST /_plugins/_ml/models/_register
{
  "name": "e5-small-onnx",
  "version": "1.0.0",
  "description": "Asymmetric multilingual-e5-small model",
  "model_format": "ONNX",
  "model_group_id": "your_group_id",
  "model_content_hash_value": "your_model_zip_content_hash_value",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "query_prefix": "query: ",
    "passage_prefix": "passage: ",
    "all_config": "{ \"_name_or_path\": \"intfloat/multilingual-e5-small\", \"architectures\": [ \"BertModel\" ], \"attention_probs_dropout_prob\": 0.1, \"classifier_dropout\": null, \"hidden_act\": \"gelu\", \"hidden_dropout_prob\": 0.1, \"hidden_size\": 384, \"initializer_range\": 0.02, \"intermediate_size\": 1536, \"layer_norm_eps\": 1e-12, \"max_position_embeddings\": 512, \"model_type\": \"bert\", \"num_attention_heads\": 12, \"num_hidden_layers\": 12, \"pad_token_id\": 0, \"position_embedding_type\": \"absolute\", \"tokenizer_class\": \"XLMRobertaTokenizer\", \"transformers_version\": \"4.30.2\", \"type_vocab_size\": 2, \"use_cache\": true, \"vocab_size\": 250037}"
  },
  "url": "http://host.docker.internal:8080/intfloat-multilingual-e5-small-onnx.zip"
}
```
This returns a task ID. You can check whether the registration succeeded by running:
```
GET /_plugins/_ml/tasks/your_task_id
```
After it succeeds, make sure to take note of the `model_id` in the response.
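
If you prefer to wait from the command line, here is a minimal polling sketch; it assumes `jq` is installed and reuses the `your_task_id` and `your_admin_password` placeholders:

```bash
# Poll the task until it completes; the model_id in the final response
# is the ID you need for the next steps.
while true; do
  STATE=$(curl -sk -u admin:your_admin_password \
    "https://localhost:9200/_plugins/_ml/tasks/your_task_id" | jq -r '.state')
  echo "task state: $STATE"
  [ "$STATE" = "COMPLETED" ] && break
  sleep 5
done
```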

## 5. Deploy the model

After the registration is complete, you can deploy the model:
```
POST /_plugins/_ml/models/your_model_id/_deploy
```
Again, this returns a task ID. Run the aforementioned get-task endpoint to check the status; after some time the model will be in the **DEPLOYED** state.
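
You can also read the state directly from the model metadata rather than the task. A sketch with the same placeholder conventions, assuming the model GET response carries a top-level `model_state` field:

```bash
# model_state should read DEPLOYED once deployment finishes.
curl -sk -u admin:your_admin_password \
  "https://localhost:9200/_plugins/_ml/models/your_model_id" | jq -r '.model_state'
```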

## 6. Run inference

With the model deployed, you can run inference to generate embeddings. In this scenario you can specify two types of embeddings: one for passages and one for queries.

For example, to embed a passage you can run:
```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": {
    "content_type": "passage"
  },
  "text_docs": ["Today is Friday, tomorrow will be my break day, After that I will go to the library, when is lunch?"],
  "target_response": ["sentence_embedding"]
}
```
You should see a similar embedding of size 384:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            384
          ],
          "data": [
            0.0419328,
            0.047480892,
            ...
            0.31158513,
            0.21784715,
            0.29523832
          ]
        }
      ]
    }
  ]
}
```

Here is an example of a query embedding:
```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": {
    "content_type": "query"
  },
  "text_docs": ["What day is it today?"],
  "target_response": ["sentence_embedding"]
}
```
which returns a result like:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            384
          ],
          "data": [
            0.2338349,
            -0.13603798,
            ...
            0.37335885,
            0.10653384,
            0.21653183
          ]
        }
      ]
    }
  ]
}
```

## Next steps

- Create an ingest pipeline for your documents with asymmetric embeddings
