# Tutorial: Running Asymmetric Semantic Search within OpenSearch
This tutorial demonstrates how to generate text embeddings using an asymmetric embedding model in OpenSearch. The embeddings will be used
to run semantic search, implemented using a Docker container. The example model used in this tutorial is the multilingual
`intfloat/multilingual-e5-small` model from Hugging Face.

You will learn how to prepare the model, register it in OpenSearch, and run inference to generate embeddings.

## Prerequisites
- Docker Desktop installed and running on your local machine.
- Basic familiarity with Docker and OpenSearch.
- Access to the Hugging Face `intfloat/multilingual-e5-small` model (or another model of your choice).
---
## Step 1: Spin up a Docker OpenSearch cluster
To run OpenSearch in a local development environment, you can use Docker and a preconfigured `docker-compose` file.
### a. Create a Docker Compose file
You can use this sample [file](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#sample-docker-compose-file-for-development) as an example.
Once your `docker-compose.yml` file is created, run the following command to start OpenSearch in the background:
```
docker-compose up -d
```
### b. Update cluster settings
Ensure your cluster is configured to allow registering models. You can do this by updating the cluster settings using the following request:
```
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_url": "true",
    "plugins.ml_commons.only_run_on_ml_node": "false"
  }
}
```
This configuration ensures that OpenSearch can accept machine learning models from external URLs and can run models across non-ML nodes.

---
## Step 2: Prepare the model for OpenSearch
In this tutorial, you’ll use the Hugging Face `intfloat/multilingual-e5-small` model, which is capable of generating multilingual embeddings. Follow these steps to prepare and zip the model for use in OpenSearch.
### a. Clone the model from Hugging Face
To download the model, use the following steps:
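
If Git and Git LFS are installed, one way to download the model is to clone its Hugging Face repository (the URL below assumes the model's standard repository path):

```
# Git LFS is required to fetch the model weights
git lfs install
git clone https://huggingface.co/intfloat/multilingual-e5-small
```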
This will download the model files into a directory on your local machine.
### b. Zip the model files
To upload the model to OpenSearch, you must zip the necessary model files (`model.onnx`, `sentencepiece.bpe.model`, and `tokenizer.json`). The `model.onnx` file is located in the `onnx` directory of the cloned repository.
Run the following command in the directory containing these files:
```
zip -r intfloat-multilingual-e5-small-onnx.zip model.onnx tokenizer.json sentencepiece.bpe.model
```
This command will create a zip file named `intfloat-multilingual-e5-small-onnx.zip` containing all the necessary files.
### c. Calculate the model file's hash
Before registering the model, you need to calculate the SHA-256 hash of the zip file. Run this command to generate the hash:
```
shasum -a 256 intfloat-multilingual-e5-small-onnx.zip
```
Note the hash value; you'll need it during model registration.
### d. Serve the model file using a Python HTTP server
To allow OpenSearch to access the model file, you can serve it over HTTP. Because this tutorial uses a local development environment, you can use Python's built-in HTTP server.
Navigate to the directory containing the zip file and run the following command:
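
A minimal option, assuming Python 3 is installed and port 8080 is free, is the built-in `http.server` module:

```
python3 -m http.server 8080
```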
This will serve the zip file at `http://0.0.0.0:8080/intfloat-multilingual-e5-small-onnx.zip`.

---
## Step 3: Register a model group
Before registering the model itself, you need to create a model group. This helps organize models in OpenSearch. Run the following request to create a new model group (the name and description are up to you):
```
POST /_plugins/_ml/model_groups/_register
{
  "name": "asymmetric_model_group",
  "description": "A model group for the asymmetric embedding model"
}
```
Note the `model_group_id` returned in the response; you'll use it to register the model.

---
## Step 4: Register the model
Now that you have the model zip file and the model group ID, you can register the model in OpenSearch. The following request shows a typical configuration for this model; adjust the name, description, and model settings as needed:
```
POST /_plugins/_ml/models/_register
{
  "name": "intfloat/multilingual-e5-small-onnx",
  "version": "1.0.0",
  "description": "Asymmetric multilingual embedding model",
  "model_group_id": "your_group_id",
  "model_format": "ONNX",
  "model_content_hash_value": "your_model_zip_content_hash_value",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "query_prefix": "query: ",
    "passage_prefix": "passage: "
  },
  "url": "http://0.0.0.0:8080/intfloat-multilingual-e5-small-onnx.zip"
}
```
Replace `your_group_id` and `your_model_zip_content_hash_value` with the values from previous steps. This will initiate the model registration process, and you’ll receive a task ID in the response.
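
The registration response typically contains only the task ID and status, for example (your `task_id` will differ):

```
{
  "task_id": "aFeif4oB5Vm0Tdw8yoN7",
  "status": "CREATED"
}
```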
To check the status of the registration, run the following request:
```
GET /_plugins/_ml/tasks/your_task_id
```
Once successful, note the `model_id` returned; you'll need it for deployment and inference.

---
## Step 5: Deploy the model
After the model is registered, you can deploy it by running the following request:
```
POST /_plugins/_ml/models/your_model_id/_deploy
```

Check the status of the deployment using the task ID:

```
GET /_plugins/_ml/tasks/your_task_id
```
When the model is successfully deployed, its state changes to **DEPLOYED**, and you can use it for inference.
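
If you want to confirm the state directly, you can also retrieve the model; the `model_state` field should read `DEPLOYED`:

```
GET /_plugins/_ml/models/your_model_id
```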

---
## Step 6: Run inference
Now that your model is deployed, you can use it to generate text embeddings for both queries and passages.
### Generating passage embeddings
To generate embeddings for a passage, use the following request:
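
The request below is a sketch that assumes the ML Commons text embedding Predict API and that the asymmetric model accepts a `content_type` parameter set to `passage`; the sample text is illustrative:

```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": { "content_type": "passage" },
  "text_docs": [ "New York City is made up of five boroughs." ],
  "target_response": [ "sentence_embedding" ]
}
```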
The response will include a sentence embedding of size 384.
### Generating query embeddings
Similarly, you can generate embeddings for a query:
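
Assuming the same request shape, only the `content_type` parameter changes to `query`; the sample query text is illustrative:

```
POST /_plugins/_ml/_predict/text_embedding/your_model_id
{
  "parameters": { "content_type": "query" },
  "text_docs": [ "What are the five boroughs of New York City?" ],
  "target_response": [ "sentence_embedding" ]
}
```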
The response will contain a query embedding in the same format as the passage embedding response.

---
# Applying semantic search using an ML inference processor
In this section, you'll run semantic search on facts about New York City. First, you'll create an ingest pipeline that uses the ML inference processor to generate embeddings at ingestion time. Then you'll create a search pipeline that runs queries using the same asymmetric embedding model.
## 2. Create an ingest pipeline
### 2.1 Create a test KNN index
Create a k-NN index to store the facts and their embeddings. The mapping below is a minimal sketch: the field names (`fact_text`, `fact_embedding`) are illustrative, and the `default_pipeline` setting points at the ingest pipeline created in the next step so that bulk ingestion triggers it automatically:

```
PUT nyc_facts
{
  "settings": {
    "index.knn": true,
    "index.default_pipeline": "asymmetric_embedding_ingest_pipeline"
  },
  "mappings": {
    "properties": {
      "fact_text": { "type": "text" },
      "fact_embedding": { "type": "knn_vector", "dimension": 384 }
    }
  }
}
```

### 2.2 Create the ingest pipeline

Create an ingest pipeline named `asymmetric_embedding_ingest_pipeline` that uses the ML inference processor with your `model_id` to generate a passage embedding for each ingested document and write it to the `fact_embedding` field.
### 2.3 Simulate the pipeline
Simulate the pipeline by running the following request (the sample document below is illustrative):
```
POST /_ingest/pipeline/asymmetric_embedding_ingest_pipeline/_simulate
{
  "docs": [
    {
      "_index": "nyc_facts",
      "_source": {
        "fact_text": "The Empire State Building was completed in 1931."
      }
    }
  ]
}
```
The response contains the simulated ingestion result: the document's original fields plus the embedding generated by the model for its text.
### 2.4 Test data ingestion
When you perform bulk ingestion, the ingest pipeline generates an embedding for each document. The documents below are illustrative samples:
```
POST /_bulk
{ "index": { "_index": "nyc_facts" } }
{ "fact_text": "The Empire State Building was completed in 1931 and rises 102 stories above Midtown Manhattan." }
{ "index": { "_index": "nyc_facts" } }
{ "fact_text": "Central Park covers more than 800 acres in the middle of Manhattan." }
{ "index": { "_index": "nyc_facts" } }
{ "fact_text": "Madison Square Garden hosts professional basketball and hockey games." }
```
## 3. Run semantic search
### 3.1 Create a search pipeline
Create a search pipeline that will convert your query into an embedding and run K-NN search on the index to return the best-matching documents:
```
PUT /_search/pipeline/asymmetric_embedding_search_pipeline
```
### 3.2 Run semantic search
Run a query about sporting activities in New York City. The query field shown below is illustrative and must match your search pipeline's input mapping:
```
GET /nyc_facts/_search?search_pipeline=asymmetric_embedding_search_pipeline
{
  "query": {
    "match": {
      "fact_text": "Where can I watch sports in New York City?"
    }
  }
}
```
The response contains the top 3 matching documents.