|
1 | 1 | # Tutorial: Running Asymmetric Semantic Search within OpenSearch |
2 | 2 |
|
3 | | -This tutorial demonstrates how generating text embeddings using an asymmetric embedding model in OpenSearch. The embeddings will be used |
4 | | -to run semantic search, implemented using a Docker container. The example model used in this tutorial is the multilingual |
5 | | -`intfloat/multilingual-e5-small` model from Hugging Face. |
6 | | -You will learn how to prepare the model, register it in OpenSearch, and run inference to generate embeddings. |
| 3 | +This tutorial demonstrates how to generate text embeddings using an asymmetric embedding model in OpenSearch. The example model used in this tutorial is the multilingual |
| 4 | +`intfloat/multilingual-e5-small` model from Hugging Face. |
| 5 | +In this tutorial, you'll learn how to prepare the model, register it in OpenSearch, and run inference to generate embeddings. |
7 | 6 |
|
8 | 7 | > **Note**: Make sure to replace all placeholders (e.g., `your_`) with your specific values. |
9 | 8 |
|
10 | 9 | --- |
11 | 10 |
|
12 | 11 | ## Prerequisites |
13 | 12 |
|
14 | | -- Docker Desktop installed and running on your local machine. |
15 | | -- Basic familiarity with Docker and OpenSearch. |
| 13 | +- OpenSearch installed on your machine.
16 | 14 | - Access to the Hugging Face `intfloat/multilingual-e5-small` model (or another model of your choice). |
| 15 | +- Basic knowledge of Linux commands.
17 | 16 | --- |
18 | 17 |
|
19 | | -## Step 1: Spin up a Docker OpenSearch cluster |
| 18 | +## Step 1: Start OpenSearch locally |
20 | 19 |
|
21 | | -To run OpenSearch in a local development environment, you can use Docker and a preconfigured `docker-compose` file. |
| 20 | +For instructions on installing and running OpenSearch, see [Installing OpenSearch](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/).
22 | 21 |
|
23 | | -### a. Create a Docker Compose File |
| 22 | +Run OpenSearch locally, then complete the following configuration.
24 | 23 |
|
25 | | -You can use this sample [file](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#sample-docker-compose-file-for-development) as an example. |
26 | | -Once your `docker-compose.yml` file is created, run the following command to start OpenSearch in the background: |
27 | | - |
28 | | -``` |
29 | | -docker-compose up -d |
30 | | -``` |
31 | | - |
32 | | - |
33 | | -### b. Update cluster settings |
| 24 | +### Update cluster settings |
34 | 25 |
|
35 | 26 | Ensure your cluster is configured to allow registering models. You can do this by updating the cluster settings using the following request: |
36 | 27 |
|
@@ -146,7 +137,7 @@ POST /_plugins/_ml/models/_register |
146 | 137 | "passage_prefix": "passage: ", |
147 | 138 | "all_config": "{ \"_name_or_path\": \"intfloat/multilingual-e5-small\", \"architectures\": [ \"BertModel\" ], \"attention_probs_dropout_prob\": 0.1, \"hidden_size\": 384, \"num_attention_heads\": 12, \"num_hidden_layers\": 12, \"tokenizer_class\": \"XLMRobertaTokenizer\" }" |
148 | 139 | }, |
149 | | - "url": "http://host.docker.internal:8080/intfloat-multilingual-e5-small-onnx.zip" |
| 140 | + "url": "http://localhost:8080/intfloat-multilingual-e5-small-onnx.zip" |
150 | 141 | } |
151 | 142 | ``` |
152 | 143 |
|
@@ -328,7 +319,7 @@ PUT _ingest/pipeline/asymmetric_embedding_ingest_pipeline |
328 | 319 |
|
329 | 320 | ### 2.3 Simulate the pipeline |
330 | 321 |
|
331 | | -Simulate the pipeline by running the following request: |
| 322 | +To test the pipeline, run the following request against the simulate endpoint:
332 | 323 | ``` |
333 | 324 | POST /_ingest/pipeline/asymmetric_embedding_ingest_pipeline/_simulate |
334 | 325 | { |
|