File: articles/azure-cache-for-redis/cache-tutorial-vector-similarity.md
Lines changed: 29 additions & 28 deletions
@@ -22,42 +22,44 @@ The tutorial uses the [Wikipedia Movie Plots dataset](https://www.kaggle.com/dat
22
22
In this tutorial, you learn how to:
23
23
24
24
> [!div class="checklist"]
25
-
> * Create an Azure Cache for Redis instance configured for vector search
26
-
> * Install Azure OpenAI and other required Python libraries.
27
-
> * Download the movie dataset and prepare it for analysis.
28
-
> * Use the **text-embedding-ada-002 (Version 2)** model to generate embeddings.
29
-
> * Create a vector index in Azure Cache for Redis
30
-
> * Use cosine similarity to rank search results.
31
-
> * Use hybrid query functionality through [RediSearch](https://redis.io/docs/interact/search-and-query/) to prefilter the data and make the vector search even more powerful.
25
+
> - Create an Azure Cache for Redis instance configured for vector search
26
+
> - Install Azure OpenAI and other required Python libraries.
27
+
> - Download the movie dataset and prepare it for analysis.
28
+
> - Use the **text-embedding-ada-002 (Version 2)** model to generate embeddings.
29
+
> - Create a vector index in Azure Cache for Redis
30
+
> - Use cosine similarity to rank search results.
31
+
> - Use hybrid query functionality through [RediSearch](https://redis.io/docs/interact/search-and-query/) to prefilter the data and make the vector search even more powerful.
32
32
33
33
>[!IMPORTANT]
34
-
>This tutorial walks you through building a Jupyter Notebook. You can follow this tutorial with a Python code file (.py) and get *similar* results, but you need to add all of the code blocks in this tutorial into the `.py` file and execute once to see results. In other words, Jupyter Notebooks provides intermediate results as you execute cells, but this is not behavior you should expect when working in a Python code file.
34
+
>This tutorial walks you through building a Jupyter Notebook. You can follow this tutorial with a Python code file (.py) and get _similar_ results, but you need to add all of the code blocks in this tutorial into the `.py` file and execute once to see results. In other words, Jupyter Notebook provides intermediate results as you execute cells, but you shouldn't expect this behavior when working in a Python code file.
35
35
36
36
>[!IMPORTANT]
37
-
>If you would like to follow along in a completed Jupyter notebook instead, [download the Jupyter notebook file named *tutorial.ipynb*](https://github.com/Azure-Samples/azure-cache-redis-samples/tree/main/tutorial/vector-similarity-search-open-ai) and save it into the new *redis-vector* folder.
37
+
>If you would like to follow along in a completed Jupyter notebook instead, [download the Jupyter notebook file named _tutorial.ipynb_](https://github.com/Azure-Samples/azure-cache-redis-samples/tree/main/tutorial/vector-similarity-search-open-ai) and save it into the new _redis-vector_ folder.
38
38
39
39
## Prerequisites
40
40
<!-- Continue here. -->
41
41
42
-
* An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
43
-
* Access granted to Azure OpenAI in the desired Azure subscription. Currently, you must apply for access to Azure OpenAI. You can apply for access to Azure OpenAI by completing the form at <a href="https://aka.ms/oai/access" target="_blank">https://aka.ms/oai/access</a>.
44
-
* <a href="https://www.python.org/" target="_blank">Python 3.8 or later version</a>
* An Azure OpenAI resource with the **text-embedding-ada-002 (Version 2)** model deployed. This model is currently only available in [certain regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability). See the [resource deployment guide](/azure/ai-services/openai/how-to/create-resource) for instructions on how to deploy the model.
42
+
- An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
43
+
- Access granted to Azure OpenAI in the desired Azure subscription. Currently, you must apply for access to Azure OpenAI. You can apply for access to Azure OpenAI by completing the form at [https://aka.ms/oai/access](https://aka.ms/oai/access). <!-- I don't know if this is still true -->
44
+
- [Python 3.8 or later version](https://www.python.org/)
- An Azure OpenAI resource with the **text-embedding-ada-002 (Version 2)** model deployed. This model is currently only available in [certain regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability). See the [resource deployment guide](/azure/ai-services/openai/how-to/create-resource) for instructions on how to deploy the model.
47
47
48
48
## Create an Azure Cache for Redis Instance
49
49
50
-
1. Follow the [Quickstart: Create a Redis Enterprise cache](quickstart-create-redis-enterprise.md) guide. On the **Advanced** page, make sure that you've added the **RediSearch** module and have chosen the **Enterprise** Cluster Policy. All other settings can match the default described in the quickstart.
50
+
1. Follow the [Quickstart: Create a Redis Enterprise cache](quickstart-create-redis-enterprise.md) guide, but make sure you add the RediSearch module at create time.
51
+
52
+
1. On the **Advanced** page, make sure that you've added the **RediSearch** module and have chosen the **Enterprise** Cluster Policy. All other settings can match the default described in the quickstart.
51
53
52
54
It takes a few minutes for the cache to create. You can move on to the next step in the meantime.
1. Create a folder on your local computer named *redis-vector* in the location where you typically save your projects.
60
+
1. Create a folder on your local computer named _redis-vector_ in the location where you typically save your projects.
59
61
60
-
1. Create a new python file (*tutorial.py*) or Jupyter notebook (*tutorial.ipynb*) in the folder.
62
+
1. Create a new Python file (_tutorial.py_) or Jupyter notebook (_tutorial.ipynb_) in the folder.
61
63
62
64
1. Install the required Python packages:
63
65
@@ -71,9 +73,9 @@ In this tutorial, you learn how to:
71
73
72
74
1. Sign in or register with Kaggle. Registration is required to download the file.
73
75
74
-
1. Select the **Download** link on Kaggle to download the *archive.zip* file.
76
+
1. Select the **Download** link on Kaggle to download the _archive.zip_ file.
75
77
76
-
1. Extract the *archive.zip* file and move the *wiki_movie_plots_deduped.csv* into the *redis-vector* folder.
78
+
1. Extract the _archive.zip_ file and move the _wiki_movie_plots_deduped.csv_ into the _redis-vector_ folder.
77
79
78
80
## Import libraries and set up connection information
79
81
@@ -170,7 +172,6 @@ Next, you'll read the csv file into a pandas DataFrame.
170
172
def normalize_text(s, sep_token="\n"):
171
173
s = re.sub(r'\s+', ' ', s).strip()
172
174
s = re.sub(r". ,","",s)
173
-
# remove all instances of multiple spaces
174
175
s = s.replace("..",".")
175
176
s = s.replace(". .",".")
176
177
s = s.replace("\n", "")
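The normalization steps in this hunk can be sketched as a self-contained function. This is a minimal illustration, not the tutorial's exact code: it assumes each run of whitespace is collapsed to a single space, it drops the unused `sep_token` parameter, and the sample sentence is made up.

```python
import re


def normalize_text(s: str) -> str:
    """Tidy whitespace and stray punctuation before generating embeddings."""
    # Collapse each run of whitespace (spaces, tabs, newlines) into one space.
    s = re.sub(r"\s+", " ", s).strip()
    # As in the hunk, the dot is unescaped, so this removes any single
    # character that is followed by a space and a comma.
    s = re.sub(r". ,", "", s)
    # Collapse doubled periods left over from the source text.
    s = s.replace("..", ".")
    s = s.replace(". .", ".")
    return s.strip()


# Hypothetical sample input, for illustration only.
cleaned = normalize_text("A  detective\n\ninvestigates a theft..  The end. .")
print(cleaned)
```

Because newlines are already folded into spaces by the first substitution, a separate `replace("\n", "")` step has nothing left to remove in this sketch.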
@@ -207,7 +208,7 @@ Next, you'll read the csv file into a pandas DataFrame.
207
208
208
209
## Load DataFrame into LangChain
209
210
210
-
Load the DataFrame into LangChain using the `DataFrameLoader` class. Once the data is in LangChain documents, it's far easier to use LangChain libraries to generate embeddings and conduct similarity searches. Set *Plot* as the `page_content_column` so that embeddings are generated on this column.
211
+
Load the DataFrame into LangChain using the `DataFrameLoader` class. Once the data is in LangChain documents, it's far easier to use LangChain libraries to generate embeddings and conduct similarity searches. Set _Plot_ as the `page_content_column` so that embeddings are generated on this column.
211
212
212
213
1. Add the following code to a new code cell and execute it:
213
214
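The tutorial's own code cell is not part of this hunk. As a rough, library-free illustration of what choosing a `page_content_column` means, the sketch below splits each row into the text to embed and the remaining metadata. It does not use LangChain; the helper `split_rows`, the sample rows, and every column name other than _Plot_ are hypothetical.

```python
import csv
import io

# Hypothetical two-row sample; only the "Plot" column name comes from the tutorial.
SAMPLE_CSV = """Title,Release Year,Plot
Example Movie,1999,A detective investigates a theft.
Another Film,2005,Two friends cross a desert.
"""


def split_rows(rows, page_content_column):
    """Split each row into the text to embed and the remaining metadata."""
    documents = []
    for row in rows:
        # Every column except the content column is kept as metadata.
        metadata = {k: v for k, v in row.items() if k != page_content_column}
        documents.append(
            {"page_content": row[page_content_column], "metadata": metadata}
        )
    return documents


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
docs = split_rows(rows, page_content_column="Plot")
print(docs[0]["page_content"])
print(docs[0]["metadata"])
```

Embeddings would then be generated from `page_content` only, while the metadata fields stay available for filtering.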
@@ -338,9 +339,9 @@ With Azure Cache for Redis and Azure OpenAI Service, you can use embeddings and
338
339
339
340
## Related Content
340
341
341
-
* [Learn more about Azure Cache for Redis](cache-overview.md)
342
-
* Learn more about Azure Cache for Redis [vector search capabilities](./cache-overview-vector-similarity.md)
343
-
* Learn more about [embeddings generated by Azure OpenAI Service](/azure/ai-services/openai/concepts/understand-embeddings)
344
-
* Learn more about [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
345
-
* [Read how to build an AI-powered app with OpenAI and Redis](https://techcommunity.microsoft.com/blog/azuredevcommunityblog/vector-similarity-search-with-azure-cache-for-redis-enterprise/3822059)
346
-
* [Build a Q&A app with semantic answers](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna)
342
+
- [Learn more about Azure Cache for Redis](cache-overview.md)
343
+
- Learn more about Azure Cache for Redis [vector search capabilities](./cache-overview-vector-similarity.md)
344
+
- Learn more about [embeddings generated by Azure OpenAI Service](/azure/ai-services/openai/concepts/understand-embeddings)
345
+
- Learn more about [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
346
+
- [Read how to build an AI-powered app with OpenAI and Redis](https://techcommunity.microsoft.com/blog/azuredevcommunityblog/vector-similarity-search-with-azure-cache-for-redis-enterprise/3822059)
347
+
- [Build a Q&A app with semantic answers](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna)
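
For the cosine similarity metric linked in the list above, a minimal plain-Python sketch may help show how ranking by it works. The two-dimensional vectors and the names `cosine_similarity` and `rank_by_similarity` are illustrative assumptions; in the tutorial the vectors are model-generated embeddings and the comparison runs inside the Redis vector index.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate names ordered from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query, candidates[name]),
        reverse=True,
    )


# Toy vectors, for illustration only.
query = [1.0, 0.0]
candidates = {
    "close": [1.0, 0.1],
    "orthogonal": [0.0, 1.0],
    "opposite": [-1.0, 0.0],
}
ranked = rank_by_similarity(query, candidates)
print(ranked)
```

A score near 1 means the vectors point in nearly the same direction, 0 means they are unrelated, and -1 means they point in opposite directions.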