Commit caabd60

Merge pull request #295553 from flang-msft/fxl---freshness-for-AI-articles
Fxl---freshness for ai articles
2 parents 43545b8 + 9ae6ab3 commit caabd60

2 files changed

+46
-53
lines changed

articles/azure-cache-for-redis/cache-overview-vector-similarity.md

Lines changed: 12 additions & 20 deletions
Original file line number, Diff line number, Diff line change
@@ -7,36 +7,31 @@ ms.collection: ce-skilling-ai-copilot
77
ms.topic: overview
88
ms.custom:
99
- ignite-2024
10-
ms.date: 04/24/2024
10+
ms.date: 02/27/2025
1111
---
1212

1313
# What are Vector Embeddings and Vector Search in Azure Cache for Redis?
1414

1515
Vector similarity search (VSS) has become a popular technology for AI-powered intelligent applications. Azure Cache for Redis can be used as a vector database when combined with models like [Azure OpenAI](/azure/ai-services/openai/overview) for Retrieval-Augmented Generative AI and other analysis scenarios. This article is a high-level introduction to the concept of vector embeddings, vector similarity search, and how Redis can be used as a vector database powering intelligent applications.
1616

17-
For tutorials and sample applications on how to use Azure Cache for Redis and Azure OpenAI to perform vector similarity search, see the following:
17+
For tutorials and sample applications on how to use Enterprise tier or Azure Managed Redis with Azure OpenAI, see the following:
1818

19-
- [Tutorial: Conduct vector similarity search on Azure OpenAI embeddings using Azure Cache for Redis with LangChain](./cache-tutorial-vector-similarity.md)
20-
- [Sample: Using Redis as vector database in a Chatbot application with .NET Semantic Kernel](https://github.com/CawaMS/chatappredis)
21-
- [Sample: Using Redis as semantic cache in a Dall-E powered image gallery with Redis OM for .NET](https://github.com/CawaMS/OutputCacheOpenAI)
19+
- [Tutorial: Conduct vector similarity search on Azure OpenAI embeddings using Enterprise tier or Azure Managed Redis with LangChain](./cache-tutorial-vector-similarity.md)
20+
- [Sample: Using Redis as semantic cache in a Dall-E powered image gallery with Redis OM for .NET](https://github.com/Azure-Samples/azure-redis-dalle-semantic-caching)
2221

2322
## Scope of Availability
2423

25-
Vector search capabilities in Redis require [Redis Stack](https://redis.io/docs/latest/operate/oss_and_stack/stack-with-enterprise/), specifically the [RediSearch](https://redis.io/docs/interact/search-and-query/) module. This capability is only available in the [Enterprise tiers of Azure Cache for Redis](./cache-redis-modules.md).
24+
Vector search capabilities in Redis require [Redis Stack](https://redis.io/docs/latest/operate/oss_and_stack/stack-with-enterprise/), specifically the [RediSearch](https://redis.io/docs/interact/search-and-query/) module. This capability is only available in the [Enterprise tiers of Azure Cache for Redis](./cache-redis-modules.md) and Azure Managed Redis.
2625

2726
This table contains the information for vector search availability in different tiers.
2827

29-
|Tier | Basic / Standard | Premium |Enterprise | Enterprise Flash | Azure Managed Redis (preview)
30-
|--------- |:------------------:|:----------:|:---------:|:---------:|:---------:|
31-
|Available | No | No | Yes | Yes (preview) |Yes
28+
| Tier | Basic / Standard | Premium | Enterprise | Enterprise Flash | Azure Managed Redis (preview) |
29+
|-----------|:----------------:|:-------:|:----------:|:----------------:|:-----------------------------:|
30+
| Available | No | No | Yes | Yes (preview) | Yes |
3231

3332
## What are vector embeddings?
3433

35-
### Concept
36-
37-
Vector embeddings are a fundamental concept in machine learning and natural language processing that enable the representation of data, such as words, documents, or images as numerical vectors in a high-dimension vector space. The primary idea behind vector embeddings is to capture the underlying relationships and semantics of the data by mapping them to points in this vector space. That means converting your text or images into a sequence of numbers that represents the data, and then comparing the different number sequences. This allows complex data to be manipulated and analyzed mathematically, making it easier to perform tasks like similarity comparison, recommendation, and classification.
38-
39-
<!-- TODO - Add image example -->
34+
Vector embeddings are a fundamental concept in machine learning and natural language processing that enable the representation of data, such as words, documents, or images, as numerical vectors in a high-dimension vector space. The primary idea behind vector embeddings is to capture the underlying relationships and semantics of the data by mapping them to points in this vector space. That means converting your text or images into a sequence of numbers that represents the data, and then comparing the different number sequences. This allows complex data to be manipulated and analyzed mathematically, making it easier to perform tasks like similarity comparison, recommendation, and classification.
4035

4136
Each machine learning model classifies data and produces the vector in a different manner. Furthermore, it's typically not possible to determine exactly what semantic meaning each vector dimension represents. But because the model is consistent between each block of input data, similar words, documents, or images have vectors that are also similar. For example, the words `basketball` and `baseball` have embeddings vectors much closer to each other than a word like `rainforest`.
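
The claim that related words have nearby vectors can be illustrated with a minimal sketch. The four-dimensional vectors below are invented for illustration only and are not output from any real model (embeddings from models such as text-embedding-ada-002 have far more dimensions). Cosine similarity is used here as one common way to compare vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, made up for this example.
toy_embeddings = {
    "basketball": [0.9, 0.8, 0.1, 0.0],
    "baseball": [0.8, 0.9, 0.2, 0.1],
    "rainforest": [0.1, 0.0, 0.9, 0.8],
}

sports = cosine_similarity(toy_embeddings["basketball"], toy_embeddings["baseball"])
unrelated = cosine_similarity(toy_embeddings["basketball"], toy_embeddings["rainforest"])

print(f"basketball vs baseball:   {sports:.3f}")
print(f"basketball vs rainforest: {unrelated:.3f}")
```

With these toy values, the two sports terms score much closer to 1 than the sports term and `rainforest` do, which is the behavior the paragraph describes.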
4237

@@ -81,9 +76,9 @@ Vector similarity search can be used in multiple applications. Some common use-c
8176
- **Semantic Caching**. Reduce the cost and latency of LLMs by caching LLM completions. LLM queries are compared using vector similarity. If a new query is similar enough to a previously cached query, the cached query is returned. [Semantic Caching example using LangChain](https://python.langchain.com/docs/integrations/llm_caching/#redis-cache)
8277
- **LLM Conversation Memory**. Persist conversation history with an LLM as embeddings in a vector database. Your application can use vector search to pull relevant history or "memories" into the response from the LLM. [LLM Conversation Memory example](https://github.com/continuum-llms/chatgpt-memory)
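
The semantic caching idea in the list above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the LangChain or Redis implementation: the `SemanticCache` class, its method names, and the `0.95` threshold are all hypothetical, and the embeddings are assumed to be non-zero vectors produced elsewhere.

```python
import math


class SemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # minimum cosine similarity that counts as a hit
        self.entries = []  # list of (embedding, completion) pairs

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    def put(self, embedding, completion):
        """Store a completion under the embedding of the query that produced it."""
        self.entries.append((embedding, completion))

    def get(self, embedding):
        """Return the completion of the most similar cached query, or None on a miss."""
        best_score, best_completion = 0.0, None
        for cached_embedding, completion in self.entries:
            score = self._cosine(embedding, cached_embedding)
            if score > best_score:
                best_score, best_completion = score, completion
        return best_completion if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.95)
cache.put([1.0, 0.0], "cached completion")

hit = cache.get([0.99, 0.05])   # nearly the same direction as the cached query
miss = cache.get([0.0, 1.0])    # orthogonal to the cached query
print(hit, miss)
```

A vector database replaces the linear scan in `get` with an indexed nearest-neighbor lookup; the threshold check stays the same.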
8378

84-
## Why choose Azure Cache for Redis for storing and searching vectors?
79+
## Why choose Azure Redis for storing and searching vectors?
8580

86-
Azure Cache for Redis can be used effectively as a vector database to store embeddings vectors and to perform vector similarity searches. Support for vector storage and search has been available in many key machine learning frameworks like:
81+
Azure Redis caches can be used effectively as a vector database to store embeddings vectors and to perform vector similarity searches. Support for vector storage and search has been available in many key machine learning frameworks like:
8782

8883
- [Semantic Kernel](https://github.com/microsoft/semantic-kernel)
8984
- [LangChain](https://python.langchain.com/docs/integrations/vectorstores/redis)
@@ -117,7 +112,4 @@ There are multiple other solutions on Azure for vector storage and search. Other
117112

118113
## Related content
119114

120-
The best way to get started with embeddings and vector search is to try it yourself!
121-
122-
> [!div class="nextstepaction"]
123-
> [Tutorial: Conduct vector similarity search on Azure OpenAI embeddings using Azure Cache for Redis](./cache-tutorial-vector-similarity.md)
115+
- [Tutorial: Conduct vector similarity search on Azure OpenAI embeddings using Azure Cache for Redis](./cache-tutorial-vector-similarity.md)

articles/azure-cache-for-redis/cache-tutorial-vector-similarity.md

Lines changed: 34 additions & 33 deletions
@@ -7,57 +7,59 @@ ms.collection: ce-skilling-ai-copilot
77
ms.topic: tutorial
88
ms.custom:
99
- ignite-2024
10-
ms.date: 09/15/2023
10+
ms.date: 02/27/2025
1111

1212
#CustomerIntent: As a developer, I want to develop some code using a sample so that I see an example of a vector similarity with an AI-based large language model.
1313
---
1414

15-
# Tutorial: Conduct vector similarity search on Azure OpenAI embeddings using Azure Cache for Redis
15+
# Tutorial: Conduct vector similarity search on Azure OpenAI embeddings using Azure Redis
1616

1717
<!-- cawa - need to mention AMR in this tutorial -->
18-
In this tutorial, you'll walk through a basic vector similarity search use-case. You'll use embeddings generated by Azure OpenAI Service and the built-in vector search capabilities of the Enterprise tier of Azure Cache for Redis to query a dataset of movies to find the most relevant match.
18+
In this tutorial, you walk through a basic vector similarity search use-case. You use embeddings generated by Azure OpenAI Service and the built-in vector search capabilities of the Enterprise tier of Azure Cache for Redis to query a dataset of movies to find the most relevant match.
1919

20-
The tutorial uses the [Wikipedia Movie Plots dataset](https://www.kaggle.com/datasets/jrobischon/wikipedia-movie-plots) that features plot descriptions of over 35,000 movies from Wikipedia covering the years 1901 to 2017.
21-
The dataset includes a plot summary for each movie, plus metadata such as the year the film was released, the director(s), main cast, and genre. You'll follow the steps of the tutorial to generate embeddings based on the plot summary and use the other metadata to run hybrid queries.
20+
The tutorial uses the [Wikipedia Movie Plots dataset](https://www.kaggle.com/datasets/jrobischon/wikipedia-movie-plots) that features plot descriptions of over 35,000 movies from Wikipedia covering the years 1901 to 2017. The dataset includes a plot summary for each movie, plus metadata such as the year the film was released, the director(s), main cast, and genre. You follow the steps of the tutorial to generate embeddings based on the plot summary and use the other metadata to run hybrid queries.
2221

2322
In this tutorial, you learn how to:
2423

2524
> [!div class="checklist"]
26-
> * Create an Azure Cache for Redis instance configured for vector search
27-
> * Install Azure OpenAI and other required Python libraries.
28-
> * Download the movie dataset and prepare it for analysis.
29-
> * Use the **text-embedding-ada-002 (Version 2)** model to generate embeddings.
30-
> * Create a vector index in Azure Cache for Redis
31-
> * Use cosine similarity to rank search results.
32-
> * Use hybrid query functionality through [RediSearch](https://redis.io/docs/interact/search-and-query/) to prefilter the data and make the vector search even more powerful.
25+
> - Create an Azure Cache for Redis instance configured for vector search
26+
> - Install Azure OpenAI and other required Python libraries.
27+
> - Download the movie dataset and prepare it for analysis.
28+
> - Use the **text-embedding-ada-002 (Version 2)** model to generate embeddings.
29+
> - Create a vector index in Azure Cache for Redis
30+
> - Use cosine similarity to rank search results.
31+
> - Use hybrid query functionality through [RediSearch](https://redis.io/docs/interact/search-and-query/) to prefilter the data and make the vector search even more powerful.
3332
3433
>[!IMPORTANT]
35-
>This tutorial will walk you through building a Jupyter Notebook. You can follow this tutorial with a Python code file (.py) and get *similar* results, but you will need to add all of the code blocks in this tutorial into the `.py` file and execute once to see results. In other words, Jupyter Notebooks provides intermediate results as you execute cells, but this is not behavior you should expect when working in a Python code file.
34+
>This tutorial walks you through building a Jupyter Notebook. You can follow this tutorial with a Python code file (.py) and get _similar_ results, but you need to add all of the code blocks in this tutorial into the `.py` file and execute it once to see results. In other words, Jupyter Notebook shows intermediate results as you execute cells, but you shouldn't expect this behavior when working in a Python code file.
3635
3736
>[!IMPORTANT]
38-
>If you would like to follow along in a completed Jupyter notebook instead, [download the Jupyter notebook file named *tutorial.ipynb*](https://github.com/Azure-Samples/azure-cache-redis-samples/tree/main/tutorial/vector-similarity-search-open-ai) and save it into the new *redis-vector* folder.
37+
>If you would like to follow along in a completed Jupyter notebook instead, [download the Jupyter notebook file named _tutorial.ipynb_](https://github.com/Azure-Samples/azure-cache-redis-samples/tree/main/tutorial/vector-similarity-search-open-ai) and save it into the new _redis-vector_ folder.
3938
4039
## Prerequisites
40+
<!-- Continue here. -->
4141

42-
* An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
43-
* Access granted to Azure OpenAI in the desired Azure subscription. Currently, you must apply for access to Azure OpenAI. You can apply for access to Azure OpenAI by completing the form at <a href="https://aka.ms/oai/access" target="_blank">https://aka.ms/oai/access</a>.
44-
* <a href="https://www.python.org/" target="_blank">Python 3.8 or later version</a>
45-
* [Jupyter Notebooks](https://jupyter.org/) (optional)
46-
* An Azure OpenAI resource with the **text-embedding-ada-002 (Version 2)** model deployed. This model is currently only available in [certain regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability). See the [resource deployment guide](/azure/ai-services/openai/how-to/create-resource) for instructions on how to deploy the model.
42+
- An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
43+
- Access granted to Azure OpenAI in the desired Azure subscription. Currently, you must apply for access to Azure OpenAI. You can apply for access to Azure OpenAI by completing the form at [https://aka.ms/oai/access](https://aka.ms/oai/access). <!-- I don't know if this is still true -->
44+
- [Python 3.8 or later version](https://www.python.org/)
45+
- [Jupyter Notebooks](https://jupyter.org/) (optional)
46+
- An Azure OpenAI resource with the **text-embedding-ada-002 (Version 2)** model deployed. This model is currently only available in [certain regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability). See the [resource deployment guide](/azure/ai-services/openai/how-to/create-resource) for instructions on how to deploy the model.
4747

4848
## Create an Azure Cache for Redis Instance
4949

50-
1. Follow the [Quickstart: Create a Redis Enterprise cache](quickstart-create-redis-enterprise.md) guide. On the **Advanced** page, make sure that you've added the **RediSearch** module and have chosen the **Enterprise** Cluster Policy. All other settings can match the default described in the quickstart.
50+
1. Follow the [Quickstart: Create a Redis Enterprise cache](quickstart-create-redis-enterprise.md) guide, but make sure you add the RediSearch module at create time.
51+
52+
1. On the **Advanced** page, make sure that you've added the **RediSearch** module and have chosen the **Enterprise** Cluster Policy. All other settings can match the default described in the quickstart.
5153

5254
It takes a few minutes for the cache to create. You can move on to the next step in the meantime.
5355

54-
:::image type="content" source="media/cache-create/enterprise-tier-basics.png" alt-text="Screenshot showing the Enterprise tier Basics tab filled out.":::
56+
:::image type="content" source="media/cache-create/enterprise-tier-basics.png" alt-text="Screenshot showing the Enterprise tier Basics tab filled out.":::
5557

5658
## Set up your development environment
5759

58-
1. Create a folder on your local computer named *redis-vector* in the location where you typically save your projects.
60+
1. Create a folder on your local computer named _redis-vector_ in the location where you typically save your projects.
5961

60-
1. Create a new python file (*tutorial.py*) or Jupyter notebook (*tutorial.ipynb*) in the folder.
62+
1. Create a new Python file (_tutorial.py_) or Jupyter notebook (_tutorial.ipynb_) in the folder.
6163

6264
1. Install the required Python packages:
6365

@@ -71,9 +73,9 @@ In this tutorial, you learn how to:
7173

7274
1. Sign in or register with Kaggle. Registration is required to download the file.
7375

74-
1. Select the **Download** link on Kaggle to download the *archive.zip* file.
76+
1. Select the **Download** link on Kaggle to download the _archive.zip_ file.
7577

76-
1. Extract the *archive.zip* file and move the *wiki_movie_plots_deduped.csv* into the *redis-vector* folder.
78+
1. Extract the _archive.zip_ file and move the _wiki_movie_plots_deduped.csv_ into the _redis-vector_ folder.
7779

7880
## Import libraries and set up connection information
7981

@@ -170,7 +172,6 @@ Next, you'll read the csv file into a pandas DataFrame.
170172
def normalize_text(s, sep_token = " \n "):
171173
s = re.sub(r'\s+', ' ', s).strip()
172174
s = re.sub(r". ,","",s)
173-
# remove all instances of multiple spaces
174175
s = s.replace("..",".")
175176
s = s.replace(". .",".")
176177
s = s.replace("\n", "")
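
For readers who want to try the lines visible in this hunk, here is a self-contained sketch with the gutter numbers stripped and indentation restored. It contains only the statements shown above. The final `return s` is an assumption, because the rest of the function falls outside the hunk and may do more. The `sep_token` parameter is kept from the signature even though the visible lines don't use it.

```python
import re


def normalize_text(s, sep_token=" \n "):
    # Collapse every run of whitespace (including newlines) into one space, then trim.
    s = re.sub(r'\s+', ' ', s).strip()
    # As written in the hunk: the unescaped "." matches any character before " ,".
    s = re.sub(r". ,", "", s)
    # Reduce doubled periods left over from the source text.
    s = s.replace("..", ".")
    s = s.replace(". .", ".")
    s = s.replace("\n", "")
    # Assumption: the remainder of the original function is not shown in the diff.
    return s


sample = "A  detective\ninvestigates..   The  end"
print(normalize_text(sample))
```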
@@ -207,7 +208,7 @@ Next, you'll read the csv file into a pandas DataFrame.
207208
208209
## Load DataFrame into LangChain
209210

210-
Load the DataFrame into LangChain using the `DataFrameLoader` class. Once the data is in LangChain documents, it's far easier to use LangChain libraries to generate embeddings and conduct similarity searches. Set *Plot* as the `page_content_column` so that embeddings are generated on this column.
211+
Load the DataFrame into LangChain using the `DataFrameLoader` class. Once the data is in LangChain documents, it's far easier to use LangChain libraries to generate embeddings and conduct similarity searches. Set _Plot_ as the `page_content_column` so that embeddings are generated on this column.
211212

212213
1. Add the following code to a new code cell and execute it:
213214

@@ -338,9 +339,9 @@ With Azure Cache for Redis and Azure OpenAI Service, you can use embeddings and
338339

339340
## Related Content
340341

341-
* [Learn more about Azure Cache for Redis](cache-overview.md)
342-
* Learn more about Azure Cache for Redis [vector search capabilities](./cache-overview-vector-similarity.md)
343-
* Learn more about [embeddings generated by Azure OpenAI Service](/azure/ai-services/openai/concepts/understand-embeddings)
344-
* Learn more about [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
345-
* [Read how to build an AI-powered app with OpenAI and Redis](https://techcommunity.microsoft.com/blog/azuredevcommunityblog/vector-similarity-search-with-azure-cache-for-redis-enterprise/3822059)
346-
* [Build a Q&A app with semantic answers](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna)
342+
- [Learn more about Azure Cache for Redis](cache-overview.md)
343+
- Learn more about Azure Cache for Redis [vector search capabilities](./cache-overview-vector-similarity.md)
344+
- Learn more about [embeddings generated by Azure OpenAI Service](/azure/ai-services/openai/concepts/understand-embeddings)
345+
- Learn more about [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
346+
- [Read how to build an AI-powered app with OpenAI and Redis](https://techcommunity.microsoft.com/blog/azuredevcommunityblog/vector-similarity-search-with-azure-cache-for-redis-enterprise/3822059)
347+
- [Build a Q&A app with semantic answers](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna)
