Commit 7fd162f

Merge pull request #68 from TheovanKraay/semantic-search-demo
updates after James' review
2 parents 73ba40d + d28001d commit 7fd162f

3 files changed: 17 additions and 39 deletions

Python/CosmosDB-NoSQL_SemanticSearchDemo/README.md

Lines changed: 4 additions & 5 deletions
@@ -1,8 +1,8 @@
 # Semantic Search Demo for Azure Cosmos DB
 
-This repository contains a Python Streamlit application that demonstrates advanced search capabilities using Azure Cosmos DB, OpenAI embeddings, and semantic reranking. The application showcases vector search, full text search, text ranking, and hybrid search with intelligent semantic reranking to improve result relevance.
+This folder contains a Python Streamlit application that demonstrates advanced search capabilities using Azure Cosmos DB and OpenAI. The application showcases vector search, full text search, text ranking, and hybrid search with intelligent semantic reranking to improve result relevance.
 
-> **🚨 IMPORTANT NOTICE**: The **Azure Cosmos DB Semantic Reranker** feature used in this demo is currently in **private preview**. To use the semantic reranking functionality, you must request special access from the Azure Cosmos DB team and obtain a private reranker endpoint. The application will work without this feature for all other search capabilities.
+> **🚨 IMPORTANT NOTICE**: The **Azure Cosmos DB Semantic Reranker** feature used in this demo is currently in **private preview**. To use the semantic reranking functionality, you must request special access from the Azure Cosmos DB team and obtain a private reranker endpoint. You can sign up here: https://aka.ms/AzureCosmosDB/RerankerPreview. For more information, contact us at [email protected]. The application will work without this feature for all other search capabilities.
 
 ![screenshot](media/screen-shot.png)
 
@@ -14,13 +14,12 @@ This repository contains a Python Streamlit application that demonstrates advanc
 - 🔄 **Hybrid search** combining semantic and full text search
 - 📊 **Text ranking** for enhanced result ordering
 - **🎯 Semantic Reranking** (⚠️ **PRIVATE PREVIEW**):
-  > **🚨 IMPORTANT**: The Azure Cosmos DB Semantic Reranker is currently in **private preview** and requires special access. Contact the Azure Cosmos DB team to request access and obtain your reranker endpoint before using this feature.
+  > **🚨 IMPORTANT**: The Azure Cosmos DB Semantic Reranker is currently in **private preview** and requires special access (see above).
   - Built-in Azure Cosmos DB SDK semantic reranking
   - Interactive UI toggle to enable/disable reranking
   - Preserves original metadata while improving result relevance
   - Uses DefaultAzureCredential for secure authentication
 - **📈 Multiple Index Support**:
-  - No Index baseline
   - QFLAT vector index for balanced performance
   - DiskANN vector index for high-scale scenarios
 - **🛡️ Robust Error Handling**:
@@ -46,7 +45,7 @@ This repository contains a Python Streamlit application that demonstrates advanc
 ### 1. Clone and Setup
 
 ```sh
-git clone https://github.com/TheovanKraay/AzureDataRetrievalAugmentedGenerationSamples.git
+git clone https://github.com/microsoft/AzureDataRetrievalAugmentedGenerationSamples.git
 cd AzureDataRetrievalAugmentedGenerationSamples/Python/CosmosDB-NoSQL_SemanticSearchDemo
 ```
 
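
Reviewer note on the "Hybrid search" bullet in the README: Azure Cosmos DB documents its hybrid search as combining vector and full text rankings with Reciprocal Rank Fusion (RRF). The following is a minimal, standalone sketch of that fusion idea only. It is not code from this demo and not the service's implementation; the function name, the sample document ids, and the constant `k=60` (a value commonly used in the RRF literature) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector query and a full text query.
vector_hits = ["a", "b", "c"]
text_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

Documents that rank well in both lists ("a" and "b" here) rise above documents that appear in only one, which is the behavior hybrid search relies on.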

Python/CosmosDB-NoSQL_SemanticSearchDemo/src/app/cosmos-app.py

Lines changed: 2 additions & 13 deletions
@@ -174,25 +174,14 @@ def debug_container_capabilities(container, container_name):
     ]
 }
 
-# Create listings_search container without any index
-# container_name = 'search'
-# st.session_state.cosmos_container = st.session_state.cosmos_database.create_container_if_not_exists(
-#     id=container_name,
-#     partition_key=PartitionKey(path="/id"),
-#     full_text_policy=full_text_policy,
-#     vector_embedding_policy=vector_embedding_policy#,
-#     #offer_throughput=1000
-# )
-
-
 # Create containers only if we have a valid database connection
 if st.session_state.cosmos_database is not None:
     # Create listings_search_qflat container with QFLAT vector index
     container_name_qflat = 'search_qflat'
     st.session_state.cosmos_container_qflat = st.session_state.cosmos_database.create_container_if_not_exists(
         id=container_name_qflat,
         partition_key=PartitionKey(path="/id"),
-        # full_text_policy=full_text_policy,  # Temporarily commented out for compatibility
+        full_text_policy=full_text_policy,
         vector_embedding_policy=vector_embedding_policy,
         indexing_policy=qflat_indexing_policy,
         offer_throughput=400
@@ -203,7 +192,7 @@ def debug_container_capabilities(container, container_name):
     st.session_state.cosmos_container_diskann = st.session_state.cosmos_database.create_container_if_not_exists(
         id=container_name_diskann,
         partition_key=PartitionKey(path="/id"),
-        # full_text_policy=full_text_policy,  # Temporarily commented out for compatibility
+        full_text_policy=full_text_policy,
         vector_embedding_policy=vector_embedding_policy,
         indexing_policy=diskann_indexing_policy,
         offer_throughput=400
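
Reviewer note: the hunks above re-enable `full_text_policy` in both container definitions, but the policy objects themselves fall outside the diff context. The sketch below shows the general shape such policies take in Azure Cosmos DB for NoSQL and how they line up as keyword arguments. The property names (`vector`, `text`), the 1536 dimensions, the cosine distance function, the `en-US` language, and the `container_kwargs` helper are assumptions for illustration, not values taken from this repository. `partition_key` is left out so the snippet runs without the SDK installed.

```python
# Hypothetical property names; the demo reads its own from configuration.
cosmos_vector_property = "vector"
cosmos_full_text_property = "text"

# Declares which path holds embeddings and how they are compared (assumed values).
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/" + cosmos_vector_property,
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 1536,
        }
    ]
}

# Declares which path holds text that full text search may tokenize (assumed values).
full_text_policy = {
    "defaultLanguage": "en-US",
    "fullTextPaths": [
        {"path": "/" + cosmos_full_text_property, "language": "en-US"}
    ],
}


def container_kwargs(name, indexing_policy):
    """Collect the keyword arguments the diff passes for one container.

    Hypothetical helper: the real app passes these inline and also
    supplies partition_key=PartitionKey(path="/id").
    """
    return {
        "id": name,
        "full_text_policy": full_text_policy,
        "vector_embedding_policy": vector_embedding_policy,
        "indexing_policy": indexing_policy,
        "offer_throughput": 400,
    }


qflat_kwargs = container_kwargs("search_qflat", indexing_policy={})
```

With the SDK available, a dictionary like this could be unpacked into `create_container_if_not_exists(**qflat_kwargs, partition_key=...)`, which matches the argument list shown in the diff.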

Python/CosmosDB-NoSQL_SemanticSearchDemo/src/data/data-loader.py

Lines changed: 11 additions & 21 deletions
@@ -96,13 +96,12 @@ def initialize_cosmos(database_name):
                 "path": "/" + cosmos_vector_property,
                 "type": "quantizedFlat",
             }
+        ],
+        "fullTextIndexes": [
+            {
+                "path": "/" + cosmos_full_text_property
+            }
         ]
-        # Note: fullTextIndexes require full_text_policy which is not supported in current SDK version
-        # "fullTextIndexes": [
-        #     {
-        #         "path": "/" + cosmos_full_text_property
-        #     }
-        # ]
     }
     diskann_indexing_policy = {
         "includedPaths": [
@@ -116,26 +115,17 @@ def initialize_cosmos(database_name):
                 "path": "/" + cosmos_vector_property,
                 "type": "diskANN",
             }
+        ],
+        "fullTextIndexes": [
+            {
+                "path": "/" + cosmos_full_text_property
+            }
         ]
-        # Note: fullTextIndexes require full_text_policy which is not supported in current SDK version
-        # "fullTextIndexes": [
-        #     {
-        #         "path": "/" + cosmos_full_text_property
-        #     }
-        # ]
     }
 
     # Create containers if they don't exist (same as main app)
     containers = {}
-
-    # Create search container without any index - commented out in main app
-    # container_name = 'search'
-    # containers[container_name] = database.create_container_if_not_exists(
-    #     id=container_name,
-    #     partition_key=PartitionKey(path="/id"),
-    #     full_text_policy=full_text_policy,
-    #     vector_embedding_policy=vector_embedding_policy
-    # )
+
 
     # Create search_qflat container with QFLAT vector index
     container_name_qflat = 'search_qflat'
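
Reviewer note: after this change the two indexing policies differ only in the vector index `type`, and the removed comment records that `fullTextIndexes` depend on a matching `full_text_policy`. A small sketch of how both points could be expressed follows. The helper names are hypothetical, and the `"/*"` included path is an assumption because the `includedPaths` contents are not visible in the diff.

```python
def build_indexing_policy(vector_index_type, vector_property, full_text_property):
    """Build an indexing policy with one vector index and one full text index.

    Hypothetical helper; includedPaths "/*" is an assumed default.
    """
    return {
        "includedPaths": [{"path": "/*"}],
        "vectorIndexes": [
            {"path": "/" + vector_property, "type": vector_index_type}
        ],
        "fullTextIndexes": [
            {"path": "/" + full_text_property}
        ],
    }


def undeclared_full_text_paths(indexing_policy, full_text_policy):
    """Return full text index paths that the full text policy does not declare."""
    declared = {p["path"] for p in full_text_policy.get("fullTextPaths", [])}
    return [
        index["path"]
        for index in indexing_policy.get("fullTextIndexes", [])
        if index["path"] not in declared
    ]


# Property names below are placeholders, not the repository's values.
qflat_policy = build_indexing_policy("quantizedFlat", "vector", "text")
diskann_policy = build_indexing_policy("diskANN", "vector", "text")
example_full_text_policy = {
    "defaultLanguage": "en-US",
    "fullTextPaths": [{"path": "/text", "language": "en-US"}],
}
missing = undeclared_full_text_paths(diskann_policy, example_full_text_policy)
```

Running a check like `undeclared_full_text_paths` before creating containers would surface a mismatch between the two policies locally instead of at container creation time.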
