Commit 870c394

Updates from 2025-12-02
1 parent a65a25d commit 870c394

46 files changed, +1903 -1162 lines changed

diskann/00_Introduction/README.md

Lines changed: 11 additions & 3 deletions
@@ -7,15 +7,23 @@
 - [Visual Studio Code](https://code.visualstudio.com/download)
 - [Python extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-python.python)
 - [Jupyter Notebook extension for VS Code](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter)
-- [Docker Desktop](https://www.docker.com/products/docker-desktop/) with [WSL 2 backend (if on Windows)](https://learn.docker.com/desktop/wsl/)
+- [Docker Desktop](https://www.docker.com/products/docker-desktop/) with [WSL 2 backend (if on Windows)](https://docs.docker.com/desktop/features/wsl/)
 - [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli)
 - [Bicep CLI](https://learn.microsoft.com/azure/azure-resource-manager/bicep/install#install-manually)
 - [Powershell](https://learn.microsoft.com/powershell/scripting/install/installing-powershell?view=powershell-7.5)
 
 ## Why use this guide?
 
-The future of software involves combining AI and data services, also known as intelligent applications. This guide is for developers looking to implement intelligent applications quickly while leveraging existing skills. The content will focus on the developer journey implementing an Azure-based AI-enabled GPT-based chat application that is augmented using data stored in Azure Cosmos DB for NoSQL while leveraging Azure OpenAI services.
+The future of software involves combining AI and data services, also known as intelligent applications.
+This guide is for developers looking to implement intelligent applications quickly while leveraging existing skills.
+The content will focus on the developer journey implementing an Azure-based AI-enabled GPT-based chat application that is augmented using data stored in Azure Cosmos DB for NoSQL while leveraging Azure OpenAI services.
 
 ## Introduction
 
-This guide will walks through the creating intelligent solutions that combines Azure Cosmos DB for NoSQL with vector search capabilities powered by DiskANN and document retrieval with Azure OpenAI services to build a chat bot experience. The guide includes labs that build and deploy a sample chat app using these technologies, with a focus on Azure Cosmos DB for NoSQL, vector search powered by DiskANN, and Azure OpenAI using the Python programming language. For those new to using Azure OpenAI and Vector Search technologies, the guide includes explanations of the core concepts and techniques used when implementing these technologies.
+This guide will walks through the creating intelligent solutions that combines Azure Cosmos DB for NoSQL with vector search capabilities powered by DiskANN and document retrieval with Azure OpenAI services to build a chat bot experience.
+The guide includes labs that build and deploy a sample chat app using these technologies, with a focus on Azure Cosmos DB for NoSQL, vector search powered by DiskANN, and Azure OpenAI using the Python programming language.
+For those new to using Azure OpenAI and Vector Search technologies, the guide includes explanations of the core concepts and techniques used when implementing these technologies.
+
+> **Note:** This developer guide is targeted towards Python developers.
+
+If you are a Node.js developer, then you may be interested in the Node.js version here: [https://github.com/AzureCosmosDB/Azure-OpenAI-Node.js-Developer-Guide](https://github.com/AzureCosmosDB/Azure-OpenAI-Node.js-Developer-Guide)

diskann/01_Azure_Overview/README.md

Lines changed: 77 additions & 34 deletions
Large diffs are not rendered by default.
Lines changed: 22 additions & 9 deletions
@@ -1,25 +1,38 @@
 # Overview of Azure Cosmos DB
 
-[Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db/introduction) is a globally distributed database for storing and querying both NoSQL and vector data, with a serverless option. It has multiple APIs, the most notable being the native NoSQL document API and MongoDB API. It provides turnkey global distribution, elastic and dynamic scaling of throughput and storage, and a comprehensive SLA (service level agreement) for single-digit millisecond latency and 99.999% high-availability.
+[Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db/introduction) is a globally distributed, multi-model database service for both NoSQL and relational workloads.
+It supports multiple APIs: NoSQL, MongoDB, PostgreSQL, Cassandra, Gremlin, and Table—covering document, relational, column-family, graph, and key-value data models.
+The service offers turnkey global distribution with elastic scaling of throughput and storage.
+It delivers single-digit millisecond latencies at the 99th percentile and guarantees high availability through multi-homing capabilities.
+Azure Cosmos DB provides comprehensive service level agreements (SLAs) covering throughput, latency, availability, and consistency—a unique combination among cloud database services.
 
 ## Azure Cosmos DB and AI
 
-The surge of AI-powered applications has led to the need to integrate operational data from multiple data stores, introducing another layer of complexity as each data store tends to have its own workflow and operational performance. Azure Cosmos DB simplifies this process by providing a unified platform for all data types, including AI data. In particular, its support for vector storage and retrieval is a game-changer for generative AI applications. By representing complex data elements like text, images, or sound as high-dimensional vectors, Azure Cosmos DB allows for efficient storage, indexing, and querying of these vectors, which is crucial for many generative AI tasks.
-
-Unlike traditional databases requiring separate workarounds for different data types, Azure Cosmos DB supports multiple data models within a single, integrated environment. This simplification means you can leverage the same robust platform for all your AI data needs. Many AI applications rely on external stand-alone vector stores, which can be cumbersome to manage and maintain. Azure Cosmos DB's native support for vector storage and retrieval eliminates the need for these external stores as all the application's data is located in a single place thus streamlining the development and deployment of AI applications. These features enable the building, deploying, and scaling of AI applications to be more efficient and reliable, making Azure Cosmos DB an ideal choice for handling the complex data requirements of modern generative AI solutions.
+The surge of AI-powered applications has led to the need to integrate data from multiple data stores, introducing another layer of complexity as each data store tends to have its own workflow and operational performance.
+Azure Cosmos DB simplifies this process by providing a unified platform for all data types, including AI data.
+Azure Cosmos DB supports relational, document, vector, key-value, graph, and table data models, making it an ideal platform for AI applications.
+The wide array of data model support combined with guaranteed high availability, high throughput, low latency, and tunable consistency are huge advantages when building these types of applications.
 
 ## Azure Cosmos DB for NoSQL
 
 The focus for this developer guide is [Azure Cosmos DB for NoSQL](https://learn.microsoft.com/azure/cosmos-db/nosql/) and [Vector Search](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search).
 
 ### Azure Cosmos DB for NoSQL capacity modes
 
-Azure Cosmos DB offers three capacity modes: provisioned throughput, serverless and autoscale modes. creating an Azure Cosmos DB account, it's essential to evaluate the workload's characteristics in order to choose the appropriate mode to optimize both performance and cost efficiency.
+Azure Cosmos DB offers three capacity modes: provisioned throughput, serverless and autoscale modes.
+When creating an Azure Cosmos DB account, it's essential to evaluate the workload's characteristics in order to choose the appropriate mode to optimize both performance and cost efficiency.
 
-[**Serverless mode**](https://learn.microsoft.com/azure/cosmos-db/serverless) offers a more flexible and pay-as-you-go approach, where only the Request Units consumed are billed. This is particularly advantageous for applications with sporadic or unpredictable usage patterns, as it eliminates the need to provision resources upfront.
+[**Serverless mode**](https://learn.microsoft.com/azure/cosmos-db/serverless) offers a more flexible and pay-as-you-go approach, where only the Request Units consumed are billed.
+This is particularly advantageous for applications with sporadic or unpredictable usage patterns, as it eliminates the need to provision resources upfront.
 
-[**Provisioned throughput mode**](https://learn.microsoft.com/azure/cosmos-db/set-throughput) allocates a fixed amount of resources, measured in [Request Units per second (RUs/s)](https://learn.microsoft.com/azure/cosmos-db/request-units), which is ideal for applications with predictable and steady workloads. This ensures consistent performance and can be more cost-effective when there is a constant or high demand for database operations. RU/s can be set at both the database and container levels, allowing for fine-grained control over resource allocation.
+[**Provisioned throughput mode**](https://learn.microsoft.com/azure/cosmos-db/set-throughput) allocates a fixed amount of resources, measured in [Request Units per second (RUs/s)](https://learn.microsoft.com/azure/cosmos-db/request-units), which is ideal for applications with predictable and steady workloads.
+This ensures consistent performance and can be more cost-effective when there is a constant or high demand for database operations.
+RU/s can be set at both the database and container levels, allowing for fine-grained control over resource allocation.
 
-[**Autoscale mode**](https://learn.microsoft.com/azure/cosmos-db/provision-throughput-autoscale) builds upon the provisioned throughput mode but allows for the database or container automatically and instantly scale up or down resources based on demand, ensuring that the application can handle varying workloads efficiently. When configuring autoscale, a maximum (Tmax) value threshold is set for a predictable maximum cost. This mode is suitable for applications with fluctuating usage patterns or infrequently used applications.
+[**Autoscale mode**](https://learn.microsoft.com/azure/cosmos-db/provision-throughput-autoscale) builds upon the provisioned throughput mode but allows for the database or container automatically and instantly scale up or down resources based on demand, ensuring that the application can handle varying workloads efficiently.
+When configuring autoscale, a maximum (Tmax) value threshold is set for a predictable maximum cost.
+This mode is suitable for applications with fluctuating usage patterns or infrequently used applications.
 
-[**Dynamic scaling**](https://learn.microsoft.com/azure/cosmos-db/autoscale-per-partition-region) allows for the automatic and independent scaling of non-uniform workloads across regions and partitions according to usage patterns. For instance, in a disaster recovery configuration with two regions, the primary region may experience high traffic while the secondary region can scale down to idle, thereby saving costs. This approach is also highly effective for multi-regional applications, where traffic patterns fluctuate based on the time of day in each region.
+[**Dynamic scaling**](https://learn.microsoft.com/azure/cosmos-db/autoscale-per-partition-region) allows for the automatic and independent scaling of non-uniform workloads across regions and partitions according to usage patterns.
+For instance, in a disaster recovery configuration with two regions, the primary region may experience high traffic while the secondary region can scale down to idle, thereby saving costs.
+This approach is also highly effective for multi-regional applications, where traffic patterns fluctuate based on the time of day in each region.
