Merged
2 changes: 2 additions & 0 deletions README.md
@@ -17,6 +17,8 @@ Garnet is a new remote cache-store from Microsoft Research, that offers several

This repo contains the code to build and run Garnet. For more information and documentation, check out our website at [https://microsoft.github.io/garnet](https://microsoft.github.io/garnet).

**Looking for a fully managed service?** [Azure Cosmos DB Garnet Cache](https://microsoft.github.io/garnet/docs/azure/overview) provides Garnet as a fully managed, enterprise-ready caching solution with built-in high availability, performance guarantees and zero infrastructure management.

## Feature Summary

Garnet implements a wide range of APIs including raw strings (e.g., gets, sets, and key expiration), analytical (e.g., HyperLogLog and Bitmap), and object (e.g., sorted sets and lists)
286 changes: 286 additions & 0 deletions website/docs/azure/api-compatibility.md

Large diffs are not rendered by default.

191 changes: 191 additions & 0 deletions website/docs/azure/cluster-configuration.md
@@ -0,0 +1,191 @@
---
id: cluster-configuration
sidebar_label: Cluster Configuration
title: Cluster Configuration for Azure Cosmos DB Garnet Cache
---

# Cluster Configuration for Azure Cosmos DB Garnet Cache

## Available Tiers

Azure Cosmos DB Garnet Cache lets you choose the underlying [Azure Virtual Machine](https://learn.microsoft.com/azure/virtual-machines/sizes/overview) that your cache nodes will be provisioned on. The specs offered by cache nodes mirror the Azure virtual machine itself. Garnet doesn't limit the number of client connections that can be made on any node for any SKU. When choosing the right tier and SKU for your workload, consider that roughly 30% of memory on each node will be reserved for metadata and processing requests. Smaller SKUs in each tier are classified as [dev/test](#dev-test) while larger SKUs are designed for [production](#production) workloads.

Every node also has a [Premium SSD Managed Disk](https://learn.microsoft.com/azure/virtual-machines/disks-types#premium-ssds) provisioned for [data persistence](./resiliency.md#data-persistence). The disk size is not configurable and represents 2x the total memory of each node. The Managed Disk SKU provisioned for each option is in the table below, and is priced at the [Azure Managed Disk price](https://azure.microsoft.com/pricing/details/managed-disks).
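
The sizing rules above can be sketched as a small calculation. This is an illustrative estimate only: the helper names are hypothetical, the Premium SSD sizes are Azure's published capacities in GiB (the GB/GiB difference is ignored), and the shard estimate assumes data is spread evenly across shards.

```python
import math

# Figures from the text above: roughly 30% of node memory is reserved,
# and the persistence disk holds 2x the node's total memory.
RESERVED_PERCENT = 30
DISK_TO_MEMORY_RATIO = 2

# Azure Premium SSD capacities in GiB (only the sizes used in the tables).
PREMIUM_SSD_GIB = {"P1": 4, "P2": 8, "P3": 16, "P4": 32,
                   "P6": 64, "P10": 128, "P15": 256, "P20": 512}


def usable_memory_gb(node_memory_gb):
    """Approximate memory left for data after the reserved share."""
    return node_memory_gb * (100 - RESERVED_PERCENT) / 100


def disk_tier(node_memory_gb):
    """Smallest Premium SSD size that holds 2x the node's memory."""
    required = node_memory_gb * DISK_TO_MEMORY_RATIO
    for name, size in sorted(PREMIUM_SSD_GIB.items(), key=lambda kv: kv[1]):
        if size >= required:
            return name
    raise ValueError("no listed disk size is large enough")


def shards_needed(dataset_gb, node_memory_gb):
    """Rough shard count for a dataset, assuming an even key distribution."""
    return math.ceil(dataset_gb / usable_memory_gb(node_memory_gb))
```

For example, `shards_needed(100, 32)` estimates how many 32 GB nodes a 100 GB dataset would need once the reserved memory is subtracted. Treat the result as a starting point rather than a guarantee.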

The pricing model for cache nodes is instance-based and there are no licensing fees. For information about pricing for specific SKUs, reach out to [[email protected]](mailto:[email protected]).

### General Purpose

A balanced tier suitable for most caching workloads, with a good mix of compute, memory, and network resources.

- **Use Cases**: Balanced workloads, general caching, development and testing

|SKU |vCPUs |Memory (GB) |Network bandwidth (Mbps) |Premium SSD Managed Disk |Cluster Type |
|----|------|------------|-------------------------|-------------------------|-------------|
|Standard_B2ls_v2 |2 |4 |6250 |P2 |Dev/ Test |
|Standard_B2als_v2|2 |4 |6250 |P2 |Dev/ Test |
|Standard_D2s_v5 |2 |8 |12500 |P3 |Dev/ Test |
|Standard_D4s_v5 |4 |16 |12500 |P4 |Dev/ Test |
|Standard_D8s_v5 |8 |32 |12500 |P6 |Production |
|Standard_D16s_v5 |16 |64 |12500 |P10 |Production |
|Standard_D32s_v5 |32 |128 |16000 |P15 |Production |
|Standard_D2as_v5 |2 |8 |12500 |P3 |Dev/ Test |
|Standard_D4as_v5 |4 |16 |12500 |P4 |Dev/ Test |
|Standard_D8as_v5 |8 |32 |12500 |P6 |Production |
|Standard_D16as_v5|16 |64 |12500 |P10 |Production |
|Standard_D32as_v5|32 |128 |16000 |P15 |Production |
|Standard_D2s_v4 |2 |8 |5000 |P3 |Dev/ Test |
|Standard_D4s_v4 |4 |16 |10000 |P4 |Dev/ Test |
|Standard_D8s_v4 |8 |32 |12500 |P6 |Production |
|Standard_D16s_v4 |16 |64 |12500 |P10 |Production |
|Standard_D32s_v4 |32 |128 |16000 |P15 |Production |

### Memory Optimized

High-memory tier designed for workloads requiring large in-memory datasets with optimized memory-to-CPU ratios.

- **Use Cases**: Large datasets, gaming leaderboards, vector search workloads

|SKU |vCPUs |Memory (GB) |Network bandwidth (Mbps) |Premium SSD Managed Disk |Cluster Type |
|----|------|------------|-------------------------|-------------------------|-------------|
|Standard_E2s_v5 |2 |16 |12500 |P4 |Dev/ Test |
|Standard_E4s_v5 |4 |32 |12500 |P6 |Dev/ Test |
|Standard_E8s_v5 |8 |64 |12500 |P10 |Production |
|Standard_E16s_v5 |16 |128 |12500 |P15 |Production |
|Standard_E20s_v5 |20 |160 |12500 |P20 |Production |
|Standard_E32s_v5 |32 |256 |16000 |P20 |Production |
|Standard_E2as_v5 |2 |16 |12500 |P4 |Dev/ Test |
|Standard_E4as_v5 |4 |32 |12500 |P6 |Dev/ Test |
|Standard_E8as_v5 |8 |64 |12500 |P10 |Production |
|Standard_E16as_v5|16 |128 |12500 |P15 |Production |
|Standard_E20as_v5|20 |160 |12500 |P20 |Production |
|Standard_E32as_v5|32 |256 |16000 |P20 |Production |
|Standard_E2s_v4 |2 |16 |5000 |P4 |Dev/ Test |
|Standard_E4s_v4 |4 |32 |10000 |P6 |Dev/ Test |
|Standard_E8s_v4 |8 |64 |12500 |P10 |Production |
|Standard_E16s_v4 |16 |128 |12500 |P15 |Production |
|Standard_E20s_v4 |20 |160 |10000 |P20 |Production |
|Standard_E32s_v4 |32 |256 |16000 |P20 |Production |


### Cluster Types

There are two cluster types. The cluster type determines which SKUs are available and which performance guarantees are offered.

#### Dev/ Test

Development and testing SKUs are designed for non-production workloads with cost optimization and flexibility in mind. They are a good fit for feature testing and integration validation and are offered without SLAs. You may see lower throughput and higher latencies when using these SKUs. All features, including scaling out across shards, are available on Dev/ Test SKUs.

#### Production

Production SKUs are configured for high availability, performance, and reliability. They are a good fit for mission-critical applications that need high throughput and consistent low latency.


## Scaling Options

Azure Cosmos DB Garnet Cache provides flexible scaling options to meet your application's changing demands. Understanding when and how to scale your cache cluster is essential for maintaining optimal performance while controlling costs.

### Choosing Your Scaling Strategy

The decision between vertical and horizontal scaling depends on your specific workload characteristics and performance requirements. Vertical scaling offers simplicity and is ideal when you need more resources per node, while horizontal scaling provides better distribution and resilience for high-throughput scenarios.

#### Vertical Scaling (Scale Up/Down)

Vertical scaling involves changing the SKU of your existing cache nodes to increase or decrease their individual capacity. This approach maintains your current cluster topology while providing more or fewer resources per node. You can scale up SKU size in place within the same tier and generation.

**When to Scale Up:**
Vertical scaling is most effective when your workload benefits from having more resources concentrated on fewer nodes. This approach reduces network overhead between nodes and simplifies data management. Consider scaling up when you need increased memory capacity for larger datasets or higher CPU performance for complex operations.

Vector search workloads are particularly well-suited for vertical scaling because they benefit significantly from having the entire dataset available on a single node. Vector similarity searches require access to large portions of the dataset to compute accurate results, and distributing vectors across multiple nodes can introduce latency and complexity. By scaling up to larger SKUs, vector search applications can maintain all vectors in memory on a single node, enabling faster index traversal and more efficient similarity computations.

**Benefits of Vertical Scaling:**
The primary advantage of vertical scaling is operational simplicity, as it maintains your existing cluster topology while providing enhanced performance.

#### Horizontal Scaling (Scale Out/In)

Horizontal scaling involves adding or removing nodes from your cluster to distribute load across more instances. You can scale horizontally by adding more shards to increase memory footprint and write throughput, or by increasing the replication factor to improve read throughput and availability.

**When to Scale Out:**
Horizontal scaling becomes essential when your workload exceeds the capacity limits of individual nodes or when you need to distribute load for better performance. This approach is particularly effective for applications with high concurrent user loads or when you need to improve read performance through additional replicas.

**Scaling with Shards vs Replicas:**
Adding shards increases your total memory capacity and write throughput by distributing data across multiple primary nodes. Each shard handles a portion of your keyspace, allowing for parallel processing of operations. Alternatively, adding replicas primarily improves read throughput and provides better availability, as read operations can be distributed across multiple copies of your data. The [replication factor](./resiliency.md#replication) you choose directly impacts both performance and resiliency characteristics of your cluster.
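
To illustrate how a keyspace can be split across shards, here is a minimal sketch of the Redis Cluster key-to-slot convention (CRC16 of the key modulo 16384, with `{hash tag}` support). It assumes Azure Cosmos DB Garnet Cache follows that convention, which this page does not state; `key_slot` is a hypothetical helper name, not part of any Garnet API.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in the Redis Cluster convention


def key_slot(key: str) -> int:
    """Map a key to a hash slot using CRC16 (XMODEM) modulo 16384."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains "{...}" with a non-empty body,
    # only the text between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 computes CRC16-CCITT (XMODEM).
    return binascii.crc_hqx(data, 0) % SLOT_COUNT
```

Under this convention, keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot and therefore on the same shard, which matters for multi-key operations.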

**Benefits of Horizontal Scaling:**
Horizontal scaling improves fault tolerance, since the failure of an individual node has less impact on overall system availability. It can also improve resource utilization and lets the cluster grow incrementally by adding nodes.

### How to Scale

The **Settings > Cluster Explorer** page of the [Azure portal](https://aka.ms/garnet-portal) allows you to scale your cluster both vertically and horizontally. The Azure Cosmos DB Garnet Cache is in an expanded Private Preview and you must access the Azure portal through this link to manage your caches.

![Cluster Explorer](../../static/img/azure/cluster-explorer.png)

You can change the shard count to scale in or out, or change the SKU size to scale down or up. The replication factor can only be configured during cluster provisioning and cannot be updated in place on existing clusters.

![Scale Cluster](../../static/img/azure/scale-cluster.png)

### Right-Sizing Your Deployment

You can optimize the size of your Azure Cosmos DB Garnet Cache by monitoring and adjusting based on actual usage patterns. Starting with conservative estimates and scaling based on observed metrics typically provides the most cost-effective approach while ensuring performance requirements are met.

We recommend starting with a smaller SKU that meets your initial requirements, then monitoring key metrics such as memory utilization, CPU usage, and command processing rates. Regular review of these metrics allows you to make informed decisions about when and how to scale your deployment. Watch for sustained high memory utilization that might indicate a need for additional capacity, increased latency that could benefit from more processing power, or uneven load distribution that might be addressed through horizontal scaling. The key is to identify trends before they impact user experience, allowing for proactive scaling rather than reactive responses to performance issues.
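
As a rough illustration of this review loop, the sketch below averages a window of utilization samples and suggests a direction. The thresholds, the `Sample` type, and the `recommend` function are all hypothetical; they are not service defaults, and real decisions should also weigh latency and per-shard load.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    memory_pct: float  # memory utilization, 0-100
    cpu_pct: float     # CPU utilization, 0-100


# Hypothetical thresholds; tune them for your own workload.
HIGH_MEMORY_PCT = 80.0
HIGH_CPU_PCT = 75.0
LOW_UTILIZATION_PCT = 30.0


def recommend(samples):
    """Suggest a scaling direction from sustained (averaged) utilization."""
    if not samples:
        raise ValueError("at least one sample is required")
    avg_memory = mean(s.memory_pct for s in samples)
    avg_cpu = mean(s.cpu_pct for s in samples)
    if avg_memory >= HIGH_MEMORY_PCT:
        return "add capacity: scale up the SKU or add shards"
    if avg_cpu >= HIGH_CPU_PCT:
        return "add processing power: scale up the SKU or add shards"
    if avg_memory <= LOW_UTILIZATION_PCT and avg_cpu <= LOW_UTILIZATION_PCT:
        return "consider scaling down or in"
    return "no change"
```

Averaging over a window, rather than reacting to single readings, is one simple way to capture the "sustained" condition described above.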


## Regional Availability

Each Azure Cosmos DB Garnet Cache cluster is provisioned in a single region. The service is available in multiple Azure regions worldwide, with ongoing expansion to additional regions. The availability of each SKU in a given region depends on the Azure Virtual Machine regional availability. You can verify which SKUs are available in each region in the [Azure products by region table](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/table).

Additionally, you can configure availability zones during provisioning in supported Azure regions where there is capacity for your chosen SKU. See the list of [Azure regions with availability zone support](https://learn.microsoft.com/azure/reliability/regions-list).

| Geography | Region | Region Name |
|-----------|--------|-------------|
| **Americas** | canadacentral | Canada Central |
| | canadaeast | Canada East |
| | centralus | Central US |
| | eastus | East US |
| | eastus2 | East US 2 |
| | northcentralus | North Central US |
| | southcentralus | South Central US |
| | westcentralus | West Central US |
| | westus | West US |
| | westus2 | West US 2 |
| | westus3 | West US 3 |
| | brazilsouth | Brazil South |
| | brazilsoutheast | Brazil Southeast |
| **Europe** | northeurope | North Europe |
| | westeurope | West Europe |
| | francecentral | France Central |
| | germanynorth | Germany North |
| | germanywestcentral | Germany West Central |
| | italynorth | Italy North |
| | norwayeast | Norway East |
| | norwaywest | Norway West |
| | swedencentral | Sweden Central |
| | swedensouth | Sweden South |
| | switzerlandnorth | Switzerland North |
| | switzerlandwest | Switzerland West |
| | uksouth | UK South |
| | ukwest | UK West |
| **Africa** | southafricanorth | South Africa North |
| | southafricawest | South Africa West |
| **Middle East** | uaecentral | UAE Central |
| | uaenorth | UAE North |
| **Asia Pacific** | australiaeast | Australia East |
| | australiasoutheast | Australia Southeast |
| | centralindia | Central India |
| | southindia | South India |
| | westindia | West India |
| | eastasia | East Asia |
| | southeastasia | Southeast Asia |
| | japaneast | Japan East |
| | japanwest | Japan West |
| | koreacentral | Korea Central |
| | koreasouth | Korea South |


## Learn More

- [Getting Started](./quickstart.md)
- [Resiliency](./resiliency.md)
- [Security](./security.md)
- [Monitoring](./monitoring.md)
107 changes: 107 additions & 0 deletions website/docs/azure/faq.md
@@ -0,0 +1,107 @@
---
id: faq
sidebar_label: FAQ
title: Frequently Asked Questions
---

# Frequently Asked Questions

## General Questions

### What is Azure Cosmos DB Garnet Cache?
Azure Cosmos DB Garnet Cache is a fully managed, high-performance caching service built on the Garnet remote cache-store from Microsoft Research. It provides Redis protocol compatibility with ultra-low latency and enterprise-grade security, scalability, and reliability.

### How can I access the Preview?
Azure Cosmos DB Garnet Cache is currently in an expanded Private Preview. Please [sign up](https://aka.ms/cosmos-db-garnet-preview) to join the preview.

### How is it different from self-hosted Garnet?
As a fully managed service, Azure Cosmos DB Garnet Cache handles infrastructure provisioning, scaling, patching, and monitoring automatically. Self-hosted Garnet requires you to manage the infrastructure, updates, and operations yourself.

### Does it only work with Azure Cosmos DB?
No, Azure Cosmos DB Garnet Cache can be used to accelerate data access for any application, including but not limited to use with Azure Cosmos DB. It doesn't use the Azure Cosmos DB SDKs or have any automatic syncing.

### Is it compatible with Redis clients?
Yes. Azure Cosmos DB Garnet Cache speaks RESP, the Redis serialization protocol, so existing Redis clients in all major programming languages work without code changes, provided they use [supported commands](./api-compatibility.md).

### What Redis version is supported?
Azure Cosmos DB Garnet Cache supports the RESP protocol and doesn't have full support for any specific Redis version. Visit the list of [supported Redis commands](./api-compatibility.md).

### How is Azure Cosmos DB Garnet Cache priced?
Azure Cosmos DB Garnet Cache clusters are billed per instance per hour with no licensing fees. Each node is billed for the chosen SKU plus an attached disk, used for [data persistence](./resiliency.md#data-persistence), sized at 2x the node's memory. Pricing per SKU is set at different rates than the underlying Azure VM and is subject to change between the expanded Private Preview and Public Preview.

For information about pricing for specific SKUs, reach out to [[email protected]](mailto:[email protected]).


## Performance and Scalability

### What performance can I expect?
Latency is typically sub-millisecond and is around 3 ms at the 99th percentile. Performance and throughput vary by tier, key/value size, and number of concurrent requests, among other factors.

### Can I scale my cache?
Yes, you can [scale out](./cluster-configuration.md#horizontal-scaling-scale-outin) by adding shards, or [scale up](./cluster-configuration.md#vertical-scaling-scale-updown) by changing SKU size within a VM family and generation with no downtime.

### How many connections are supported?
Garnet itself doesn't limit the number of client connections that can be made on any node for any SKU. In practice, connection limits come from the underlying virtual machine and vary by SKU. See the [virtual machine limits](https://learn.microsoft.com/azure/virtual-machines/sizes/overview#list-of-vm-size-families-by-type) corresponding to the [Azure Cosmos DB Garnet Cache SKU](./cluster-configuration.md#available-tiers) you choose.


## Development and Integration

### Which client libraries are supported?
All Redis client libraries are supported. Check the list of [supported commands](./api-compatibility.md) before relying on a specific command. Popular libraries by language include:
- **C#**: [StackExchange.Redis](https://github.com/StackExchange/StackExchange.Redis)
- **Java**: [Jedis](https://github.com/redis/jedis), [Redisson](https://github.com/redisson/redisson)
- **Python**: [redis-py](https://github.com/redis/redis-py)
- **Node.js**: [node-redis](https://github.com/redis/node-redis)
- **Go**: [go-redis](https://github.com/redis/go-redis)

### Can I test it locally?
For local development, you can use the self-hosted Garnet server.
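
As a minimal sketch of a local smoke test, the snippet below uses [redis-py](https://github.com/redis/redis-py) against a self-hosted Garnet server. It assumes the server listens on `localhost:6379` without TLS; the helper names `client_options` and `smoke_test` are hypothetical, and the timeout value is only an example.

```python
def client_options(host="localhost", port=6379, use_tls=False, timeout_s=2.0):
    """Build keyword arguments for redis.Redis().

    The host, port, and timeout defaults are assumptions for a local
    self-hosted Garnet server; adjust them for your environment.
    """
    return {
        "host": host,
        "port": port,
        "ssl": use_tls,
        "socket_connect_timeout": timeout_s,  # seconds to wait when connecting
        "socket_timeout": timeout_s,          # seconds to wait for each reply
        "decode_responses": True,             # return str instead of bytes
    }


def smoke_test(options):
    """Write and read back one key; requires a running server."""
    import redis  # third-party package: pip install redis

    client = redis.Redis(**options)
    client.set("greeting", "hello", ex=60)  # expire after 60 seconds
    return client.get("greeting")
```

With a local server running, `smoke_test(client_options())` should return the value that was just written. Passing `use_tls=True` and a different host is one way to point the same code at a TLS-enabled endpoint, although the managed service also requires the network and access-control setup described in the security documentation.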


## Regional Availability

### Which regions is it available in?
Azure Cosmos DB Garnet Cache is available in many Azure regions. Check our [supported regions list](./cluster-configuration.md#regional-availability).

### Which regions support Availability Zones?
Azure Cosmos DB Garnet Cache can be configured with availability zones during provisioning in supported Azure regions where there is capacity for your chosen SKU. See the list of [Azure regions with availability zone support](https://learn.microsoft.com/azure/reliability/regions-list).


## Troubleshooting

### My application can't connect to the cache
Check these common issues:
1. **VNet configuration**: Verify your client application is in the [same virtual network](./security.md#network-security) as your cache
2. **Authentication**: Verify you've configured [role-based access control](./security.md#authentication-and-access-control), which is required for data plane access
3. **SSL/TLS**: Ensure your client supports [TLS](./security.md#data-encryption)

### Cache performance is slower than expected
Common causes and solutions:
1. **Connection pooling**: Ensure proper connection pool configuration
2. **Hot keys**: Check for key access patterns causing bottlenecks
3. **Memory pressure**: [Monitor memory usage](./monitoring.md#high-memory-usage) and consider scaling up
4. **Network latency**: Test from the same region as your cache

### I'm getting timeout errors
Troubleshooting steps:
1. **Check timeout settings**: Verify client timeout configuration
2. **Monitor CPU/memory**: [High resource usage](./monitoring.md#high-cpu-usage) can cause timeouts
3. **Network issues**: Test network connectivity between client and cache


## Getting Help

### Where can I get technical support?
- [**Documentation**](./overview.md): For conceptual guidance and tutorials
- **Azure Support**: Create support tickets for technical issues
- **Community**: Stack Overflow and Azure forums

### How do I report bugs or request features?
- **Azure Support** for bugs and critical issues
- [**Feedback Form**](https://aka.ms/garnet-feedback) for feature requests or suggestions


## Next Steps

- [Garnet Overview](./overview.md)
- [Getting Started](./quickstart.md)