Commit 6c4de5b

Merge pull request #246622 from xgerman/patch-6
Update best-practice-performance.md
2 parents 48cdffa + 8cc66ef commit 6c4de5b

1 file changed

+19
-6
lines changed

articles/managed-instance-apache-cassandra/best-practice-performance.md

Lines changed: 19 additions & 6 deletions
@@ -147,7 +147,7 @@ For more information refer to [Virtual Machine and disk performance](../virtual-
147147

148148
### Network performance
149149

150-
In most cases network performance is sufficient. However, if you are frequently streaming data (such as frequent horizontal scale-up/scale down) or there are huge ingress/egress data movements, this can become a problem. You may need to evaluate the network performance of your SKU. For example, the `Standard_DS14_v2` SKU supports 12,000 Mb/s, compare this to the byte-in/out in the metrics:
150+
In most cases network performance is sufficient. However, if you're frequently streaming data (such as frequent horizontal scale-up/scale-down) or there are huge ingress/egress data movements, this can become a problem. You may need to evaluate the network performance of your SKU. For example, the `Standard_DS14_v2` SKU supports 12,000 Mb/s; compare this to the bytes in/out shown in the metrics:
151151

152152

153153
:::image type="content" source="./media/best-practice-performance/metrics-network.png" alt-text="Screenshot of network metrics." lightbox="./media/best-practice-performance/metrics-network.png" border="true":::
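
The comparison against the SKU limit can be sketched as a small calculation. This is a minimal illustration, assuming the metric is reported in bytes per second and that Mb means 10^6 bits; the helper name and the observed value are hypothetical and not part of the article.

```python
def network_utilization(bytes_per_second: float, sku_limit_mbps: float = 12_000) -> float:
    """Return the fraction of the SKU's network bandwidth in use.

    Assumes the metric is in bytes per second and 1 Mb = 1,000,000 bits.
    The default limit is the 12,000 Mb/s quoted for Standard_DS14_v2.
    """
    megabits_per_second = bytes_per_second * 8 / 1_000_000
    return megabits_per_second / sku_limit_mbps


# Hypothetical reading taken from the bytes-in/out metric.
observed_bytes_per_second = 900_000_000
print(f"{network_utilization(observed_bytes_per_second):.0%} of SKU bandwidth")
```

A sustained value close to 100% would suggest the SKU's network limit is the bottleneck rather than disk or CPU.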
@@ -162,14 +162,14 @@ If you only see the network elevated for a small number of nodes, you might have
162162

163163
### Too many connected clients
164164

165-
Deployments should be planned and provisioned to support the maximum number of parallel requests required for the desired latency of an application. For a given deployment, introducing more load to the system above a minimum threshold increases overall latency. Monitor the number of connected clients to ensure this does not exceed tolerable limits.
165+
Deployments should be planned and provisioned to support the maximum number of parallel requests required for the desired latency of an application. For a given deployment, introducing more load to the system above a minimum threshold increases overall latency. Monitor the number of connected clients to ensure this doesn't exceed tolerable limits.
166166

167167
:::image type="content" source="./media/best-practice-performance/metrics-connections.png" alt-text="Screenshot of connected client metrics." lightbox="./media/best-practice-performance/metrics-connections.png" border="true":::
168168

169169

170170
### Disk space
171171

172-
In most cases, there is sufficient disk space as default deployments are optimized for IOPS, which leads to low utilization of the disk. Nevertheless, we advise occasionally reviewing disk space metrics. Cassandra accumulates a lot of disk and then reduces it when compaction is triggered. Hence it is important to review disk usage over longer periods to establish trends - like compaction unable to recoup space.
172+
In most cases, there's sufficient disk space as default deployments are optimized for IOPS, which leads to low utilization of the disk. Nevertheless, we advise occasionally reviewing disk space metrics. Cassandra accumulates a lot of data on disk and then reduces it when compaction is triggered. Hence it is important to review disk usage over longer periods to establish trends, such as compaction being unable to reclaim space.
173173

174174
> [!NOTE]
175175
> In order to ensure available space for compaction, disk utilization should be kept to around 50%.
@@ -188,7 +188,7 @@ Our default formula assigns half the VM's memory to the JVM with an upper limit
188188

189189
In most cases memory gets reclaimed effectively by the Java garbage collector, but especially if the CPU is often above 80%, there aren't enough CPU cycles left for the garbage collector. So any CPU performance problems should be addressed before memory problems.
190190

191-
If the CPU hovers below 70%, and the garbage collection isn't able to reclaim memory, you might need more JVM memory. This is especially the case if you are on a SKU with limited memory. In most cases, you will need to review your queries and client settings and reduce `fetch_size` along with what is chosen in `limit` within your CQL query.
191+
If the CPU hovers below 70%, and the garbage collection isn't able to reclaim memory, you might need more JVM memory. This is especially the case if you're on a SKU with limited memory. In most cases, you'll need to review your queries and client settings and reduce `fetch_size` along with what is chosen in `limit` within your CQL query.
192192

193193
If you indeed need more memory, you can:
194194

@@ -222,11 +222,24 @@ You might encounter this warning in the [CassandraLogs](monitor-clusters.md#crea
222222

223223
`Writing large partition <table> (105.426MiB) to sstable <file>`
224224

225-
This indicates a problem in the data model. Here is a [stack overflow article](https://stackoverflow.com/questions/74024443/how-do-i-analyse-and-solve-writing-large-partition-warnings-in-cassandra) that goes into more detail. This can cause severe performance issues and needs to be addressed.
225+
This indicates a problem in the data model. Here's a [Stack Overflow article](https://stackoverflow.com/questions/74024443/how-do-i-analyse-and-solve-writing-large-partition-warnings-in-cassandra) that goes into more detail. This can cause severe performance issues and needs to be addressed.
226+
227+
## Specialized optimizations
228+
### Compression
229+
Cassandra allows the selection of an appropriate compression algorithm when a table is created (see [Compression](https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html)). The default is LZ4, which is excellent
230+
for throughput and CPU but consumes more space on disk. Using Zstd (Cassandra 4.0 and up) saves about 12% space with
231+
minimal CPU overhead.
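
As an illustration of switching a table to Zstd, the sketch below builds the corresponding CQL statement. The helper name, keyspace, and table are hypothetical placeholders; `ZstdCompressor` is the compressor class name used by Cassandra 4.0 and later. The statement still needs to be run through `cqlsh` or a driver, and existing SSTables keep their old compression until they are rewritten (for example, by compaction).

```python
def zstd_compression_cql(keyspace: str, table: str) -> str:
    """Build an ALTER TABLE statement that switches a table to Zstd compression.

    Only builds the string; executing it is left to cqlsh or a driver.
    """
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH compression = {'class': 'ZstdCompressor'};"
    )


# Hypothetical keyspace and table names.
print(zstd_compression_cql("demo_ks", "events"))
```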
232+
233+
### Optimizing memtable heap space
234+
Our default is to use 1/4 of the JVM heap for [memtable_heap_space](https://cassandra.apache.org/doc/latest/cassandra/configuration/cass_yaml_file.html#memtable_heap_space)
235+
in `cassandra.yaml`. For write-oriented applications and/or on SKUs with small memory,
236+
this can lead to frequent flushing and fragmented SSTables, thus requiring more compaction.
237+
In such cases increasing it to at least 4048 might be beneficial but requires careful benchmarking
238+
to make sure other operations (e.g. reads) aren't affected.
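
To see when the 1/4 default falls below the suggested value, here is a rough calculation. It assumes heap sizes are expressed in MB and that the 4048 figure above is also in MB; the text does not state the unit, so treat that as an assumption. The function name and the sample heap sizes are hypothetical.

```python
SUGGESTED_MINIMUM_MB = 4048  # value quoted in the text; unit assumed to be MB


def default_memtable_heap_mb(jvm_heap_mb: int) -> int:
    """Return the default memtable heap space: one quarter of the JVM heap."""
    return jvm_heap_mb // 4


# Hypothetical JVM heap sizes in MB.
for heap_mb in (4096, 8192, 16384, 32768):
    default_mb = default_memtable_heap_mb(heap_mb)
    note = "below suggested value" if default_mb < SUGGESTED_MINIMUM_MB else "at or above suggested value"
    print(f"heap={heap_mb} MB -> memtable default={default_mb} MB ({note})")
```

Under these assumptions, only heaps of roughly 16 GB or more reach the suggested value by default, which is why smaller-memory SKUs are the ones most likely to benefit from tuning.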
226239

227240
## Next steps
228241

229242
In this article, we laid out some best practices for optimal performance. You can now start working with the cluster:
230243

231244
> [!div class="nextstepaction"]
232-
> [Create a cluster using Azure Portal](create-cluster-portal.md)
245+
> [Create a cluster using Azure Portal](create-cluster-portal.md)
