Commit 8f6d75e

Update columnstore fragmentation
1 parent 2766f2b commit 8f6d75e

1 file changed: +18 -6 lines changed
docs/relational-databases/indexes/reorganize-and-rebuild-indexes.md

Lines changed: 18 additions & 6 deletions
@@ -4,7 +4,7 @@ description: This article describes index maintenance concepts, and a recommende
 author: dimitri-furman
 ms.author: dfurman
 ms.reviewer: mikeray
-ms.date: 10/11/2024
+ms.date: 06/20/2025
 ms.service: sql
 ms.subservice: table-view-index
 ms.topic: how-to
@@ -32,11 +32,12 @@ helpviewer_keywords:
 - "clustered indexes, defragmenting"
 monikerRange: ">=sql-server-2016 || >=sql-server-linux-2017 || =azuresqldb-current || =azuresqldb-mi-current || >=aps-pdw-2016 || =fabric"
 ---
+
 # Optimize index maintenance to improve query performance and reduce resource consumption
 
 [!INCLUDE [SQL Server Azure SQL Database PDW FabricSQLDB](../../includes/applies-to-version/sql-asdb-asdbmi-pdw-fabricsqldb.md)]
 
-This article helps you decide when and how to perform index maintenance. It covers concepts such as index fragmentation and page density, and their impact on query performance and resource consumption. It describes index maintenance methods, [reorganizing an index](#reorganize-an-index) and [rebuilding an index](#rebuild-an-index), and suggests an index maintenance strategy that balances potential performance improvements against resource consumption required for maintenance.
+This article helps you decide when and how to perform index maintenance. It covers concepts such as index fragmentation and page density, and their impact on query performance and resource consumption. It describes index maintenance methods, [reorganizing an index](#reorganize-an-index) and [rebuilding an index](#rebuild-an-index), and suggests an index maintenance [strategy](#index-maintenance-strategy) that balances potential performance improvements against resource consumption required for maintenance.
 
 > [!NOTE]
 > This article does not apply to a dedicated SQL pool in [!INCLUDE [ssazuresynapse-md](../../includes/ssazuresynapse-md.md)]. For information on index maintenance for a dedicated SQL pool in [!INCLUDE [ssazuresynapse-md](../../includes/ssazuresynapse-md.md)], see [Indexing dedicated SQL pool tables in [!INCLUDE [ssazuresynapse-md](../../includes/ssazuresynapse-md.md)]](/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-index).
@@ -88,10 +89,12 @@ The result set returned by `sys.dm_db_column_store_row_group_physical_stats` inc
 | `total_rows` | Number of rows physically stored in the row group. For compressed row groups, this includes the rows that are marked as deleted. |
 | `deleted_rows` | Number of rows physically stored in a compressed row group that are marked for deletion. 0 for row groups that are in delta store. |
 
+To determine the total number of physically stored deleted rows for a nonclustered columnstore index, add the value in the `deleted_rows` column in `sys.dm_db_column_store_row_group_physical_stats` to the value in the `rows` column in [sys.internal_partitions](../system-catalog-views/sys-internal-partitions-transact-sql.md) for the internal object type `COLUMN_STORE_DELETE_BUFFER` and the same object, index, and partition.
+
 Compressed row group fragmentation in a columnstore index can be computed using this formula:
 
 ```sql
-100.0*(ISNULL(deleted_rows,0))/NULLIF(total_rows,0)
+100.0 * (ISNULL(total_deleted_rows, 0)) / NULLIF(total_rows, 0)
 ```
 
 > [!TIP]
@@ -245,11 +248,11 @@ Microsoft recommends that customers consider and adopt the following index maint
 
 In addition to the above considerations and strategy, in [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)] and [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)] it is particularly important to consider the costs and benefits of index maintenance. Customers should perform it only when there is a demonstrated need, and taking into account the following points.
 
-- [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)] and [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)] implement [resource governance](/azure/azure-sql/database/resource-limits-logical-server#resource-governance) to set bounds on CPU, memory, and I/O consumption according to the provisioned pricing tier. These bounds apply to all user workloads, including index maintenance. If cumulative resource consumption by all workloads approaches resource bounds, the rebuild or reorganize operation can degrade performance of other workloads due to resource contention. For example, bulk data loads can become slower because transaction log I/O is at 100% due to a concurrent index rebuild. In [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)], this impact can be reduced by running index maintenance in a separate Resource Governor workload group with restricted resource allocation, at the expense of extending index maintenance duration.
+- [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)] and [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)] implement [resource governance](/azure/azure-sql/database/resource-limits-logical-server#resource-governance) to set bounds on CPU, memory, and I/O consumption according to the provisioned pricing tier. These bounds apply to all user workloads, including index maintenance. If cumulative resource consumption by all workloads approaches resource bounds, the rebuild or reorganize operation can degrade performance of other workloads due to resource contention. For example, bulk data loads can become slower because transaction log I/O is at 100% due to a concurrent index rebuild. In [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)], this impact can be reduced by running index maintenance in a separate [resource governor](../resource-governor/resource-governor.md) workload group with restricted resource allocation, at the expense of extending index maintenance duration.
 - For cost savings, customers often provision databases, elastic pools, and managed instances with minimal resource headroom. The pricing tier is chosen to be sufficient for application workloads. To accommodate a significant increase in resource usage due to index maintenance without degrading application performance, customers might have to provision more resources and increase costs, without necessarily improving application performance.
 - In elastic pools, resources are shared across all databases in a pool. Even if a particular database is idle, performing index maintenance on that database can affect application workloads running concurrently in other databases in the same pool. For more information, see [Resource management in dense elastic pools](/azure/azure-sql/database/elastic-pool-resource-management).
 - For most types of storage used in [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)] and [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)], there is no difference in performance between sequential I/O and random I/O. This reduces the impact of index fragmentation on query performance.
-- When using either [Read Scale-out](/azure/azure-sql/database/read-scale-out) or [Geo-replication](/azure/azure-sql/database/active-geo-replication-overview) replicas, data latency on replicas often increases while index maintenance is being performed on the primary replica. If a geo-replica is provisioned with insufficient resources to sustain an increase in transaction log generation caused by index maintenance, it can lag far behind the primary, causing the system to reseed it. That makes the replica unavailable until reseeding is complete. Additionally, in Premium and Business Critical service tiers, replicas used for high availability can similarly get far behind the primary during index maintenance. If a failover is required during or soon after index maintenance, it can take longer than expected.
+- When using either [Read Scale-out](/azure/azure-sql/database/read-scale-out) or [Geo-replication](/azure/azure-sql/database/active-geo-replication-overview), data latency on replicas often increases while index maintenance is being performed on the primary replica. If a geo-replica is provisioned with insufficient resources to sustain an increase in transaction log generation caused by index maintenance, it can lag far behind the primary, causing the system to reseed it. That makes the replica unavailable until reseeding is complete. Additionally, in Premium and Business Critical service tiers, replicas used for high availability can similarly get far behind the primary during index maintenance. If a failover is required during or soon after index maintenance, it can take longer than expected.
 - If an index rebuild runs on the primary replica, and a long-running query executes on a readable replica at the same time, the query can get automatically terminated to prevent blocking the redo thread on the replica.
 
 There are specific but uncommon scenarios when one-time or periodic index maintenance may be needed in [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)] and [!INCLUDE [ssazuremi](../../includes/ssazuremi-md.md)]:
@@ -339,12 +342,21 @@ SELECT OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
 OBJECT_NAME(i.object_id) AS object_name,
 i.name AS index_name,
 i.type_desc AS index_type,
-100.0 * (ISNULL(SUM(rgs.deleted_rows), 0)) / NULLIF(SUM(rgs.total_rows), 0) AS avg_fragmentation_in_percent
+100.0 * (ISNULL(SUM(rgs.deleted_rows + ISNULL(ip.rows, 0)), 0)) / NULLIF(SUM(rgs.total_rows), 0) AS avg_fragmentation_in_percent
 FROM sys.indexes AS i
 INNER JOIN sys.dm_db_column_store_row_group_physical_stats AS rgs
 ON i.object_id = rgs.object_id
 AND
 i.index_id = rgs.index_id
+/* For nonclustered columnstore, include rows in the delete buffer */
+LEFT JOIN sys.internal_partitions AS ip
+ON i.object_id = ip.object_id
+AND
+i.index_id = ip.index_id
+AND
+rgs.partition_number = ip.partition_number
+AND
+ip.internal_object_type_desc = 'COLUMN_STORE_DELETE_BUFFER'
 WHERE rgs.state_desc = 'COMPRESSED'
 GROUP BY i.object_id, i.index_id, i.name, i.type_desc
 ORDER BY schema_name, object_name, index_name, index_type;
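
One thing to watch in the updated query: the `LEFT JOIN` matches the delete buffer row for a partition against every compressed row group in that partition, so `SUM(rgs.deleted_rows + ISNULL(ip.rows, 0))` appears to add `ip.rows` once per row group rather than once per partition. The following is a minimal sketch of one possible alternative, assuming it is acceptable to aggregate each source per partition before joining. The CTE names `rg` and `del_buf` and the `delete_buffer_rows` alias are illustrative and not part of this commit. Treat it as an untested sketch rather than a drop-in replacement.

```sql
/* Sketch only: aggregate row groups and delete buffers per partition first,
   so that delete buffer rows are counted once per partition. */
WITH rg AS
(
    /* Compressed row groups, summed per object, index, and partition */
    SELECT object_id,
           index_id,
           partition_number,
           SUM(deleted_rows) AS deleted_rows,
           SUM(total_rows) AS total_rows
    FROM sys.dm_db_column_store_row_group_physical_stats
    WHERE state_desc = 'COMPRESSED'
    GROUP BY object_id, index_id, partition_number
),
del_buf AS
(
    /* Delete buffer rows (nonclustered columnstore), summed per partition */
    SELECT object_id,
           index_id,
           partition_number,
           SUM(rows) AS delete_buffer_rows
    FROM sys.internal_partitions
    WHERE internal_object_type_desc = 'COLUMN_STORE_DELETE_BUFFER'
    GROUP BY object_id, index_id, partition_number
)
SELECT OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
       OBJECT_NAME(i.object_id) AS object_name,
       i.name AS index_name,
       i.type_desc AS index_type,
       100.0 * SUM(rg.deleted_rows + ISNULL(del_buf.delete_buffer_rows, 0))
             / NULLIF(SUM(rg.total_rows), 0) AS avg_fragmentation_in_percent
FROM sys.indexes AS i
INNER JOIN rg
    ON i.object_id = rg.object_id
   AND i.index_id = rg.index_id
LEFT JOIN del_buf
    ON rg.object_id = del_buf.object_id
   AND rg.index_id = del_buf.index_id
   AND rg.partition_number = del_buf.partition_number
GROUP BY i.object_id, i.index_id, i.name, i.type_desc
ORDER BY schema_name, object_name, index_name, index_type;
```

Because both CTEs return at most one row per object, index, and partition, the join cannot multiply rows, and `total_rows` in the denominator is unaffected by the presence of a delete buffer.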
