Commit fd3cb56

Apply suggestions from code review
Co-authored-by: William Assaf MSFT <[email protected]>
1 parent 36a2ff1 commit fd3cb56

3 files changed: +10 -9 lines changed

docs/relational-databases/indexes/columnstore-indexes-data-loading-guidance.md

Lines changed: 1 addition & 1 deletion
@@ -125,7 +125,7 @@ ALTER INDEX [<index-name>] on [<table-name>] REORGANIZE with (COMPRESS_ALL_ROW_G
125125

126126
## How loading into a partitioned table works
127127

128-
For partitioned data, [!INCLUDE [ssDE](../../includes/ssde-md.md)] first assigns each row to a partition, and then performs columnstore operations on the data within the partition. Each partition has its own rowgroups and at least one delta rowgroup.
128+
For partitioned data, the [!INCLUDE [ssDE](../../includes/ssde-md.md)] first assigns each row to a partition, and then performs columnstore operations on the data within the partition. Each partition has its own rowgroups and at least one delta rowgroup.
129129

130130
## Related content
131131

docs/relational-databases/indexes/columnstore-indexes-data-warehouse.md

Lines changed: 2 additions & 2 deletions
@@ -22,7 +22,7 @@ Columnstore indexes, in conjunction with partitioning, are essential for buildin
2222

2323
[!INCLUDE [sssql16-md](../../includes/sssql16-md.md)] introduced these features for columnstore performance enhancements:
2424

25-
- Always On supports querying a columnstore index on a readable secondary replica.
25+
- Always On availability groups support querying a columnstore index on a readable secondary replica.
2626
- Multiple Active Result Sets (MARS) supports columnstore indexes.
2727
- A new dynamic management view [sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)](../system-dynamic-management-views/sys-dm-db-column-store-row-group-physical-stats-transact-sql.md) provides performance troubleshooting information at the row group level.
2828
- Serial queries on columnstore indexes can run in batch mode. Previously, only parallel queries could run in batch mode.
@@ -66,7 +66,7 @@ CREATE UNIQUE INDEX taccount_nc1 ON t_account (AccountKey);
6666

6767
By design, a columnstore table doesn't allow a clustered primary key constraint. Now you can use a nonclustered index on a columnstore table to enforce uniqueness. A primary key is equivalent to a `UNIQUE` constraint on a non-NULL column, and [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] implements a `UNIQUE` constraint as a nonclustered index. Combining these facts, the following example defines a `UNIQUE` constraint on the non-NULL column `AccountKey`. The result is a nonclustered index that enforces uniqueness on a non-NULL column.
6868

69-
Next, the table is converted to a clustered columnstore index. During the conversion, the nonclustered index persists. The result is a clustered columnstore index with a nonclustered index that enforces uniqueness. Since any update or insert on the columnstore table also affects the nonclustered index, all operations that violate the unique constraint and the non-NULL constraint cause the entire operation to fail.
69+
Next, the table is converted to a clustered columnstore index. During the conversion, the nonclustered index persists. The result is a clustered columnstore index with a nonclustered index that enforces uniqueness. Since any update or insert on the columnstore table also affects the nonclustered index, all operations that violate the unique constraint and the non-`NULL` constraint cause the entire operation to fail.
7070

7171
The result is a columnstore index with a nonclustered index that enforces uniqueness on both indexes.
7272

docs/relational-databases/indexes/get-started-with-columnstore-for-real-time-operational-analytics.md

Lines changed: 7 additions & 6 deletions
@@ -114,9 +114,9 @@ The Data Exposed video series into more details on some of the capabilities and
114114

115115
Running real-time operational analytics can impact the performance of the OLTP workload. This impact should be minimal. [Example A](#example-a-access-hot-data-from-b-tree-index-warm-data-from-columnstore-index) shows how to use filtered indexes to minimize impact of nonclustered columnstore index on transactional workload while still delivering analytics in real time.
116116

117-
To minimize the overhead of maintaining a nonclustered columnstore index on an operational workload, you can use a filtered condition to create a nonclustered columnstore index only on the *warm* or slowly changing data. For example, in an order management application, you can create a nonclustered columnstore index on the orders that have already been shipped. Once the order has shipped, it rarely changes and therefore can be considered warm data. With a filtered index, the data in nonclustered columnstore index requires fewer updates thereby lowering the impact on transactional workload.
117+
To minimize the overhead of maintaining a nonclustered columnstore index on an operational workload, you can use a filtered condition to create a nonclustered columnstore index only on the *warm* or slowly changing data. For example, in an order management application, you can create a nonclustered columnstore index on the orders that have already been shipped. Once the order has shipped, it rarely changes and therefore can be considered warm data. With a filtered index, the data in nonclustered columnstore index requires fewer updates thereby lowering the impact on transactional workload.
118118

119-
Analytics queries transparently access both warm and hot data as needed to provide real-time analytics. If a significant part of the operational workload is touching the 'hot' data, those operations don't require additional maintenance of the columnstore index. A best practice is to have a rowstore clustered index on the column(s) used in the filtered index definition. [!INCLUDE [ssDE-md](../../includes/ssde-md.md)] uses the clustered index to quickly scan the rows that did not meet the filtered condition. Without this clustered index, a full table scan of the rowstore table is required to find these rows, which can negatively impact the performance of analytical queries. In the absence of clustered index, you could create a complementary filtered nonclustered B-tree index to identify such rows but it is not recommended because accessing large range of rows through nonclustered B-tree indexes is expensive.
119+
Analytics queries transparently access both warm and hot data as needed to provide real-time analytics. If a significant part of the operational workload is touching the 'hot' data, those operations don't require additional maintenance of the columnstore index. A best practice is to have a rowstore clustered index on the column(s) used in the filtered index definition. The [!INCLUDE [ssDE-md](../../includes/ssde-md.md)] uses the clustered index to quickly scan the rows that did not meet the filtered condition. Without this clustered index, a full table scan of the rowstore table is required to find these rows, which can negatively impact the performance of analytical queries. In the absence of clustered index, you could create a complementary filtered nonclustered B-tree index to identify such rows but it is not recommended because accessing large range of rows through nonclustered B-tree indexes is expensive.
120120
121121
> [!NOTE]
122122
> A filtered nonclustered columnstore index is only supported on disk-based tables. It is not supported on memory-optimized tables.
@@ -193,8 +193,9 @@ accounttype nvarchar(50),
193193
accountCodeAlternatekey int
194194
);
195195
196-
-- Creating nonclustered columnstore index with COMPRESSION_DELAY. The columnstore index will keep the rows in closed delta rowgroup for 100 minutes
197-
-- after it has been marked closed.
196+
-- Creating nonclustered columnstore index with COMPRESSION_DELAY.
197+
-- The columnstore index will keep the rows in closed delta rowgroup
198+
-- for 100 minutes after it has been marked closed.
198199
CREATE NONCLUSTERED COLUMNSTORE INDEX t_colstor_cci ON t_colstor
199200
(accountkey, accountdescription, accounttype)
200201
WITH (DATA_COMPRESSION = COLUMNSTORE, COMPRESSION_DELAY = 100);
@@ -205,7 +206,7 @@ For more information, see [Blog: Compression delay](/archive/blogs/sqlserversto
205206
Here are the recommended best practices:
206207

207208
- **Insert/Query workload:** If your workload is primarily inserting data and querying it, the default `COMPRESSION_DELAY` of 0 is the recommended option. The newly inserted rows will get compressed once 1 million rows have been inserted into a single delta rowgroup.
208-
Some examples of such workload are (a) traditional DW workload (b) select-stream analysis when you need to analyze the select pattern in a web application.
209+
Some examples of such workloads are a traditional DW workload or a select-stream analysis when you need to analyze the select pattern in a web application.
209210

210211
- **OLTP workload:** If the workload is DML heavy (that is, heavy mix of Update, Delete and Insert), you might see columnstore index fragmentation by examining the DMV `sys.dm_db_column_store_row_group_physical_stats`. If you see that > 10% rows are marked deleted in recently compressed rowgroups, you can use `COMPRESSION_DELAY` option to add time delay when rows become eligible for compression. For example, if for your workload, the newly inserted stays 'hot' (that is, gets updated multiple times) for say 60 minutes, you should choose `COMPRESSION_DELAY` to be 60.
211212

@@ -226,7 +227,7 @@ WHERE object_id = object_id('FactOnlineSales2')
226227
ORDER BY created_time DESC;
227228
```
228229

229-
If the number of deleted rows in compressed rowgroups > 20%, plateauing in older rowgroups with < 5% variation (referred to as cold rowgroups) set `COMPRESSION_DELAY` = (youngest_rowgroup_created_time - current_time). This approach works best with a stable and relatively homogeneous workload.
230+
If the number of deleted rows in compressed rowgroups > 20%, plateauing in older rowgroups with < 5% variation (referred to as cold rowgroups), then set `COMPRESSION_DELAY` = (youngest_rowgroup_created_time - current_time). This approach works best with a stable and relatively homogeneous workload.
230231

231232
## Related content
232233
