[!INCLUDE [SQL Server Azure SQL Database Synapse Analytics PDW FabricSQLDB](../../includes/applies-to-version/sql-asdb-asdbmi-asa-pdw-fabricsqldb.md)]
@@ -31,7 +32,7 @@ To perform a bulk load, you can use [bcp Utility](../../tools/bcp-utility.md), [
As the diagram suggests, a bulk load:
- Doesn't presort the data. Data is inserted into rowgroups in the order it's received.
-
- If the batch size is >= 102400, the rows are directly loaded into the compressed rowgroups. You should choose a batch size >=102400 for efficient bulk import, because you can avoid moving data rows to delta rowgroups before the rows are eventually moved to compressed rowgroups by a background thread, Tuple mover (TM).
+
- If the batch size is >= 102,400 rows, the rows are loaded directly into compressed rowgroups. Choose a batch size >= 102,400 for efficient bulk import, because the rows then bypass the delta rowgroups instead of waiting for a background thread, the tuple mover (TM), to move them into compressed rowgroups later.
- If the batch size is < 102,400 or the remaining rows are < 102,400, the rows are loaded into delta rowgroups.
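
The routing rule in the list above can be sketched in a few lines of Python. This is a simplified model, not the engine's implementation: it assumes a compressed rowgroup holds at most 1,048,576 rows and ignores factors such as memory pressure that can cut a rowgroup short. The names `split_batch`, `MIN_COMPRESSED_ROWS`, and `MAX_ROWGROUP_ROWS` are hypothetical and chosen only for illustration.

```python
MIN_COMPRESSED_ROWS = 102_400    # threshold for loading directly into a compressed rowgroup
MAX_ROWGROUP_ROWS = 1_048_576    # assumed maximum number of rows in one rowgroup


def split_batch(batch_rows: int) -> tuple[list[int], int]:
    """Model one bulk-load batch.

    Returns (sizes of compressed rowgroups, rows sent to delta rowgroups).
    Simplified: real loads can also trim rowgroups for other reasons.
    """
    if batch_rows < 0:
        raise ValueError("batch_rows must be non-negative")

    # Fill as many maximum-size rowgroups as the batch allows.
    full_groups, remainder = divmod(batch_rows, MAX_ROWGROUP_ROWS)
    compressed = [MAX_ROWGROUP_ROWS] * full_groups

    # The leftover rows are compressed only if they reach the threshold.
    delta_rows = 0
    if remainder >= MIN_COMPRESSED_ROWS:
        compressed.append(remainder)
    else:
        delta_rows = remainder
    return compressed, delta_rows


if __name__ == "__main__":
    for rows in (102_399, 102_400, 1_048_577, 2_252_152):
        groups, delta = split_batch(rows)
        print(f"{rows:>9,} rows -> compressed {groups}, delta {delta}")
```

Under this model, a batch of 1,048,577 rows fills one compressed rowgroup and leaves a single row for a delta rowgroup, while a batch of 102,399 rows goes entirely to a delta rowgroup.
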
> [!NOTE]
@@ -41,16 +42,14 @@ Bulk loading has these built-in performance optimizations:
- **Parallel loads:** You can have multiple concurrent bulk loads (**bcp** or bulk insert) that are each loading a separate data file. Unlike rowstore bulk loads into [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], you don't need to specify `TABLOCK` because each bulk import thread loads data exclusively into separate rowgroups (compressed or delta rowgroups) with an exclusive lock on each rowgroup.
-
- **Reduced logging:** The data that is directly loaded into compressed row groups, leads to significant reduction in the size of the log. For example, if data was compressed 10x, the corresponding transaction log is roughly 10x smaller without requiring TABLOCK or Bulk-logged/Simple recovery model. Any data that goes to a delta rowgroup is fully logged. This includes any batch sizes that are less than 102,400 rows. Best practice is to use batchsize >= 102400. Since there's no TABLOCK required, you can load the data in parallel.
+
- **Reduced logging:** Data that is loaded directly into compressed rowgroups significantly reduces the size of the log. For example, if the data is compressed 10x, the corresponding transaction log is roughly 10x smaller, without requiring `TABLOCK` or the bulk-logged or simple recovery model. Any data that goes to a delta rowgroup is fully logged, including any batch smaller than 102,400 rows. The best practice is to use a batch size >= 102,400. Because `TABLOCK` isn't required, you can load the data in parallel.
-
- **Minimal logging:** You can get further reduction in logging if you follow the prerequisites for [minimal logging](../import-export/prerequisites-for-minimal-logging-in-bulk-import.md). However, unlike loading data into a rowstore, TABLOCK leads to an X lock on the table rather than a BU (Bulk Update) lock and therefore parallel data load can't be done. For more information on locking, see [Locking and row versioning](../sql-server-transaction-locking-and-row-versioning-guide.md).
+
- **Minimal logging:** You can reduce logging further if you follow the prerequisites for [minimal logging](../import-export/prerequisites-for-minimal-logging-in-bulk-import.md). However, unlike loading data into a rowstore, `TABLOCK` leads to an `X` (exclusive) lock on the table rather than a `BU` (bulk update) lock, so parallel data loads aren't possible. For more information on locking, see [Locking and row versioning](../sql-server-transaction-locking-and-row-versioning-guide.md).
-
- **Locking optimization:** The X lock on a row group is automatically acquired when loading data into a compressed row group. However, when bulk loading into a delta rowgroup, an X lock is acquired at rowgroup but [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] still locks the PAGE/EXTENT because X rowgroup lock isn't part of locking hierarchy.
+
- **Locking optimization:** An `X` lock on the rowgroup is automatically acquired when loading data into a compressed rowgroup. However, when bulk loading into a delta rowgroup, an `X` lock is acquired for the rowgroup but [!INCLUDE [ssDE](../../includes/ssde-md.md)] still acquires page and extent locks because the `X` rowgroup lock isn't a part of the lock hierarchy.
If you have a nonclustered B-tree index on a columnstore index, there's no locking or logging optimization for the index itself, but the optimizations on the clustered columnstore index described previously still apply.
-
Data modification (insert, delete, update) isn't a batch mode operation because it's not parallel.
## Plan bulk load sizes to minimize delta rowgroups
Columnstore indexes perform best when most of the rows are compressed into the columnstore and not sitting in delta rowgroups. It's best to size your loads so that rows go directly to the columnstore and bypass the deltastore as much as possible.
@@ -96,8 +95,8 @@ FROM [<Staging Table>]
The following optimizations are available when loading into a clustered columnstore index from a staging table:
-
- **Log Optimization:** Reduced logging when the data is loaded into a compressed rowgroup.
-
- **Locking Optimization:** When loading data into a compressed rowgroup, the X lock on rowgroup is acquired. However, with delta rowgroup, an X lock is acquired at rowgroup but [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] still locks the locks PAGE/EXTENT because X rowgroup lock isn't part of locking hierarchy.
+
- **Log optimization:** Reduced logging when the data is loaded into a compressed rowgroup.
+
- **Locking optimization:** When loading data into a compressed rowgroup, an `X` lock on the rowgroup is acquired. However, when bulk loading into a delta rowgroup, an `X` lock is acquired for the rowgroup but [!INCLUDE [ssDE](../../includes/ssde-md.md)] still acquires page and extent locks because the `X` rowgroup lock isn't a part of the lock hierarchy.
If you have one or more nonclustered indexes, there's no locking or logging optimization for the index itself, but the optimizations on the clustered columnstore index as described previously are still there.
@@ -112,7 +111,7 @@ INSERT INTO [<table-name>] VALUES ('some value' /*replace with actual set of val
> [!NOTE]
> Concurrent threads using INSERT INTO to insert values into a clustered columnstore index can insert rows into the same deltastore rowgroup.
-
Once the rowgroup contains 1,048,576 rows, the delta rowgroup us marked closed but it's still available for queries and update/delete operations, but the newly inserted rows go into an existing or newly created deltastore rowgroup. There's a background thread *Tuple Mover (TM)* that compresses the closed delta rowgroups periodically every 5 minutes or so. You can explicitly invoke the following command to compress the closed delta rowgroup.
+
Once the rowgroup contains 1,048,576 rows, the delta rowgroup is marked closed. It's still available for queries and update/delete operations, but newly inserted rows go into an existing or newly created deltastore rowgroup. A background thread called the *tuple mover (TM)* compresses the closed delta rowgroups periodically, every 5 minutes or so. You can explicitly invoke the following command to compress the closed delta rowgroup.
```sql
ALTER INDEX [<index-name>] on [<table-name>] REORGANIZE
@@ -126,8 +125,8 @@ ALTER INDEX [<index-name>] on [<table-name>] REORGANIZE with (COMPRESS_ALL_ROW_G
## How loading into a partitioned table works
-
For partitioned data, [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] first assigns each row to a partition, and then performs columnstore operations on the data within the partition. Each partition has its own rowgroups and at least one delta rowgroup.
+
For partitioned data, the [!INCLUDE [ssDE](../../includes/ssde-md.md)] first assigns each row to a partition, and then performs columnstore operations on the data within the partition. Each partition has its own rowgroups and at least one delta rowgroup.
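
Because columnstore operations happen within each partition, the row count that matters for the 102,400-row threshold is the number of rows a batch delivers to one partition, not the batch total. The Python sketch below illustrates this under two labeled assumptions: the threshold is applied per partition, and the batch spreads as evenly as possible across partitions. The function name `destination_by_partition` is hypothetical.

```python
MIN_COMPRESSED_ROWS = 102_400  # direct-to-compressed threshold, assumed to apply per partition


def destination_by_partition(batch_rows: int, partition_count: int) -> dict[int, str]:
    """Spread one batch evenly over partitions and report where each slice lands.

    Assumption: each partition's slice is judged against the threshold on its own.
    """
    if batch_rows < 0 or partition_count <= 0:
        raise ValueError("batch_rows must be >= 0 and partition_count must be > 0")

    # Even spread: the first `extra` partitions receive one additional row.
    base, extra = divmod(batch_rows, partition_count)
    result = {}
    for partition in range(1, partition_count + 1):
        rows = base + (1 if partition <= extra else 0)
        result[partition] = "compressed" if rows >= MIN_COMPRESSED_ROWS else "delta"
    return result


if __name__ == "__main__":
    for partitions in (4, 20):
        landing = destination_by_partition(1_000_000, partitions)
        print(partitions, "partitions ->", sorted(set(landing.values())))
```

In this model, a 1,000,000-row batch split across 4 partitions delivers 250,000 rows to each and stays above the threshold, while the same batch split across 20 partitions delivers 50,000 rows to each and lands entirely in delta rowgroups.
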
-
## Next steps
+
## Related content
-
- [Data Loading performance considerations with Clustered Columnstore indexes](https://techcommunity.microsoft.com/t5/DataCAT/Data-Loading-performance-considerations-with-Clustered/ba-p/305223)
+
- [Data Loading performance considerations with clustered columnstore indexes](https://techcommunity.microsoft.com/t5/DataCAT/Data-Loading-performance-considerations-with-Clustered/ba-p/305223)