Commit 5709546

Merge pull request #213719 from y0nil/patch-1
Update how-to-tsi-gen2-migration.md
2 parents bbc5c59 + 6c88b2d commit 5709546

1 file changed: +5 -39 lines changed

articles/time-series-insights/how-to-tsi-gen2-migration.md

Lines changed: 5 additions & 39 deletions
@@ -70,45 +70,11 @@ Data
 :::image type="content" source="media/gen2-migration/adx-log-analytics.png" alt-text="Screenshot of the Azure Data Explorer Log Analytics Workspace" lightbox="media/gen2-migration/adx-log-analytics.png":::
 
 1. Data partitioning.
-1. For small size data, the default ADX partitioning is enough. For more complex scenario, with large datasets and right push rate custom ADX data partitioning is more appropriate. Data partitioning is beneficial for scenarios, as follows:
-1. Improving query latency in big data sets.
-1. When querying historical data.
-1. When ingesting out-of-order data.
-1. The custom data partitioning should include:
-1. The timestamp column, which results in time-based partitioning of extents.
-1. A string-based column, which corresponds to the Time Series ID with highest cardinality.
-1. An example of data partitioning containing a Time Series ID column and a timestamp column is:
-
-```
-.alter table events policy partitioning
-{
-"PartitionKeys": [
-{
-"ColumnName": "timeSeriesId",
-"Kind": "Hash",
-"Properties": {
-"Function": "XxHash64",
-"MaxPartitionCount": 32,
-"PartitionAssignmentMode": "Uniform"
-}
-},
-{
-"ColumnName": "timestamp",
-"Kind": "UniformRange",
-"Properties": {
-"Reference": "1970-01-01T00:00:00",
-"RangeSize": "1.00:00:00",
-"OverrideCreationTime": true
-}
-}
-] ,
-"EffectiveDateTime": "1970-01-01T00:00:00",
-"MinRowCountPerOperation": 0,
-"MaxRowCountPerOperation": 0,
-"MaxOriginalSizePerOperation": 0
-}
-```
-For more references, check [ADX Data Partitioning Policy](/azure/data-explorer/kusto/management/partitioningpolicy).
+1. For most data sets, the default ADX partitioning is enough.
+1. Data partitioning is beneficial in a very specific set of scenarios, and shouldn't be applied otherwise:
+1. Improving query latency in big data sets where most queries filter on a high cardinality string column, e.g. a time-series ID.
+1. When ingesting out-of-order data, e.g. when events from the past may be ingested days or weeks after their generation in the origin.
+1. For more information, check [ADX Data Partitioning Policy](/azure/data-explorer/kusto/management/partitioningpolicy).
 
 #### Prepare for Data Ingestion

0 commit comments