articles/synapse-analytics/sql/develop-tables-statistics.md
Lines changed: 5 additions & 5 deletions
@@ -572,13 +572,13 @@ Serverless SQL pool analyzes incoming user queries for missing statistics. If st
The SELECT statement will trigger automatic creation of statistics.
> [!NOTE]
-
> Automatic creation of statistics is turned on for Parquet files. For CSV files, statistics will be automatically created if you use OPENROWSET. You need to create statistics manually if you use CSV external tables.
+
> Automatic creation of statistics uses sampling, and in most cases the sampling percentage is less than 100%. This flow is the same for every file format. Keep in mind that sampling isn't supported when reading CSV with parser version 1.0, so statistics aren't created automatically when the sampling percentage is less than 100%. For small tables with estimated low cardinality (number of rows), automatic statistics creation is triggered with a sampling percentage of 100%. In that case a full scan is triggered, and statistics are created automatically even for CSV with parser version 1.0.
Automatic creation of statistics is done synchronously so you may incur slightly degraded query performance if your columns are missing statistics. The time to create statistics for a single column depends on the size of the files targeted.
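
As a rough illustration of the note above, the following query reads a CSV file through OPENROWSET with parser version 2.0. Because the note says only parser version 1.0 lacks sampling support, a query like this would be expected to trigger automatic, sampled statistics creation on the filtered and grouped column. The storage URL and the column layout are hypothetical placeholders, not values taken from this article.

```sql
-- Minimal sketch: the storage URL and column list are hypothetical placeholders.
SELECT
    [r].[year],
    COUNT_BIG(*) AS row_count
FROM OPENROWSET(
        BULK 'https://<storage-account>.blob.core.windows.net/<container>/population.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',   -- per the note, only parser version 1.0 lacks sampling
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n'
    )
WITH (
    [country_code] VARCHAR(5),
    [country_name] VARCHAR(100),
    [year] SMALLINT,
    [population] BIGINT
) AS [r]
WHERE [r].[year] > 2000           -- column used in a predicate
GROUP BY [r].[year];
```

The first execution may be slightly slower, because statistics are created synchronously before the query runs.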
### Manual creation of statistics
-
Serverless SQL pool lets you create statistics manually. For CSV external tables, you have to create statistics manually because automatic creation of statistics isn't turned on for CSV external tables.
+
Serverless SQL pool lets you create statistics manually. If you're using parser version 1.0 with CSV, you'll probably have to create statistics manually, because this parser version doesn't support sampling. With parser version 1.0, statistics aren't created automatically unless the sampling percentage is 100%.
See the following examples for instructions on how to manually create statistics.
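
A minimal sketch of manual creation for an OPENROWSET path over CSV with parser version 1.0 might look like the following. It uses the `sys.sp_create_openrowset_statistics` procedure; when the statement has no `TABLESAMPLE` clause, every row is expected to be examined (a full scan), which is the only mode the parser version 1.0 restriction allows. The storage URL and column layout are hypothetical placeholders.

```sql
-- Minimal sketch: the storage URL and column list are hypothetical placeholders.
-- The statement returns the single column that the statistics are built on.
-- No TABLESAMPLE clause is given, so all rows are read (full scan).
EXEC sys.sp_create_openrowset_statistics N'SELECT year
FROM OPENROWSET(
        BULK ''https://<storage-account>.blob.core.windows.net/<container>/population.csv'',
        FORMAT = ''CSV'',
        PARSER_VERSION = ''1.0'',
        FIELDTERMINATOR = '','',
        ROWTERMINATOR = ''\n''
    )
WITH (
    [country_code] VARCHAR(5),
    [country_name] VARCHAR(100),
    [year] SMALLINT,
    [population] BIGINT
) AS [r]
';
```

Single quotes inside the statement text are doubled because the whole statement is passed as one string literal.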
@@ -593,7 +593,7 @@ When statistics are stale, new ones will be created. The algorithm goes through
Manual stats are never declared stale.
> [!NOTE]
-
> Automatic recreation of statistics is turned on for Parquet files. For CSV files, statistics will be recreated if you use OPENROWSET. You need to drop and create statistics manually for CSV external tables. Check the examples below on how to drop and create statistics.
+
> Automatic recreation of statistics uses sampling, and in most cases the sampling percentage is less than 100%. This flow is the same for every file format. Keep in mind that sampling isn't supported when reading CSV with parser version 1.0, so statistics aren't recreated automatically when the sampling percentage is less than 100%. In that case, you need to drop and recreate statistics manually. Check the examples below on how to drop and create statistics. For small tables with estimated low cardinality (number of rows), automatic statistics recreation is triggered with a sampling percentage of 100%. In that case a full scan is triggered, and statistics are recreated automatically even for CSV with parser version 1.0.
One of the first questions to ask when you're troubleshooting a query is, **"Are the statistics up to date?"**
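
If the answer is no for a CSV external table read with parser version 1.0, the manual drop-and-recreate step mentioned in the note can be sketched as follows. The table name `dbo.population_csv_ext`, the column `year`, and the statistics name `stats_year` are hypothetical and used only for illustration.

```sql
-- Minimal sketch: table, column, and statistics names are hypothetical.
-- Remove the stale statistics object first.
DROP STATISTICS dbo.population_csv_ext.stats_year;

-- Recreate it by examining every row; sampling isn't available
-- for CSV with parser version 1.0, so FULLSCAN is used.
CREATE STATISTICS stats_year
ON dbo.population_csv_ext ([year])
WITH FULLSCAN, NORECOMPUTE;
```

For statistics created on an OPENROWSET path, the counterpart is `sys.sp_drop_openrowset_statistics`, called with the same statement text that was used to create them.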
@@ -639,7 +639,7 @@ Specifies a Transact-SQL statement that will return column values to be used for
```
> [!NOTE]
-
> CSV sampling does not work at this time, only FULLSCAN is supported for CSV.
+
> CSV sampling doesn't work if you're using parser version 1.0. Only FULLSCAN is supported for CSV with parser version 1.0.
#### Create single-column statistics by examining every row
@@ -767,7 +767,7 @@ Specifies the approximate percentage or number of rows in the table or indexed v
SAMPLE can't be used with the FULLSCAN option.
> [!NOTE]
-
> CSV sampling does not work at this time, only FULLSCAN is supported for CSV.
+
> CSV sampling doesn't work if you're using parser version 1.0. Only FULLSCAN is supported for CSV with parser version 1.0.
#### Create single-column statistics by examining every row