articles/synapse-analytics/sql/query-parquet-files.md
Lines changed: 6 additions & 5 deletions
@@ -6,7 +6,7 @@ author: azaricstefan
 ms.service: azure-synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 02/15/2023
+ms.date: 12/10/2024
 ms.author: stefanazaric
 ms.reviewer: whhender
 ---
@@ -39,7 +39,7 @@ Make sure that you can access this file. If your file is protected with SAS key
 > `ALTER DATABASE CURRENT COLLATE Latin1_General_100_BIN2_UTF8;`
 > For more information on collations, see [Collation types supported for Synapse SQL](reference-collation-types.md).
 
-If you use the `Latin1_General_100_BIN2_UTF8` collation you will get an additional performance boost compared to the other collations. The `Latin1_General_100_BIN2_UTF8` collation is compatible with parquet string sorting rules. The SQL pool is able to eliminate some parts of the parquet files that will not contain data needed in the queries (file/column-segment pruning). If you use other collations, all data from the parquet files will be loaded into Synapse SQL and the filtering is happening within the SQL process. The `Latin1_General_100_BIN2_UTF8` collation has additional performance optimization that works only for parquet and Cosmos DB. The downside is that you lose fine-grained comparison rules like case insensitivity.
+If you use the `Latin1_General_100_BIN2_UTF8` collation you'll get an extra performance boost compared to the other collations. The `Latin1_General_100_BIN2_UTF8` collation is compatible with parquet string sorting rules. The SQL pool is able to eliminate some parts of the parquet files that won't contain data needed in the queries (file/column-segment pruning). If you use other collations, all data from the parquet files will be loaded into Synapse SQL, and the filtering is happening within the SQL process. The `Latin1_General_100_BIN2_UTF8` collation has another performance optimization that works only for parquet and Cosmos DB. The downside is that you lose fine-grained comparison rules like case insensitivity.
 
 ### Data source usage
 
@@ -121,7 +121,7 @@ ORDER BY
 
 You don't need to use the OPENROWSET WITH clause when reading Parquet files. Column names and data types are automatically read from Parquet files.
 
-Have in mind that if you are reading number of files at once, the schema, column names and data types will be inferred from the first file service gets from the storage. This can mean that some of the columns expected are omitted, all because the file used by the service to define the schema did not contain these columns. To explicitly specify the schema, please use OPENROWSET WITH clause.
+Have in mind that if you're reading number of files at once, the schema, column names, and data types will be inferred from the first file service gets from the storage. This can mean that some of the columns expected are omitted, all because the file used by the service to define the schema didn't contain these columns. To explicitly specify the schema, use OPENROWSET WITH clause.
 
 The following sample shows the automatic schema inference capabilities for Parquet files. It returns the number of rows in September 2018 without specifying a schema.
 
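
Note on the hunk above: the changed paragraph points readers to the `OPENROWSET WITH` clause for an explicit schema but does not show one. A minimal sketch of what that could look like follows. The storage URL, the column names, and the column types are hypothetical placeholders and are not taken from this article; they would need to match the columns in the actual Parquet files.

```sql
-- Sketch only: the BULK path and the column list are hypothetical placeholders.
SELECT
    COUNT_BIG(*) AS row_count
FROM
    OPENROWSET(
        BULK 'https://<storage-account>.dfs.core.windows.net/<container>/<folder>/*.parquet',
        FORMAT = 'PARQUET'
    )
    WITH (
        -- Listing columns explicitly avoids relying on the schema
        -- inferred from the first file the service reads.
        pickup_datetime DATETIME2,
        passenger_count INT,
        -- A column-level collation is optional; shown here with the
        -- collation discussed earlier in this diff.
        vendor_id VARCHAR(8) COLLATE Latin1_General_100_BIN2_UTF8
    ) AS [rows];
```

For Parquet sources, the columns in the `WITH` clause are matched to file columns by name, so the placeholder names above only work if the files contain columns with those names.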
@@ -172,6 +172,7 @@ ORDER BY
 
 For Parquet type mapping to SQL native type check [type mapping for Parquet](develop-openrowset.md#type-mapping-for-parquet).
 
-## Next steps
+## Next step
 
-Advance to the next article to learn how to [Query Parquet nested types](query-parquet-nested-types.md).
+> [!div class="nextstepaction"]
+> [How to query Parquet nested types](query-parquet-nested-types.md)