
Commit bb6a7f2

20230215 1008
1 parent ff40fbf commit bb6a7f2

5 files changed: +17 -12 lines changed

articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-reference-collation-types.md

Lines changed: 2 additions & 3 deletions
@@ -56,6 +56,5 @@ When passed 'Collation' as the property parameter, the DatabasePropertyEx function
 
 Additional information on best practices for dedicated SQL pool and serverless SQL pool can be found in the following articles:
 
-- [Best Practices for dedicated SQL pool](./best-practices-dedicated-sql-pool.md)
-- [Best practices for serverless SQL pool](./best-practices-serverless-sql-pool.md)
-
+- [Best Practices for dedicated SQL pool](../sql/best-practices-dedicated-sql-pool.md)
+- [Best practices for serverless SQL pool](../sql/best-practices-serverless-sql-pool.md)

articles/synapse-analytics/sql/develop-tables-external-tables.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ The key differences between Hadoop and native external tables are presented in the following table:
 | [File elimination](#file-elimination) (predicate pushdown) | No | Yes in serverless SQL pool. For the string pushdown, you need to use `Latin1_General_100_BIN2_UTF8` collation on the `VARCHAR` columns to enable pushdown. For more information on collations, refer to [Collation types supported for Synapse SQL](reference-collation-types.md). |
 | Custom format for location | No | Yes, using wildcards like `/year=*/month=*/day=*` for Parquet or CSV formats. Custom folder paths are not available in Delta Lake. In the serverless SQL pool you can also use recursive wildcards `/logs/**` to reference Parquet or CSV files in any subfolder beneath the referenced folder. |
 | Recursive folder scan | Yes | Yes. In serverless SQL pools, `/**` must be specified at the end of the location path. In dedicated pools, folders are always scanned recursively. |
-| Storage authentication | Storage Access Key (SAK), AAD passthrough, Managed identity, Custom application Azure AD identity | [Shared Access Signature (SAS)](develop-storage-files-storage-access-control.md?tabs=shared-access-signature), [AAD passthrough](develop-storage-files-storage-access-control.md?tabs=user-identity), [Managed identity](develop-storage-files-storage-access-control.md?tabs=managed-identity), [Custom application Azure AD identity](develop-storage-files-storage-access-control.md?tabs=service-principal). |
+| Storage authentication | Storage Access Key (SAK), Azure Active Directory passthrough, Managed identity, custom application Azure Active Directory identity | [Shared Access Signature (SAS)](develop-storage-files-storage-access-control.md?tabs=shared-access-signature), [Azure Active Directory passthrough](develop-storage-files-storage-access-control.md?tabs=user-identity), [Managed identity](develop-storage-files-storage-access-control.md?tabs=managed-identity), [Custom application Azure AD identity](develop-storage-files-storage-access-control.md?tabs=service-principal). |
 | Column mapping | Ordinal - the columns in the external table definition are mapped to the columns in the underlying Parquet files by position. | Serverless pool: by name. The columns in the external table definition are mapped to the columns in the underlying Parquet files by column name matching. <br/> Dedicated pool: ordinal matching. The columns in the external table definition are mapped to the columns in the underlying Parquet files by position. |
 | CETAS (exporting/transformation) | Yes | CETAS with the native tables as a target works only in the serverless SQL pool. You cannot use the dedicated SQL pools to export data using native tables. |
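As context for the file-elimination row above, a minimal sketch of a serverless external table whose `VARCHAR` columns opt into string pushdown through the BIN2 UTF-8 collation; the table, location, data source, and file format names are hypothetical:

```sql
-- COLLATE Latin1_General_100_BIN2_UTF8 on VARCHAR columns enables string predicate pushdown.
CREATE EXTERNAL TABLE logs (
    event_date date,
    region varchar(8) COLLATE Latin1_General_100_BIN2_UTF8,
    message varchar(8000) COLLATE Latin1_General_100_BIN2_UTF8
)
WITH (
    LOCATION = '/logs/**',                 -- recursive wildcard, serverless SQL pool
    DATA_SOURCE = my_data_source,          -- hypothetical external data source
    FILE_FORMAT = my_parquet_file_format   -- hypothetical external file format
);
```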

@@ -269,7 +269,7 @@ If you're retrieving data from the text file, store each missing value by using
 
 - 0 if the column is defined as a numeric column. Decimal columns aren't supported and will cause an error.
 - Empty string ("") if the column is a string column.
-- 1900-01-01 if the column is a date column.
+- "1900-01-01" if the column is a date column.
 
 FALSE -
 Store all missing values as NULL. Any NULL values that are stored by using the word NULL in the delimited text file are imported as the string 'NULL'.
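The TRUE/FALSE behavior described in this hunk corresponds to the `USE_TYPE_DEFAULT` option of `CREATE EXTERNAL FILE FORMAT`. A minimal sketch with a hypothetical format name and terminators:

```sql
CREATE EXTERNAL FILE FORMAT text_file_format
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        USE_TYPE_DEFAULT = FALSE  -- FALSE: missing values become NULL instead of 0 / "" / "1900-01-01"
    )
);
```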

articles/synapse-analytics/sql/query-delta-lake-format.md

Lines changed: 6 additions & 3 deletions
@@ -5,7 +5,7 @@ services: synapse analytics
 ms.service: synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 12/06/2022
+ms.date: 02/15/2023
 author: jovanpop-msft
 ms.author: jovanpop
 ms.reviewer: sngun, wiassaf
@@ -24,7 +24,7 @@ A serverless SQL pool can read Delta Lake files that are created using Apache Spark
 Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
 
 > [!IMPORTANT]
-> Querying Delta Lake format using the serverless SQL pool is **Generally available** functionality. However, querying Spark Delta tables is still in public preview and not production ready. There are known issues that might happen if you query Delta tables created using the Spark pools. See the known issues in the [self-help page](resources-self-help-sql-on-demand.md#delta-lake).
+> Querying Delta Lake format using the serverless SQL pool is **Generally available** functionality. However, querying Spark Delta tables is still in public preview and not production ready. There are known issues that might happen if you query Delta tables created using the Spark pools. See the known issues in [Serverless SQL pool self-help](resources-self-help-sql-on-demand.md#delta-lake).
 
 ## Quickstart example
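The quickstart body itself is elided in this diff view. A minimal sketch of the kind of query the section introduces, assuming a hypothetical storage URL:

```sql
-- Read the top rows of a Delta Lake folder directly over HTTPS.
-- Account, container, and folder names are hypothetical.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://myaccount.blob.core.windows.net/mycontainer/delta-folder/',
    FORMAT = 'delta'
) AS rows;
```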

@@ -68,7 +68,8 @@ Make sure you can access your file. If your file is protected with SAS key or custom Azure identity
 > Ensure you are using a UTF-8 database collation (for example `Latin1_General_100_BIN2_UTF8`) because string values in Delta Lake files are encoded using UTF-8 encoding.
 > A mismatch between the text encoding in the Delta Lake file and the collation may cause unexpected conversion errors.
 > You can easily change the default collation of the current database using the following T-SQL statement:
-> `alter database current collate Latin1_General_100_BIN2_UTF8`
+> `ALTER DATABASE CURRENT COLLATE Latin1_General_100_BIN2_UTF8;`
+> For more information on collations, see [Collation types supported for Synapse SQL](reference-collation-types.md).
 
 ### Data source usage
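The body of the data source section is elided here as well. A sketch of what data source usage typically looks like, assuming hypothetical account and data source names:

```sql
-- Register the root of the storage once, then reference folders relative to it.
CREATE EXTERNAL DATA SOURCE DeltaLakeStorage
WITH ( LOCATION = 'https://myaccount.blob.core.windows.net/mycontainer/' );

SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'delta-folder',
    DATA_SOURCE = 'DeltaLakeStorage',
    FORMAT = 'delta'
) AS rows;
```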

@@ -132,7 +133,9 @@ With the explicit specification of the result set schema, you can minimize the type conversions
 
 
 ### Query partitioned data
+
 The data set provided in this sample is divided (partitioned) into separate subfolders.
+
 Unlike [Parquet](query-parquet-files.md), you don't need to target specific partitions using the `FILEPATH` function. The `OPENROWSET` function will identify partitioning
 columns in your Delta Lake folder structure and enable you to directly query data using these columns. This example shows fare amounts by year, month, and payment_type for the first three months of 2017.
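The example query itself is not shown in this view. A hedged sketch, reusing the hypothetical `DeltaLakeStorage` data source from above; the folder name and the fare_amount column are assumptions based on the prose:

```sql
-- Partition columns (year, month) come from the Delta folder structure;
-- no FILEPATH() calls are needed to restrict the scan to early 2017.
SELECT
    nyc.year,
    nyc.month,
    nyc.payment_type,
    SUM(nyc.fare_amount) AS total_fares
FROM OPENROWSET(
    BULK 'yellow',
    DATA_SOURCE = 'DeltaLakeStorage',
    FORMAT = 'delta'
) AS nyc
WHERE nyc.year = 2017 AND nyc.month IN (1, 2, 3)
GROUP BY nyc.year, nyc.month, nyc.payment_type
ORDER BY nyc.year, nyc.month, nyc.payment_type;
```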

articles/synapse-analytics/sql/query-parquet-files.md

Lines changed: 6 additions & 4 deletions
@@ -6,7 +6,7 @@ author: azaricstefan
 ms.service: synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 05/20/2020
+ms.date: 02/15/2023
 ms.author: stefanazaric
 ms.reviewer: sngun
 ---
@@ -36,7 +36,8 @@ Make sure that you can access this file. If your file is protected with SAS key
 > Ensure you are using a UTF-8 database collation (for example `Latin1_General_100_BIN2_UTF8`) because string values in PARQUET files are encoded using UTF-8 encoding.
 > A mismatch between the text encoding in the PARQUET file and the collation may cause unexpected conversion errors.
 > You can easily change the default collation of the current database using the following T-SQL statement:
-> `alter database current collate Latin1_General_100_BIN2_UTF8`'
+> `ALTER DATABASE CURRENT COLLATE Latin1_General_100_BIN2_UTF8;`
+> For more information on collations, see [Collation types supported for Synapse SQL](reference-collation-types.md).
 
 If you use the `Latin1_General_100_BIN2_UTF8` collation, you will get an additional performance boost compared to the other collations. The `Latin1_General_100_BIN2_UTF8` collation is compatible with parquet string sorting rules. The SQL pool is able to eliminate parts of the parquet files that will not contain data needed in the queries (file/column-segment pruning). If you use other collations, all data from the parquet files will be loaded into Synapse SQL and the filtering happens within the SQL process. The `Latin1_General_100_BIN2_UTF8` collation has an additional performance optimization that works only for parquet and Azure Cosmos DB. The downside is that you lose fine-grained comparison rules like case insensitivity.
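To make the pruning claim concrete, a hedged sketch; the storage URL and the region column are hypothetical:

```sql
-- With a BIN2 UTF-8 collation on the database (or the column), this string
-- predicate can be compared against parquet metadata, so files and column
-- segments that cannot contain 'EU' are skipped instead of being read.
SELECT *
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/mycontainer/data/*.parquet',
    FORMAT = 'PARQUET'
) AS rows
WHERE rows.region = 'EU';
```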

@@ -75,9 +76,10 @@ from openrowset(
 > Make sure that you are explicitly specifying some UTF-8 collation (for example `Latin1_General_100_BIN2_UTF8`) for all string columns in the `WITH` clause, or set some UTF-8 collation at the database level.
 > A mismatch between the text encoding in the file and the string column collation might cause unexpected conversion errors.
 > You can easily change the default collation of the current database using the following T-SQL statement:
-> `alter database current collate Latin1_General_100_BIN2_UTF8`
-> You can easily set collation on the colum types using the following definition:
+> `ALTER DATABASE CURRENT COLLATE Latin1_General_100_BIN2_UTF8;`
+> You can easily set collation on the column types, for example:
 > `geo_id varchar(6) collate Latin1_General_100_BIN2_UTF8`
+> For more information on collations, see [Collation types supported for Synapse SQL](../sql/reference-collation-types.md).
 
 In the following sections you can see how to query various types of PARQUET files.
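A hedged sketch pulling the note's pieces together; the storage URL and the population column are hypothetical, while `geo_id varchar(6)` is taken from the note itself:

```sql
SELECT rows.geo_id, rows.population
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/mycontainer/census/*.parquet',
    FORMAT = 'PARQUET'
)
WITH (
    geo_id     varchar(6) COLLATE Latin1_General_100_BIN2_UTF8, -- explicit UTF-8 collation per string column
    population bigint
) AS rows;
```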

articles/synapse-analytics/sql/reference-collation-types.md

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ The following table shows which collation types are supported by which service.
 
 
 ## Check the current collation
+
 To check the current collation for the database, you can run the following T-SQL snippet:
 
 ```sql
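The snippet body is cut off in this view. A likely form, assuming the `DATABASEPROPERTYEX` function named in the first file's hunk header; treat it as a sketch rather than the article's actual code:

```sql
-- Returns the collation of the current database, e.g. Latin1_General_100_BIN2_UTF8.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS Collation;
```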
