
Commit f83f1d3

Freshness and formatting
1 parent 132a5f3 commit f83f1d3

File tree

1 file changed: +7 -9 lines changed


articles/synapse-analytics/sql/query-delta-lake-format.md

Lines changed: 7 additions & 9 deletions
@@ -5,7 +5,7 @@ services: synapse analytics
 ms.service: azure-synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 02/15/2023
+ms.date: 12/17/2024
 author: jovanpop-msft
 ms.author: jovanpop
 ms.reviewer: whhender, wiassaf
@@ -23,7 +23,7 @@ You can learn more from the [how to query delta lake tables video](https://www.y
 The serverless SQL pool in Synapse workspace enables you to read the data stored in Delta Lake format, and serve it to reporting tools.
 A serverless SQL pool can read Delta Lake files that are created using Apache Spark, Azure Databricks, or any other producer of the Delta Lake format.
 
-Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
+Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
 
 > [!IMPORTANT]
 > Querying Delta Lake format using the serverless SQL pool is **Generally available** functionality. However, querying Spark Delta tables is still in public preview and not production ready. There are known issues that might happen if you query Delta tables created using the Spark pools. See the known issues in [Serverless SQL pool self-help](resources-self-help-sql-on-demand.md#delta-lake).
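The engineer-writes, analyst-reads workflow described in this hunk can be sketched in a Synapse notebook cell. This is a hypothetical illustration, not part of the changed article: it assumes a Spark pool where the `spark` session and Delta Lake libraries are preconfigured, and the `abfss://` path and column names are placeholders.

```python
%%pyspark
# Hypothetical sketch: a data engineer lands raw CSV data as a Delta Lake
# table that a serverless SQL pool (or reporting tool) can then query.
# The storage path below is a placeholder, not a real account.
df = spark.read.option("header", "true").csv(
    "abfss://data@contosolake.dfs.core.windows.net/raw/covid/"
)
(df.write
   .format("delta")          # write in Delta Lake format
   .mode("overwrite")
   .save("abfss://data@contosolake.dfs.core.windows.net/delta/covid/"))
```

The `save` call produces the folder layout (data files plus a `_delta_log` subfolder) that the rest of the article's `OPENROWSET` examples query.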
@@ -50,7 +50,7 @@ The URI in the `OPENROWSET` function must reference the root Delta Lake folder t
 > [!div class="mx-imgBorder"]
 >![ECDC COVID-19 Delta Lake folder](./media/shared/covid-delta-lake-studio.png)
 
-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
 
 ```python
 %%pyspark
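The "this subfolder" that the changed sentence refers to is the `_delta_log` transaction log: a folder is only a Delta Lake table if it contains one. As a plain-Python illustration (a hypothetical helper, no Spark required; the function name is ours, not the article's):

```python
from pathlib import Path
import tempfile

def looks_like_delta_table(folder: str) -> bool:
    """Heuristic sketch: a Delta Lake table folder contains a _delta_log subfolder."""
    return (Path(folder) / "_delta_log").is_dir()

# Demo: a folder with only Parquet files is not a Delta table;
# adding _delta_log is what marks it as one.
root = Path(tempfile.mkdtemp())
(root / "part-0000.parquet").touch()
print(looks_like_delta_table(str(root)))   # False: no transaction log yet
(root / "_delta_log").mkdir()
print(looks_like_delta_table(str(root)))   # True: _delta_log is present
```

This is only the structural check; the actual conversion (which creates `_delta_log` from the existing Parquet files) is what the article's truncated `%%pyspark` script performs.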
@@ -64,7 +64,7 @@ To improve the performance of your queries, consider specifying explicit types i
 > The serverless Synapse SQL pool uses schema inference to automatically determine columns and their types. The rules for schema inference are the same used for Parquet files.
 > For Delta Lake type mapping to SQL native type check [type mapping for Parquet](develop-openrowset.md#type-mapping-for-parquet).
 
-Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you will need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).
+Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you'll need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).
 
 > [!IMPORTANT]
 > Ensure you are using a UTF-8 database collation (for example `Latin1_General_100_BIN2_UTF8`) because string values in Delta Lake files are encoded using UTF-8 encoding.
@@ -80,7 +80,7 @@ The previous examples used the full path to the file. As an alternative, you can
 > [!IMPORTANT]
 > Data sources can be created only in custom databases (not in the master database or the databases replicated from Apache Spark pools).
 
-To use the samples below, you will need to complete the following step:
+To use the samples below, you'll need to complete the following step:
 1. **Create a database** with a datasource that references [NYC Yellow Taxi](https://azure.microsoft.com/services/open-datasets/catalog/nyc-taxi-limousine-commission-yellow-taxi-trip-records/) storage account.
 1. Initialize the objects by executing [setup script](https://github.com/Azure-Samples/Synapse/blob/master/SQL/Samples/LdwSample/SampleDB.sql) on the database you created in step 1. This setup script will create the data sources, database scoped credentials, and external file formats that are used in these samples.
 
@@ -172,7 +172,7 @@ The folder name in the `OPENROWSET` function (`yellow` in this example) is conca
 > [!div class="mx-imgBorder"]
 >![Yellow Taxi Delta Lake folder](./media/shared/yellow-taxi-delta-lake.png)
 
-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
 
 ```python
 %%pyspark
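The conversion script is truncated by the diff context, so as an illustration only (not the article's exact script): the in-place Parquet-to-Delta conversion is exposed in the `delta.tables` Python API as `DeltaTable.convertToDelta`, and for a partitioned folder like the yellow taxi data the partitioning columns are passed as the second argument after the table identifier. The `abfss://` path and the `puYear`/`puMonth` schema below are placeholders; this assumes a Spark pool with the Delta Lake libraries and a preconfigured `spark` session.

```python
%%pyspark
from delta.tables import DeltaTable

# Hypothetical sketch: convert an existing partitioned Parquet folder in place
# to Delta Lake format, which creates the _delta_log subfolder.
# Path and partition schema are placeholders for your own data.
DeltaTable.convertToDelta(
    spark,
    "parquet.`abfss://data@contosolake.dfs.core.windows.net/yellow/`",
    "puYear INT, puMonth INT",  # the folder's partitioning columns
)
```

Omitting the partition-schema argument on a partitioned folder would make the conversion fail or misread the layout, which is why the following hunk's context calls out that second argument.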
@@ -186,13 +186,11 @@ The second argument of `DeltaTable.convertToDeltaLake` function represents the p
 
 - Review the limitations and the known issues on [Synapse serverless SQL pool self-help page](resources-self-help-sql-on-demand.md#delta-lake).
 
-## Next steps
+## Related content
 
 Advance to the next article to learn how to [Query Parquet nested types](query-parquet-nested-types.md).
 If you want to continue building Delta Lake solution, learn how to create [views](create-use-views.md#delta-lake-views) or [external tables](create-use-external-tables.md#delta-lake-external-table) on the Delta Lake folder.
 
-## See also
-
 - [What is Delta Lake](../spark/apache-spark-what-is-delta-lake.md)
 - [Learn how to use Delta Lake in Apache Spark pools for Azure Synapse Analytics](../spark/apache-spark-delta-lake-overview.md)
 - [Azure Databricks Delta Lake best practices](/azure/databricks/delta/best-practices)
