
Commit f60bde7

Merge pull request #292092 from whhender/december-synapse-freshness
December synapse freshness part 2
2 parents fc75456 + 1461229

5 files changed: +43 -44 lines changed


articles/synapse-analytics/get-started-analyze-storage.md

Lines changed: 5 additions & 9 deletions

````diff
@@ -7,21 +7,19 @@ ms.reviewer: whhender
 ms.service: azure-synapse-analytics
 ms.subservice: workspace
 ms.topic: tutorial
-ms.date: 11/18/2022
+ms.date: 12/17/2024
 ---
 
-# Analyze data in a storage account
+# Tutorial: Analyze data in a storage account
 
 In this tutorial, you'll learn how to analyze data located in a storage account.
 
-## Overview
-
 So far, we've covered scenarios where data resides in databases in the workspace. Now we'll show you how to work with files in storage accounts. In this scenario, we'll use the primary storage account of the workspace and container that we specified when creating the workspace.
 
 * The name of the storage account: **contosolake**
 * The name of the container in the storage account: **users**
 
-### Create CSV and Parquet files in your storage account
+## Create CSV and Parquet files in your storage account
 
 Run the following code in a notebook in a new code cell. It creates a CSV file and a parquet file in the storage account.
 
@@ -36,7 +34,7 @@ df.write.mode("overwrite").csv("/NYCTaxi/PassengerCountStats_csvformat")
 df.write.mode("overwrite").parquet("/NYCTaxi/PassengerCountStats_parquetformat")
 ```
 
-### Analyze data in a storage account
+## Analyze data in a storage account
 
 You can analyze the data in your workspace default Azure Data Lake Storage (ADLS) Gen2 account or you can link an ADLS Gen2 or Blob storage account to your workspace through "**Manage**" > "**Linked Services**" > "**New**" (The next steps will refer to the primary ADLS Gen2 account).
 
@@ -69,9 +67,7 @@ You can analyze the data in your workspace default Azure Data Lake Storage (ADLS
 
 1. Run the script.
 
-
-
-## Next steps
+## Next step
 
 > [!div class="nextstepaction"]
 > [Orchestrate activities with pipelines](get-started-pipelines.md)
````
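
For context, a minimal sketch of the kind of notebook cell this hunk sits in. Only the two `df.write` lines appear in the diff; the DataFrame construction below is an assumed stand-in for the tutorial's NYC Taxi aggregation, fabricated so the cell runs on its own (`spark` is the ambient session in a Synapse notebook):

```python
%%pyspark
# Stand-in DataFrame: the real tutorial aggregates the NYC Taxi sample data;
# these rows are fabricated placeholders so the cell is self-contained.
df = spark.createDataFrame(
    [(1, 24.5), (2, 31.0), (5, 12.25)],
    ["PassengerCount", "SumTripDistance"],
)

# The two writes shown in the hunk: they land in the workspace's primary
# ADLS Gen2 container (users@contosolake) under /NYCTaxi.
df.write.mode("overwrite").csv("/NYCTaxi/PassengerCountStats_csvformat")
df.write.mode("overwrite").parquet("/NYCTaxi/PassengerCountStats_parquetformat")
```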

articles/synapse-analytics/get-started-visualize-power-bi.md

Lines changed: 15 additions & 10 deletions

````diff
@@ -4,13 +4,13 @@ description: In this tutorial, you learn how to use Power BI to visualize data i
 author: whhender
 ms.author: whhender
 ms.reviewer: whhender
-ms.date: 10/16/2023
+ms.date: 12/16/2024
 ms.service: azure-synapse-analytics
 ms.subservice: business-intelligence
 ms.topic: tutorial
 ---
 
-# Visualize data with Power BI
+# Tutorial: Visualize data with Power BI
 
 In this tutorial, you learn how to create a Power BI workspace, link your Azure Synapse workspace, and create a Power BI data set that utilizes data in your Azure Synapse workspace.
 
@@ -23,24 +23,29 @@ From the NYC Taxi data, we created aggregated datasets in two tables:
 - **nyctaxi.passengercountstats**
 - **SQLDB1.dbo.PassengerCountStats**
 
-You can link a Power BI workspace to your Azure Synapse workspace. This capability allows you to easily get data into your Power BI workspace. You can edit your Power BI reports directly in your Azure Synapse workspace.
+You can link a Power BI workspace to your Azure Synapse workspace. This capability allows you to easily get data into your Power BI workspace. You can edit your Power BI reports directly in your Azure Synapse workspace.
 
-### Create a Power BI workspace
+## Create a Power BI workspace
 
 1. Sign in to [powerbi.microsoft.com](https://powerbi.microsoft.com/).
 1. Select **Workspaces**, then select **Create a workspace**. Create a new Power BI workspace named **NYCTaxiWorkspace1** or similar, since this name must be unique.
 
-### Link your Azure Synapse workspace to your new Power BI workspace
+## Link your Azure Synapse workspace to your new Power BI workspace
 
 1. In Synapse Studio, go to **Manage** > **Linked Services**.
 1. Select **New** > **Connect to Power BI**.
 1. Set **Name** to **NYCTaxiWorkspace1** or similar.
 1. Set **Workspace name** to the Power BI workspace you created earlier, similar to **NYCTaxiWorkspace1**.
+
+>[!TIP]
+>If the workspace name doesn't load, select **Edit** and then enter your workspace ID. You can find the ID in the URL for the PowerBI workspace: `https://msit.powerbi.com/groups/<workspace id>/`
+
 1. Select **Create**.
+1. Publish to create the linked service.
 
-### Create a Power BI dataset that uses data in your Azure Synapse workspace
+## Create a Power BI dataset that uses data in your Azure Synapse workspace
 
-1. In Synapse Studio, go to **Develop** > **Power BI**.
+1. In Synapse Studio, go to **Develop** > **Power BI**. (If you don't see Power BI, refresh the page.)
 1. Go to **NYCTaxiWorkspace1** > **Power BI datasets** and select **New Power BI dataset**. Select **Start**.
 1. Select the **SQLPOOL1** data source, select **Continue**.
 1. Select **Download** to download the `.pbids` file for your `NYCTaxiWorkspace1SQLPOOL1.pbids` file. Select **Continue**.
@@ -59,17 +64,17 @@ You can link a Power BI workspace to your Azure Synapse workspace. This capabili
 1. Select **Save** to save your changes.
 1. Choose the file name `PassengerAnalysis.pbix`, and then select **Save**.
 1. In the **Publish to Power BI** window, under **Select a destination**, choose your `NYCTaxiWorkspace1`, and then select **Select**.
-1. Wait for publishing to finish.
+1. Wait for publishing to finish.
 
-### Configure authentication for your dataset
+## Configure authentication for your dataset
 
 1. Open [powerbi.microsoft.com](https://powerbi.microsoft.com/) and **Sign in**.
 1. On the left side, under **Workspaces**, select the **NYCTaxiWorkspace1** workspace.
 1. Inside that workspace, locate a dataset called **Passenger Analysis** and a report called **Passenger Analysis**.
 1. Hover over the **PassengerAnalysis** dataset, select the ellipsis (...) button, and then select **Settings**.
 1. In **Data source credentials**, select **Edit**, set the **Authentication method** to **OAuth2**, and then select **Sign in**.
 
-### Edit a report in Synapse Studio
+## Edit a report in Synapse Studio
 
 1. Go back to Synapse Studio and select **Close and refresh**.
 1. Go to the **Develop** hub.
````
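
For context on the `.pbids` download step above: a `.pbids` file is a small JSON pointer that tells Power BI Desktop where to connect. Here's a hedged sketch of writing one by hand; the field layout follows the published `.pbids` format as I understand it, and the server address is a placeholder, not necessarily what Synapse Studio emits:

```python
import json

# Assumed .pbids layout ("tds" protocol for a SQL endpoint). The server name
# is a placeholder for the workspace's dedicated SQL pool endpoint.
pbids = {
    "version": "0.1",
    "connections": [
        {
            "details": {
                "protocol": "tds",
                "address": {
                    "server": "<your-workspace>.sql.azuresynapse.net",
                    "database": "SQLPOOL1",
                },
            }
        }
    ],
}

# Opening this file in Power BI Desktop pre-fills the connection dialog.
with open("NYCTaxiWorkspace1SQLPOOL1.pbids", "w") as f:
    json.dump(pbids, f, indent=2)
```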

articles/synapse-analytics/quickstart-create-workspace.md

Lines changed: 7 additions & 7 deletions

````diff
@@ -5,21 +5,21 @@ author: whhender
 ms.service: azure-synapse-analytics
 ms.topic: quickstart
 ms.subservice: workspace
-ms.date: 03/23/2022
+ms.date: 12/16/2024
 ms.author: whhender
 ms.reviewer: whhender
 ms.custom: subject-rbac-steps, mode-other
 ---
 
 # Quickstart: Create an Azure Synapse Analytics workspace
 
-This quickstart describes the steps to create an Azure Synapse Analytics workspace by using the Azure portal.
+This quickstart describes the steps to create an Azure Synapse Analytics workspace using the Azure portal.
 
 ## Create an Azure Synapse Analytics workspace
 
 1. Open the [Azure portal](https://portal.azure.com), and at the top, search for **Synapse**.
 1. In the search results, under **Services**, select **Azure Synapse Analytics**.
-1. Select **Add** to create a workspace.
+1. Select **Create** to create a workspace.
 1. On the **Basics** tab, give the workspace a unique name. We use **mysworkspace** in this document.
 1. You need an Azure Data Lake Storage Gen2 account to create a workspace. The simplest choice is to create a new one. If you want to reuse an existing one, you need to perform extra configuration:
 
@@ -67,7 +67,7 @@ Managed identities for your Azure Synapse Analytics workspace might already have
 1. Select **Access control (IAM)**.
 1. Select **Add** > **Add role assignment** to open the **Add role assignment** page.
 1. Assign the following role. For more information, see [Assign Azure roles by using the Azure portal](../role-based-access-control/role-assignments-portal.yml).
-
+
 | Setting | Value |
 | --- | --- |
 | Role | Storage Blob Data Contributor |
@@ -82,6 +82,6 @@ Managed identities for your Azure Synapse Analytics workspace might already have
 
 ## Related content
 
-* [Create a dedicated SQL pool](quickstart-create-sql-pool-studio.md)
-* [Create a serverless Apache Spark pool](quickstart-create-apache-spark-pool-portal.md)
-* [Use a serverless SQL pool](quickstart-sql-on-demand.md)
+- [Create a dedicated SQL pool](quickstart-create-sql-pool-studio.md)
+- [Create a serverless Apache Spark pool](quickstart-create-apache-spark-pool-portal.md)
+- [Use a serverless SQL pool](quickstart-sql-on-demand.md)
````
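
The role-assignment table in the second hunk corresponds to a single ARM role assignment. A hedged sketch with the Azure SDK for Python follows; the scope, principal object ID, and the built-in role GUID are placeholders or values to verify against your environment, and the quickstart itself only documents the portal route:

```python
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

SUB = "<subscription-id>"  # placeholder
# Scope: the ADLS Gen2 account the workspace uses (placeholders throughout).
scope = (
    f"/subscriptions/{SUB}/resourceGroups/<resource-group>"
    "/providers/Microsoft.Storage/storageAccounts/<storage-account>"
)
# Built-in "Storage Blob Data Contributor" role definition (verify the GUID).
role_definition_id = (
    f"/subscriptions/{SUB}/providers/Microsoft.Authorization/roleDefinitions/"
    "ba92f5b4-2d11-453d-a403-e96b0029c9fe"
)

client = AuthorizationManagementClient(DefaultAzureCredential(), SUB)
client.role_assignments.create(
    scope,
    str(uuid.uuid4()),  # role assignment names are new GUIDs
    RoleAssignmentCreateParameters(
        role_definition_id=role_definition_id,
        principal_id="<workspace-managed-identity-object-id>",
        principal_type="ServicePrincipal",  # managed identities use this type
    ),
)
```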

articles/synapse-analytics/sql/develop-tables-data-types.md

Lines changed: 9 additions & 9 deletions

````diff
@@ -4,10 +4,10 @@ description: Recommendations for defining table data types in Synapse SQL.
 author: filippopovic
 ms.author: fipopovi
 ms.reviewer: whhender
-ms.date: 04/15/2020
+ms.date: 12/17/2024
 ms.service: azure-synapse-analytics
 ms.subservice: sql
-ms.topic: conceptual
+ms.topic: concept-article
 ---
 
 # Table data types in Synapse SQL
@@ -16,7 +16,7 @@ In this article, you'll find recommendations for defining table data types in Sy
 
 ## Data types
 
-Synapse SQL Dedicated Pool supports the most commonly used data types. For a list of the supported data types, see [data types](/sql/t-sql/statements/create-table-azure-sql-data-warehouse#DataTypes&preserve-view=true) in the CREATE TABLE statement. For Synapse SQL Serverless please refer to article [Query storage files with serverless SQL pool in Azure Synapse Analytics](./query-data-storage.md) and [How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics](./develop-openrowset.md)
+Synapse SQL Dedicated Pool supports the most commonly used data types. For a list of the supported data types, see [data types](/sql/t-sql/statements/create-table-azure-sql-data-warehouse#DataTypes&preserve-view=true) in the CREATE TABLE statement. For Synapse SQL Serverless, refer to article [Query storage files with serverless SQL pool in Azure Synapse Analytics](./query-data-storage.md) and [How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics](./develop-openrowset.md)
 
 ## Minimize row length
 
@@ -25,14 +25,14 @@ Minimizing the size of data types shortens the row length, which leads to better
 - Avoid defining character columns with a large default length. For example, if the longest value is 25 characters, then define your column as VARCHAR(25).
 - Avoid using NVARCHAR when you only need VARCHAR.
 - When possible, use NVARCHAR(4000) or VARCHAR(8000) instead of NVARCHAR(MAX) or VARCHAR(MAX).
-- Avoid using floats and decimals with 0 (zero) scale. These should be TINYINT, SMALLINT, INT or BIGINT.
+- Avoid using floats and decimals with 0 (zero) scale. These should be TINYINT, SMALLINT, INT, or BIGINT.
 
 > [!NOTE]
 > If you are using PolyBase external tables to load your Synapse SQL tables, the defined length of the table row cannot exceed 1 MB. When a row with variable-length data exceeds 1 MB, you can load the row with BCP, but not with PolyBase.
 
 ## Identify unsupported data types
 
-If you are migrating your database from another SQL database, you might encounter data types that are not supported in Synapse SQL. Use this query to discover unsupported data types in your existing SQL schema.
+If you're migrating your database from another SQL database, you might encounter data types that aren't supported in Synapse SQL. Use this query to discover unsupported data types in your existing SQL schema.
 
 ```sql
 SELECT t.[name], c.[name], c.[system_type_id], c.[user_type_id], y.[is_user_defined], y.[name]
@@ -45,7 +45,7 @@ WHERE y.[name] IN ('geography','geometry','hierarchyid','image','text','ntext','
 
 ## <a name="unsupported-data-types"></a>Workarounds for unsupported data types
 
-The following list shows the data types that Synapse SQL does not support and gives alternatives that you can use instead of the unsupported data types.
+The following list shows the data types that Synapse SQL doesn't support and gives alternatives that you can use instead of the unsupported data types.
 
 | Unsupported data type | Workaround |
 | --- | --- |
@@ -57,11 +57,11 @@ The following list shows the data types that Synapse SQL does not support and gi
 | [ntext](/sql/t-sql/data-types/ntext-text-and-image-transact-sql?view=azure-sqldw-latest&preserve-view=true) |[nvarchar](/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=azure-sqldw-latest&preserve-view=true) |
 | [sql_variant](/sql/t-sql/data-types/sql-variant-transact-sql?view=azure-sqldw-latest&preserve-view=true) |Split column into several strongly typed columns. |
 | [table](/sql/t-sql/data-types/table-transact-sql?view=azure-sqldw-latest&preserve-view=true) |Convert to temporary tables or consider storing data to storage using [CETAS](../sql/develop-tables-cetas.md). |
-| [timestamp](/sql/t-sql/data-types/date-and-time-types) |Rework code to use [datetime2](/sql/t-sql/data-types/datetime2-transact-sql?view=azure-sqldw-latest&preserve-view=true) and the [CURRENT_TIMESTAMP](/sql/t-sql/functions/current-timestamp-transact-sql?view=azure-sqldw-latest&preserve-view=true) function. Only constants are supported as defaults, therefore current_timestamp cannot be defined as a default constraint. If you need to migrate row version values from a timestamp typed column, then use [BINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) or [VARBINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) for NOT NULL or NULL row version values. |
+| [timestamp](/sql/t-sql/data-types/date-and-time-types) |Rework code to use [datetime2](/sql/t-sql/data-types/datetime2-transact-sql?view=azure-sqldw-latest&preserve-view=true) and the [CURRENT_TIMESTAMP](/sql/t-sql/functions/current-timestamp-transact-sql?view=azure-sqldw-latest&preserve-view=true) function. Only constants are supported as defaults, therefore current_timestamp can't be defined as a default constraint. If you need to migrate row version values from a timestamp typed column, then use [BINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) or [VARBINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) for NOT NULL or NULL row version values. |
 | [xml](/sql/t-sql/xml/xml-transact-sql?view=azure-sqldw-latest&preserve-view=true) |[varchar](/sql/t-sql/data-types/char-and-varchar-transact-sql?view=azure-sqldw-latest&preserve-view=true) |
 | [user-defined type](/sql/relational-databases/native-client/features/using-user-defined-types) |Convert back to the native data type when possible. |
 | default values | Default values support literals and constants only. |
 
-## Next steps
+## Related content
 
-For more information on developing tables, see [Table Overview](develop-overview.md).
+For more information on developing tables, see [the development overview](develop-overview.md).
````
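
For convenience, a sketch of running the article's discovery query from Python with pyodbc. The connection string is a placeholder, and the two `JOIN` clauses are a reconstruction (the hunks only show the query's `SELECT` and `WHERE` lines), so verify them against the full article before relying on the output:

```python
import pyodbc

# The SELECT and WHERE lines come from the diff; the JOINs are an assumed
# reconstruction of the elided middle of the query.
QUERY = """
SELECT t.[name], c.[name], c.[system_type_id], c.[user_type_id], y.[is_user_defined], y.[name]
FROM sys.tables t
JOIN sys.columns c ON t.[object_id] = c.[object_id]
JOIN sys.types y ON c.[user_type_id] = y.[user_type_id]
WHERE y.[name] IN ('geography','geometry','hierarchyid','image','text','ntext',
                   'sql_variant','timestamp','xml')
"""

# Placeholder connection details for the SQL pool being migrated from.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-server>;Database=<your-database>;"
    "UID=<user>;PWD=<password>"
)
for table, column, *_ in conn.cursor().execute(QUERY):
    print(f"{table}.{column} uses an unsupported type")
```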

articles/synapse-analytics/sql/query-delta-lake-format.md

Lines changed: 7 additions & 9 deletions

````diff
@@ -5,7 +5,7 @@ services: synapse analytics
 ms.service: azure-synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 02/15/2023
+ms.date: 12/17/2024
 author: jovanpop-msft
 ms.author: jovanpop
 ms.reviewer: whhender, wiassaf
@@ -23,7 +23,7 @@ You can learn more from the [how to query delta lake tables video](https://www.y
 The serverless SQL pool in Synapse workspace enables you to read the data stored in Delta Lake format, and serve it to reporting tools.
 A serverless SQL pool can read Delta Lake files that are created using Apache Spark, Azure Databricks, or any other producer of the Delta Lake format.
 
-Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
+Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
 
 > [!IMPORTANT]
 > Querying Delta Lake format using the serverless SQL pool is **Generally available** functionality. However, querying Spark Delta tables is still in public preview and not production ready. There are known issues that might happen if you query Delta tables created using the Spark pools. See the known issues in [Serverless SQL pool self-help](resources-self-help-sql-on-demand.md#delta-lake).
@@ -50,7 +50,7 @@ The URI in the `OPENROWSET` function must reference the root Delta Lake folder t
 > [!div class="mx-imgBorder"]
 >![ECDC COVID-19 Delta Lake folder](./media/shared/covid-delta-lake-studio.png)
 
-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
 
 ```python
 %%pyspark
@@ -64,7 +64,7 @@ To improve the performance of your queries, consider specifying explicit types i
 > The serverless Synapse SQL pool uses schema inference to automatically determine columns and their types. The rules for schema inference are the same used for Parquet files.
 > For Delta Lake type mapping to SQL native type check [type mapping for Parquet](develop-openrowset.md#type-mapping-for-parquet).
 
-Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you will need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).
+Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you'll need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).
 
 > [!IMPORTANT]
 > Ensure you are using a UTF-8 database collation (for example `Latin1_General_100_BIN2_UTF8`) because string values in Delta Lake files are encoded using UTF-8 encoding.
@@ -80,7 +80,7 @@ The previous examples used the full path to the file. As an alternative, you can
 > [!IMPORTANT]
 > Data sources can be created only in custom databases (not in the master database or the databases replicated from Apache Spark pools).
 
-To use the samples below, you will need to complete the following step:
+To use the samples below, you'll need to complete the following step:
 1. **Create a database** with a datasource that references [NYC Yellow Taxi](https://azure.microsoft.com/services/open-datasets/catalog/nyc-taxi-limousine-commission-yellow-taxi-trip-records/) storage account.
 1. Initialize the objects by executing [setup script](https://github.com/Azure-Samples/Synapse/blob/master/SQL/Samples/LdwSample/SampleDB.sql) on the database you created in step 1. This setup script will create the data sources, database scoped credentials, and external file formats that are used in these samples.
 
@@ -172,7 +172,7 @@ The folder name in the `OPENROWSET` function (`yellow` in this example) is conca
 > [!div class="mx-imgBorder"]
 >![Yellow Taxi Delta Lake folder](./media/shared/yellow-taxi-delta-lake.png)
 
-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
 
 ```python
 %%pyspark
@@ -186,13 +186,11 @@ The second argument of `DeltaTable.convertToDeltaLake` function represents the p
 
 - Review the limitations and the known issues on [Synapse serverless SQL pool self-help page](resources-self-help-sql-on-demand.md#delta-lake).
 
-## Next steps
+## Related content
 
 Advance to the next article to learn how to [Query Parquet nested types](query-parquet-nested-types.md).
 If you want to continue building Delta Lake solution, learn how to create [views](create-use-views.md#delta-lake-views) or [external tables](create-use-external-tables.md#delta-lake-external-table) on the Delta Lake folder.
 
-## See also
-
 - [What is Delta Lake](../spark/apache-spark-what-is-delta-lake.md)
 - [Learn how to use Delta Lake in Apache Spark pools for Azure Synapse Analytics](../spark/apache-spark-delta-lake-overview.md)
 - [Azure Databricks Delta Lake best practices](/azure/databricks/delta/best-practices)
````
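
Both conversion hunks cut off right after `%%pyspark`, so here's a hedged sketch of the conversion those paragraphs describe, assuming the open-source delta-spark API (`DeltaTable.convertToDelta`; the article's prose calls it `convertToDeltaLake`) with a placeholder path and partition columns:

```python
%%pyspark
from delta.tables import DeltaTable

# Convert a plain Parquet folder in place to Delta Lake format, which adds
# the _delta_log subfolder. The abfss path and the partition schema below
# are placeholders, not the article's exact values.
DeltaTable.convertToDelta(
    spark,
    "parquet.`abfss://users@contosolake.dfs.core.windows.net/NYCTaxi/yellow`",
    "puYear INT, puMonth INT",  # the partitioning columns, if any
)
```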
