articles/synapse-analytics/get-started-analyze-storage.md (5 additions, 9 deletions)
@@ -7,21 +7,19 @@ ms.reviewer: whhender
 ms.service: azure-synapse-analytics
 ms.subservice: workspace
 ms.topic: tutorial
-ms.date: 11/18/2022
+ms.date: 12/17/2024
 ---

-# Analyze data in a storage account
+# Tutorial: Analyze data in a storage account

 In this tutorial, you'll learn how to analyze data located in a storage account.

-## Overview
-
 So far, we've covered scenarios where data resides in databases in the workspace. Now we'll show you how to work with files in storage accounts. In this scenario, we'll use the primary storage account of the workspace and container that we specified when creating the workspace.

 * The name of the storage account: **contosolake**
 * The name of the container in the storage account: **users**

-### Create CSV and Parquet files in your storage account
+## Create CSV and Parquet files in your storage account

 Run the following code in a notebook in a new code cell. It creates a CSV file and a parquet file in the storage account.
 You can analyze the data in your workspace default Azure Data Lake Storage (ADLS) Gen2 account or you can link an ADLS Gen2 or Blob storage account to your workspace through "**Manage**" > "**Linked Services**" > "**New**" (The next steps will refer to the primary ADLS Gen2 account).
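The notebook cell referenced here isn't shown in the diff. A minimal sketch of such a cell, assuming the tutorial's **contosolake** storage account and **users** container, with illustrative folder names, writes a small DataFrame out once as CSV and once as Parquet:

```python
%%pyspark
# Hypothetical sample data; the tutorial's actual cell may differ.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# Primary ADLS Gen2 account and container from this tutorial; folder names are illustrative.
base = "abfss://users@contosolake.dfs.core.windows.net/NYCTaxi"

# Write the same DataFrame in both formats.
df.write.mode("overwrite").option("header", "true").csv(base + "/csv")
df.write.mode("overwrite").parquet(base + "/parquet")
```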
@@ -69,9 +67,7 @@ You can analyze the data in your workspace default Azure Data Lake Storage (ADLS

 1. Run the script.

-
-
-## Next steps
+## Next step

 > [!div class="nextstepaction"]
 > [Orchestrate activities with pipelines](get-started-pipelines.md)
articles/synapse-analytics/get-started-visualize-power-bi.md (15 additions, 10 deletions)
@@ -4,13 +4,13 @@ description: In this tutorial, you learn how to use Power BI to visualize data i
 author: whhender
 ms.author: whhender
 ms.reviewer: whhender
-ms.date: 10/16/2023
+ms.date: 12/16/2024
 ms.service: azure-synapse-analytics
 ms.subservice: business-intelligence
 ms.topic: tutorial
 ---

-# Visualize data with Power BI
+# Tutorial: Visualize data with Power BI

 In this tutorial, you learn how to create a Power BI workspace, link your Azure Synapse workspace, and create a Power BI data set that utilizes data in your Azure Synapse workspace.

@@ -23,24 +23,29 @@ From the NYC Taxi data, we created aggregated datasets in two tables:
 - **nyctaxi.passengercountstats**
 - **SQLDB1.dbo.PassengerCountStats**

-You can link a Power BI workspace to your Azure Synapse workspace. This capability allows you to easily get data into your Power BI workspace. You can edit your Power BI reports directly in your Azure Synapse workspace.
+You can link a Power BI workspace to your Azure Synapse workspace. This capability allows you to easily get data into your Power BI workspace. You can edit your Power BI reports directly in your Azure Synapse workspace.

-### Create a Power BI workspace
+## Create a Power BI workspace

 1. Sign in to [powerbi.microsoft.com](https://powerbi.microsoft.com/).
 1. Select **Workspaces**, then select **Create a workspace**. Create a new Power BI workspace named **NYCTaxiWorkspace1** or similar, since this name must be unique.

-### Link your Azure Synapse workspace to your new Power BI workspace
+## Link your Azure Synapse workspace to your new Power BI workspace

 1. In Synapse Studio, go to **Manage** > **Linked Services**.
 1. Select **New** > **Connect to Power BI**.
 1. Set **Name** to **NYCTaxiWorkspace1** or similar.
 1. Set **Workspace name** to the Power BI workspace you created earlier, similar to **NYCTaxiWorkspace1**.
+
+   > [!TIP]
+   > If the workspace name doesn't load, select **Edit** and then enter your workspace ID. You can find the ID in the URL for the PowerBI workspace: `https://msit.powerbi.com/groups/<workspace id>/`
+
 1. Select **Create**.
+1. Publish to create the linked service.

-### Create a Power BI dataset that uses data in your Azure Synapse workspace
+## Create a Power BI dataset that uses data in your Azure Synapse workspace

-1. In Synapse Studio, go to **Develop** > **Power BI**.
+1. In Synapse Studio, go to **Develop** > **Power BI**. (If you don't see Power BI, refresh the page.)
 1. Go to **NYCTaxiWorkspace1** > **Power BI datasets** and select **New Power BI dataset**. Select **Start**.
 1. Select the **SQLPOOL1** data source, select **Continue**.
 1. Select **Download** to download the `.pbids` file for your `NYCTaxiWorkspace1SQLPOOL1.pbids` file. Select **Continue**.
@@ -59,17 +64,17 @@ You can link a Power BI workspace to your Azure Synapse workspace. This capabili
 1. Select **Save** to save your changes.
 1. Choose the file name `PassengerAnalysis.pbix`, and then select **Save**.
 1. In the **Publish to Power BI** window, under **Select a destination**, choose your `NYCTaxiWorkspace1`, and then select **Select**.
-1. Wait for publishing to finish.
+1. Wait for publishing to finish.

-### Configure authentication for your dataset
+## Configure authentication for your dataset

 1. Open [powerbi.microsoft.com](https://powerbi.microsoft.com/) and **Sign in**.
 1. On the left side, under **Workspaces**, select the **NYCTaxiWorkspace1** workspace.
 1. Inside that workspace, locate a dataset called **Passenger Analysis** and a report called **Passenger Analysis**.
 1. Hover over the **PassengerAnalysis** dataset, select the ellipsis (...) button, and then select **Settings**.
 1. In **Data source credentials**, select **Edit**, set the **Authentication method** to **OAuth2**, and then select **Sign in**.

-### Edit a report in Synapse Studio
+## Edit a report in Synapse Studio

 1. Go back to Synapse Studio and select **Close and refresh**.
articles/synapse-analytics/quickstart-create-workspace.md (7 additions, 7 deletions)
@@ -5,21 +5,21 @@ author: whhender
 ms.service: azure-synapse-analytics
 ms.topic: quickstart
 ms.subservice: workspace
-ms.date: 03/23/2022
+ms.date: 12/16/2024
 ms.author: whhender
 ms.reviewer: whhender
 ms.custom: subject-rbac-steps, mode-other
 ---

 # Quickstart: Create an Azure Synapse Analytics workspace

-This quickstart describes the steps to create an Azure Synapse Analytics workspace by using the Azure portal.
+This quickstart describes the steps to create an Azure Synapse Analytics workspace using the Azure portal.

 ## Create an Azure Synapse Analytics workspace

 1. Open the [Azure portal](https://portal.azure.com), and at the top, search for **Synapse**.
 1. In the search results, under **Services**, select **Azure Synapse Analytics**.
-1. Select **Add** to create a workspace.
+1. Select **Create** to create a workspace.
 1. On the **Basics** tab, give the workspace a unique name. We use **mysworkspace** in this document.
 1. You need an Azure Data Lake Storage Gen2 account to create a workspace. The simplest choice is to create a new one. If you want to reuse an existing one, you need to perform extra configuration:
67
 1. Select **Access control (IAM)**.
 1. Select **Add** > **Add role assignment** to open the **Add role assignment** page.
 1. Assign the following role. For more information, see [Assign Azure roles by using the Azure portal](../role-based-access-control/role-assignments-portal.yml).
-
+

 | Setting | Value |
 | --- | --- |
 | Role | Storage Blob Data Contributor |
@@ -82,6 +82,6 @@ Managed identities for your Azure Synapse Analytics workspace might already have

 ## Related content

-* [Create a dedicated SQL pool](quickstart-create-sql-pool-studio.md)
-* [Create a serverless Apache Spark pool](quickstart-create-apache-spark-pool-portal.md)
-* [Use a serverless SQL pool](quickstart-sql-on-demand.md)
+- [Create a dedicated SQL pool](quickstart-create-sql-pool-studio.md)
+- [Create a serverless Apache Spark pool](quickstart-create-apache-spark-pool-portal.md)
+- [Use a serverless SQL pool](quickstart-sql-on-demand.md)
articles/synapse-analytics/sql/develop-tables-data-types.md (9 additions, 9 deletions)
@@ -4,10 +4,10 @@ description: Recommendations for defining table data types in Synapse SQL.
 author: filippopovic
 ms.author: fipopovi
 ms.reviewer: whhender
-ms.date: 04/15/2020
+ms.date: 12/17/2024
 ms.service: azure-synapse-analytics
 ms.subservice: sql
-ms.topic: conceptual
+ms.topic: concept-article
 ---

 # Table data types in Synapse SQL
@@ -16,7 +16,7 @@ In this article, you'll find recommendations for defining table data types in Sy

 ## Data types

-Synapse SQL Dedicated Pool supports the most commonly used data types. For a list of the supported data types, see [data types](/sql/t-sql/statements/create-table-azure-sql-data-warehouse#DataTypes&preserve-view=true) in the CREATE TABLE statement. For Synapse SQL Serverless please refer to article [Query storage files with serverless SQL pool in Azure Synapse Analytics](./query-data-storage.md) and [How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics](./develop-openrowset.md)
+Synapse SQL Dedicated Pool supports the most commonly used data types. For a list of the supported data types, see [data types](/sql/t-sql/statements/create-table-azure-sql-data-warehouse#DataTypes&preserve-view=true) in the CREATE TABLE statement. For Synapse SQL Serverless, refer to article [Query storage files with serverless SQL pool in Azure Synapse Analytics](./query-data-storage.md) and [How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics](./develop-openrowset.md)

 ## Minimize row length

@@ -25,14 +25,14 @@ Minimizing the size of data types shortens the row length, which leads to better
 - Avoid defining character columns with a large default length. For example, if the longest value is 25 characters, then define your column as VARCHAR(25).
 - Avoid using NVARCHAR when you only need VARCHAR.
 - When possible, use NVARCHAR(4000) or VARCHAR(8000) instead of NVARCHAR(MAX) or VARCHAR(MAX).
-- Avoid using floats and decimals with 0 (zero) scale. These should be TINYINT, SMALLINT, INT or BIGINT.
+- Avoid using floats and decimals with 0 (zero) scale. These should be TINYINT, SMALLINT, INT, or BIGINT.

 > [!NOTE]
 > If you are using PolyBase external tables to load your Synapse SQL tables, the defined length of the table row cannot exceed 1 MB. When a row with variable-length data exceeds 1 MB, you can load the row with BCP, but not with PolyBase.

 ## Identify unsupported data types

-If you are migrating your database from another SQL database, you might encounter data types that are not supported in Synapse SQL. Use this query to discover unsupported data types in your existing SQL schema.
+If you're migrating your database from another SQL database, you might encounter data types that aren't supported in Synapse SQL. Use this query to discover unsupported data types in your existing SQL schema.
@@ -45,7 +45,7 @@ WHERE y.[name] IN ('geography','geometry','hierarchyid','image','text','ntext','

 ## <a name="unsupported-data-types"></a>Workarounds for unsupported data types

-The following list shows the data types that Synapse SQL does not support and gives alternatives that you can use instead of the unsupported data types.
+The following list shows the data types that Synapse SQL doesn't support and gives alternatives that you can use instead of the unsupported data types.

 | Unsupported data type | Workaround |
 | --- | --- |
@@ -57,11 +57,11 @@ The following list shows the data types that Synapse SQL does not support and gi
 |[sql_variant](/sql/t-sql/data-types/sql-variant-transact-sql?view=azure-sqldw-latest&preserve-view=true)|Split column into several strongly typed columns. |
 |[table](/sql/t-sql/data-types/table-transact-sql?view=azure-sqldw-latest&preserve-view=true)|Convert to temporary tables or consider storing data to storage using [CETAS](../sql/develop-tables-cetas.md). |
-|[timestamp](/sql/t-sql/data-types/date-and-time-types)|Rework code to use [datetime2](/sql/t-sql/data-types/datetime2-transact-sql?view=azure-sqldw-latest&preserve-view=true) and the [CURRENT_TIMESTAMP](/sql/t-sql/functions/current-timestamp-transact-sql?view=azure-sqldw-latest&preserve-view=true) function. Only constants are supported as defaults, therefore current_timestamp cannot be defined as a default constraint. If you need to migrate row version values from a timestamp typed column, then use [BINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) or [VARBINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) for NOT NULL or NULL row version values. |
+|[timestamp](/sql/t-sql/data-types/date-and-time-types)|Rework code to use [datetime2](/sql/t-sql/data-types/datetime2-transact-sql?view=azure-sqldw-latest&preserve-view=true) and the [CURRENT_TIMESTAMP](/sql/t-sql/functions/current-timestamp-transact-sql?view=azure-sqldw-latest&preserve-view=true) function. Only constants are supported as defaults, therefore current_timestamp can't be defined as a default constraint. If you need to migrate row version values from a timestamp typed column, then use [BINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) or [VARBINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) for NOT NULL or NULL row version values. |
articles/synapse-analytics/sql/query-delta-lake-format.md (7 additions, 9 deletions)
@@ -5,7 +5,7 @@ services: synapse analytics
 ms.service: azure-synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 02/15/2023
+ms.date: 12/17/2024
 author: jovanpop-msft
 ms.author: jovanpop
 ms.reviewer: whhender, wiassaf
@@ -23,7 +23,7 @@ You can learn more from the [how to query delta lake tables video](https://www.y
 The serverless SQL pool in Synapse workspace enables you to read the data stored in Delta Lake format, and serve it to reporting tools.
 A serverless SQL pool can read Delta Lake files that are created using Apache Spark, Azure Databricks, or any other producer of the Delta Lake format.

-Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
+Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.

 > [!IMPORTANT]
 > Querying Delta Lake format using the serverless SQL pool is **Generally available** functionality. However, querying Spark Delta tables is still in public preview and not production ready. There are known issues that might happen if you query Delta tables created using the Spark pools. See the known issues in [Serverless SQL pool self-help](resources-self-help-sql-on-demand.md#delta-lake).
@@ -50,7 +50,7 @@ The URI in the `OPENROWSET` function must reference the root Delta Lake folder t
 > [!div class="mx-imgBorder"]
 >

-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:

 ```python
 %%pyspark
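 # The rest of this cell isn't shown in the diff. A minimal sketch of such a
 # conversion -- assuming the delta-spark package available on Synapse Spark
 # pools, and an illustrative folder path -- might be:
 from delta.tables import DeltaTable

 DeltaTable.convertToDelta(spark, "parquet.`abfss://users@contosolake.dfs.core.windows.net/yellow`")
 ```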
@@ -64,7 +64,7 @@ To improve the performance of your queries, consider specifying explicit types i
 > The serverless Synapse SQL pool uses schema inference to automatically determine columns and their types. The rules for schema inference are the same used for Parquet files.
 > For Delta Lake type mapping to SQL native type check [type mapping for Parquet](develop-openrowset.md#type-mapping-for-parquet).

-Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you will need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).
+Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you'll need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).

 > [!IMPORTANT]
 > Ensure you are using a UTF-8 database collation (for example `Latin1_General_100_BIN2_UTF8`) because string values in Delta Lake files are encoded using UTF-8 encoding.
@@ -80,7 +80,7 @@ The previous examples used the full path to the file. As an alternative, you can
 > [!IMPORTANT]
 > Data sources can be created only in custom databases (not in the master database or the databases replicated from Apache Spark pools).

-To use the samples below, you will need to complete the following step:
+To use the samples below, you'll need to complete the following step:

 1. **Create a database** with a datasource that references [NYC Yellow Taxi](https://azure.microsoft.com/services/open-datasets/catalog/nyc-taxi-limousine-commission-yellow-taxi-trip-records/) storage account.
 1. Initialize the objects by executing [setup script](https://github.com/Azure-Samples/Synapse/blob/master/SQL/Samples/LdwSample/SampleDB.sql) on the database you created in step 1. This setup script will create the data sources, database scoped credentials, and external file formats that are used in these samples.
@@ -172,7 +172,7 @@ The folder name in the `OPENROWSET` function (`yellow` in this example) is conca
 > [!div class="mx-imgBorder"]
 >

-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:

 ```python
 %%pyspark
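 # Elided here as well. A sketch of the partitioned variant -- the partition
 # schema argument describes the year/month folder partitioning, and the
 # path is illustrative:
 from delta.tables import DeltaTable

 DeltaTable.convertToDelta(spark, "parquet.`abfss://users@contosolake.dfs.core.windows.net/yellow`", "year INT, month INT")
 ```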
@@ -186,13 +186,11 @@ The second argument of `DeltaTable.convertToDeltaLake` function represents the p

 - Review the limitations and the known issues on [Synapse serverless SQL pool self-help page](resources-self-help-sql-on-demand.md#delta-lake).

-## Next steps
+## Related content

 Advance to the next article to learn how to [Query Parquet nested types](query-parquet-nested-types.md).
 If you want to continue building Delta Lake solution, learn how to create [views](create-use-views.md#delta-lake-views) or [external tables](create-use-external-tables.md#delta-lake-external-table) on the Delta Lake folder.

-## See also
-
 - [What is Delta Lake](../spark/apache-spark-what-is-delta-lake.md)
 - [Learn how to use Delta Lake in Apache Spark pools for Azure Synapse Analytics](../spark/apache-spark-delta-lake-overview.md)
 - [Azure Databricks Delta Lake best practices](/azure/databricks/delta/best-practices)