articles/synapse-analytics/get-started-analyze-storage.md (5 additions, 9 deletions)
@@ -7,21 +7,19 @@ ms.reviewer: whhender
 ms.service: azure-synapse-analytics
 ms.subservice: workspace
 ms.topic: tutorial
-ms.date: 11/18/2022
+ms.date: 12/17/2024
 ---

-# Analyze data in a storage account
+# Tutorial: Analyze data in a storage account

 In this tutorial, you'll learn how to analyze data located in a storage account.

-## Overview
-
 So far, we've covered scenarios where data resides in databases in the workspace. Now we'll show you how to work with files in storage accounts. In this scenario, we'll use the primary storage account of the workspace and container that we specified when creating the workspace.

 * The name of the storage account: **contosolake**
 * The name of the container in the storage account: **users**

-### Create CSV and Parquet files in your storage account
+## Create CSV and Parquet files in your storage account

 Run the following code in a notebook in a new code cell. It creates a CSV file and a parquet file in the storage account.
 You can analyze the data in your workspace default Azure Data Lake Storage (ADLS) Gen2 account or you can link an ADLS Gen2 or Blob storage account to your workspace through "**Manage**" > "**Linked Services**" > "**New**" (The next steps will refer to the primary ADLS Gen2 account).
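The notebook cell referenced here isn't shown in the diff. A minimal sketch of such a cell, assuming the tutorial's **contosolake** storage account and **users** container, with illustrative folder names, writes a small DataFrame out once as CSV and once as Parquet:

```python
%%pyspark
# Hypothetical sample data; the tutorial's actual cell may differ.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# Primary ADLS Gen2 account and container from this tutorial; folder names are illustrative.
base = "abfss://users@contosolake.dfs.core.windows.net/NYCTaxi"

# Write the same DataFrame in both formats.
df.write.mode("overwrite").option("header", "true").csv(base + "/csv")
df.write.mode("overwrite").parquet(base + "/parquet")
```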
@@ -69,9 +67,7 @@ You can analyze the data in your workspace default Azure Data Lake Storage (ADLS

 1. Run the script.

-
-
-## Next steps
+## Next step

 > [!div class="nextstepaction"]
 > [Orchestrate activities with pipelines](get-started-pipelines.md)
articles/synapse-analytics/get-started-visualize-power-bi.md (15 additions, 10 deletions)
@@ -4,13 +4,13 @@ description: In this tutorial, you learn how to use Power BI to visualize data i
 author: whhender
 ms.author: whhender
 ms.reviewer: whhender
-ms.date: 10/16/2023
+ms.date: 12/16/2024
 ms.service: azure-synapse-analytics
 ms.subservice: business-intelligence
 ms.topic: tutorial
 ---

-# Visualize data with Power BI
+# Tutorial: Visualize data with Power BI

 In this tutorial, you learn how to create a Power BI workspace, link your Azure Synapse workspace, and create a Power BI data set that utilizes data in your Azure Synapse workspace.

@@ -23,24 +23,29 @@ From the NYC Taxi data, we created aggregated datasets in two tables:
 - **nyctaxi.passengercountstats**
 - **SQLDB1.dbo.PassengerCountStats**

-You can link a Power BI workspace to your Azure Synapse workspace. This capability allows you to easily get data into your Power BI workspace. You can edit your Power BI reports directly in your Azure Synapse workspace.
+You can link a Power BI workspace to your Azure Synapse workspace. This capability allows you to easily get data into your Power BI workspace. You can edit your Power BI reports directly in your Azure Synapse workspace.

-### Create a Power BI workspace
+## Create a Power BI workspace

 1. Sign in to [powerbi.microsoft.com](https://powerbi.microsoft.com/).
 1. Select **Workspaces**, then select **Create a workspace**. Create a new Power BI workspace named **NYCTaxiWorkspace1** or similar, since this name must be unique.

-### Link your Azure Synapse workspace to your new Power BI workspace
+## Link your Azure Synapse workspace to your new Power BI workspace

 1. In Synapse Studio, go to **Manage** > **Linked Services**.
 1. Select **New** > **Connect to Power BI**.
 1. Set **Name** to **NYCTaxiWorkspace1** or similar.
 1. Set **Workspace name** to the Power BI workspace you created earlier, similar to **NYCTaxiWorkspace1**.
+
+   > [!TIP]
+   > If the workspace name doesn't load, select **Edit** and then enter your workspace ID. You can find the ID in the URL for the PowerBI workspace: `https://msit.powerbi.com/groups/<workspace id>/`
+
 1. Select **Create**.
+1. Publish to create the linked service.

-### Create a Power BI dataset that uses data in your Azure Synapse workspace
+## Create a Power BI dataset that uses data in your Azure Synapse workspace

-1. In Synapse Studio, go to **Develop** > **Power BI**.
+1. In Synapse Studio, go to **Develop** > **Power BI**. (If you don't see Power BI, refresh the page.)
 1. Go to **NYCTaxiWorkspace1** > **Power BI datasets** and select **New Power BI dataset**. Select **Start**.
 1. Select the **SQLPOOL1** data source, select **Continue**.
 1. Select **Download** to download the `.pbids` file for your `NYCTaxiWorkspace1SQLPOOL1.pbids` file. Select **Continue**.
@@ -59,17 +64,17 @@ You can link a Power BI workspace to your Azure Synapse workspace. This capabili
 1. Select **Save** to save your changes.
 1. Choose the file name `PassengerAnalysis.pbix`, and then select **Save**.
 1. In the **Publish to Power BI** window, under **Select a destination**, choose your `NYCTaxiWorkspace1`, and then select **Select**.
-1. Wait for publishing to finish.
+1. Wait for publishing to finish.

-### Configure authentication for your dataset
+## Configure authentication for your dataset

 1. Open [powerbi.microsoft.com](https://powerbi.microsoft.com/) and **Sign in**.
 1. On the left side, under **Workspaces**, select the **NYCTaxiWorkspace1** workspace.
 1. Inside that workspace, locate a dataset called **Passenger Analysis** and a report called **Passenger Analysis**.
 1. Hover over the **PassengerAnalysis** dataset, select the ellipsis (...) button, and then select **Settings**.
 1. In **Data source credentials**, select **Edit**, set the **Authentication method** to **OAuth2**, and then select **Sign in**.

-### Edit a report in Synapse Studio
+## Edit a report in Synapse Studio

 1. Go back to Synapse Studio and select **Close and refresh**.
articles/synapse-analytics/quickstart-create-workspace.md (7 additions, 7 deletions)
@@ -5,21 +5,21 @@ author: whhender
 ms.service: azure-synapse-analytics
 ms.topic: quickstart
 ms.subservice: workspace
-ms.date: 03/23/2022
+ms.date: 12/16/2024
 ms.author: whhender
 ms.reviewer: whhender
 ms.custom: subject-rbac-steps, mode-other
 ---

 # Quickstart: Create an Azure Synapse Analytics workspace

-This quickstart describes the steps to create an Azure Synapse Analytics workspace by using the Azure portal.
+This quickstart describes the steps to create an Azure Synapse Analytics workspace using the Azure portal.

 ## Create an Azure Synapse Analytics workspace

 1. Open the [Azure portal](https://portal.azure.com), and at the top, search for **Synapse**.
 1. In the search results, under **Services**, select **Azure Synapse Analytics**.
-1. Select **Add** to create a workspace.
+1. Select **Create** to create a workspace.
 1. On the **Basics** tab, give the workspace a unique name. We use **mysworkspace** in this document.
 1. You need an Azure Data Lake Storage Gen2 account to create a workspace. The simplest choice is to create a new one. If you want to reuse an existing one, you need to perform extra configuration:
67
 1. Select **Access control (IAM)**.
 1. Select **Add** > **Add role assignment** to open the **Add role assignment** page.
 1. Assign the following role. For more information, see [Assign Azure roles by using the Azure portal](../role-based-access-control/role-assignments-portal.yml).
-
+

 | Setting | Value |
 | --- | --- |
 | Role | Storage Blob Data Contributor |
@@ -82,6 +82,6 @@ Managed identities for your Azure Synapse Analytics workspace might already have

 ## Related content

-* [Create a dedicated SQL pool](quickstart-create-sql-pool-studio.md)
-* [Create a serverless Apache Spark pool](quickstart-create-apache-spark-pool-portal.md)
-* [Use a serverless SQL pool](quickstart-sql-on-demand.md)
+- [Create a dedicated SQL pool](quickstart-create-sql-pool-studio.md)
+- [Create a serverless Apache Spark pool](quickstart-create-apache-spark-pool-portal.md)
+- [Use a serverless SQL pool](quickstart-sql-on-demand.md)
articles/synapse-analytics/sql/develop-tables-data-types.md (9 additions, 9 deletions)
@@ -4,10 +4,10 @@ description: Recommendations for defining table data types in Synapse SQL.
 author: filippopovic
 ms.author: fipopovi
 ms.reviewer: whhender
-ms.date: 04/15/2020
+ms.date: 12/17/2024
 ms.service: azure-synapse-analytics
 ms.subservice: sql
-ms.topic: conceptual
+ms.topic: concept-article
 ---

 # Table data types in Synapse SQL
@@ -16,7 +16,7 @@ In this article, you'll find recommendations for defining table data types in Sy

 ## Data types

-Synapse SQL Dedicated Pool supports the most commonly used data types. For a list of the supported data types, see [data types](/sql/t-sql/statements/create-table-azure-sql-data-warehouse#DataTypes&preserve-view=true) in the CREATE TABLE statement. For Synapse SQL Serverless please refer to article [Query storage files with serverless SQL pool in Azure Synapse Analytics](./query-data-storage.md) and [How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics](./develop-openrowset.md)
+Synapse SQL Dedicated Pool supports the most commonly used data types. For a list of the supported data types, see [data types](/sql/t-sql/statements/create-table-azure-sql-data-warehouse#DataTypes&preserve-view=true) in the CREATE TABLE statement. For Synapse SQL Serverless, refer to article [Query storage files with serverless SQL pool in Azure Synapse Analytics](./query-data-storage.md) and [How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics](./develop-openrowset.md)

 ## Minimize row length

@@ -25,14 +25,14 @@ Minimizing the size of data types shortens the row length, which leads to better
 - Avoid defining character columns with a large default length. For example, if the longest value is 25 characters, then define your column as VARCHAR(25).
 - Avoid using NVARCHAR when you only need VARCHAR.
 - When possible, use NVARCHAR(4000) or VARCHAR(8000) instead of NVARCHAR(MAX) or VARCHAR(MAX).
-- Avoid using floats and decimals with 0 (zero) scale. These should be TINYINT, SMALLINT, INT or BIGINT.
+- Avoid using floats and decimals with 0 (zero) scale. These should be TINYINT, SMALLINT, INT, or BIGINT.

 > [!NOTE]
 > If you are using PolyBase external tables to load your Synapse SQL tables, the defined length of the table row cannot exceed 1 MB. When a row with variable-length data exceeds 1 MB, you can load the row with BCP, but not with PolyBase.

 ## Identify unsupported data types

-If you are migrating your database from another SQL database, you might encounter data types that are not supported in Synapse SQL. Use this query to discover unsupported data types in your existing SQL schema.
+If you're migrating your database from another SQL database, you might encounter data types that aren't supported in Synapse SQL. Use this query to discover unsupported data types in your existing SQL schema.
@@ -45,7 +45,7 @@ WHERE y.[name] IN ('geography','geometry','hierarchyid','image','text','ntext','

 ## <a name="unsupported-data-types"></a>Workarounds for unsupported data types

-The following list shows the data types that Synapse SQL does not support and gives alternatives that you can use instead of the unsupported data types.
+The following list shows the data types that Synapse SQL doesn't support and gives alternatives that you can use instead of the unsupported data types.

 | Unsupported data type | Workaround |
 | --- | --- |
@@ -57,11 +57,11 @@ The following list shows the data types that Synapse SQL does not support and gi
 |[sql_variant](/sql/t-sql/data-types/sql-variant-transact-sql?view=azure-sqldw-latest&preserve-view=true)|Split column into several strongly typed columns. |
 |[table](/sql/t-sql/data-types/table-transact-sql?view=azure-sqldw-latest&preserve-view=true)|Convert to temporary tables or consider storing data to storage using [CETAS](../sql/develop-tables-cetas.md). |
-|[timestamp](/sql/t-sql/data-types/date-and-time-types)|Rework code to use [datetime2](/sql/t-sql/data-types/datetime2-transact-sql?view=azure-sqldw-latest&preserve-view=true) and the [CURRENT_TIMESTAMP](/sql/t-sql/functions/current-timestamp-transact-sql?view=azure-sqldw-latest&preserve-view=true) function. Only constants are supported as defaults, therefore current_timestamp cannot be defined as a default constraint. If you need to migrate row version values from a timestamp typed column, then use [BINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) or [VARBINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) for NOT NULL or NULL row version values. |
+|[timestamp](/sql/t-sql/data-types/date-and-time-types)|Rework code to use [datetime2](/sql/t-sql/data-types/datetime2-transact-sql?view=azure-sqldw-latest&preserve-view=true) and the [CURRENT_TIMESTAMP](/sql/t-sql/functions/current-timestamp-transact-sql?view=azure-sqldw-latest&preserve-view=true) function. Only constants are supported as defaults, therefore current_timestamp can't be defined as a default constraint. If you need to migrate row version values from a timestamp typed column, then use [BINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) or [VARBINARY](/sql/t-sql/data-types/binary-and-varbinary-transact-sql?view=azure-sqldw-latest&preserve-view=true)(8) for NOT NULL or NULL row version values. |
articles/synapse-analytics/sql/query-delta-lake-format.md (7 additions, 9 deletions)
@@ -5,7 +5,7 @@ services: synapse analytics
 ms.service: azure-synapse-analytics
 ms.topic: how-to
 ms.subservice: sql
-ms.date: 02/15/2023
+ms.date: 12/17/2024
 author: jovanpop-msft
 ms.author: jovanpop
 ms.reviewer: whhender, wiassaf
@@ -23,7 +23,7 @@ You can learn more from the [how to query delta lake tables video](https://www.y
 The serverless SQL pool in Synapse workspace enables you to read the data stored in Delta Lake format, and serve it to reporting tools.
 A serverless SQL pool can read Delta Lake files that are created using Apache Spark, Azure Databricks, or any other producer of the Delta Lake format.

-Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.
+Apache Spark pools in Azure Synapse enable data engineers to modify Delta Lake files using Scala, PySpark, and .NET. Serverless SQL pools help data analysts to create reports on Delta Lake files created by data engineers.

 > [!IMPORTANT]
 > Querying Delta Lake format using the serverless SQL pool is **Generally available** functionality. However, querying Spark Delta tables is still in public preview and not production ready. There are known issues that might happen if you query Delta tables created using the Spark pools. See the known issues in [Serverless SQL pool self-help](resources-self-help-sql-on-demand.md#delta-lake).
@@ -50,7 +50,7 @@ The URI in the `OPENROWSET` function must reference the root Delta Lake folder t
 > [!div class="mx-imgBorder"]
 >

-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:

 ```python
 %%pyspark
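 # The rest of this cell isn't shown in the diff. A minimal sketch of such a
 # conversion -- assuming the delta-spark package available on Synapse Spark
 # pools, and an illustrative folder path -- might be:
 from delta.tables import DeltaTable

 DeltaTable.convertToDelta(spark, "parquet.`abfss://users@contosolake.dfs.core.windows.net/yellow`")
 ```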
@@ -64,7 +64,7 @@ To improve the performance of your queries, consider specifying explicit types i
 > The serverless Synapse SQL pool uses schema inference to automatically determine columns and their types. The rules for schema inference are the same used for Parquet files.
 > For Delta Lake type mapping to SQL native type check [type mapping for Parquet](develop-openrowset.md#type-mapping-for-parquet).

-Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you will need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).
+Make sure you can access your file. If your file is protected with SAS key or custom Azure identity, you'll need to set up a [server level credential for sql login](develop-storage-files-storage-access-control.md?tabs=shared-access-signature#server-level-credential).

 > [!IMPORTANT]
 > Ensure you are using a UTF-8 database collation (for example `Latin1_General_100_BIN2_UTF8`) because string values in Delta Lake files are encoded using UTF-8 encoding.
@@ -80,7 +80,7 @@ The previous examples used the full path to the file. As an alternative, you can
 > [!IMPORTANT]
 > Data sources can be created only in custom databases (not in the master database or the databases replicated from Apache Spark pools).

-To use the samples below, you will need to complete the following step:
+To use the samples below, you'll need to complete the following step:

 1. **Create a database** with a datasource that references [NYC Yellow Taxi](https://azure.microsoft.com/services/open-datasets/catalog/nyc-taxi-limousine-commission-yellow-taxi-trip-records/) storage account.
 1. Initialize the objects by executing [setup script](https://github.com/Azure-Samples/Synapse/blob/master/SQL/Samples/LdwSample/SampleDB.sql) on the database you created in step 1. This setup script will create the data sources, database scoped credentials, and external file formats that are used in these samples.
@@ -172,7 +172,7 @@ The folder name in the `OPENROWSET` function (`yellow` in this example) is conca
 > [!div class="mx-imgBorder"]
 >

-If you don't have this subfolder, you are not using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:
+If you don't have this subfolder, you aren't using Delta Lake format. You can convert your plain Parquet files in the folder to Delta Lake format using the following Apache Spark Python script:

 ```python
 %%pyspark
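 # Elided here as well. A sketch of the partitioned variant -- the partition
 # schema argument describes the year/month folder partitioning, and the
 # path is illustrative:
 from delta.tables import DeltaTable

 DeltaTable.convertToDelta(spark, "parquet.`abfss://users@contosolake.dfs.core.windows.net/yellow`", "year INT, month INT")
 ```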
@@ -186,13 +186,11 @@ The second argument of `DeltaTable.convertToDeltaLake` function represents the p

 - Review the limitations and the known issues on [Synapse serverless SQL pool self-help page](resources-self-help-sql-on-demand.md#delta-lake).

-## Next steps
+## Related content

 Advance to the next article to learn how to [Query Parquet nested types](query-parquet-nested-types.md).
 If you want to continue building Delta Lake solution, learn how to create [views](create-use-views.md#delta-lake-views) or [external tables](create-use-external-tables.md#delta-lake-external-table) on the Delta Lake folder.

-## See also
-
 - [What is Delta Lake](../spark/apache-spark-what-is-delta-lake.md)
 - [Learn how to use Delta Lake in Apache Spark pools for Azure Synapse Analytics](../spark/apache-spark-delta-lake-overview.md)
 - [Azure Databricks Delta Lake best practices](/azure/databricks/delta/best-practices)