Commit 7bd9a34

Merge branch 'MicrosoftDocs:main' into Artifact-streaming
2 parents 5b397bc + 2bfd129 commit 7bd9a34

File tree: 6 files changed (+57, -43 lines)

articles/stream-analytics/no-code-stream-processing.md

Lines changed: 1 addition & 1 deletion
@@ -165,7 +165,7 @@ The no-code editor now supports two reference data sources:

Reference data is modeled as a sequence of blobs in ascending order of the date/time combination specified in the blob name. You can add blobs to the end of the sequence only by using a date/time greater than the one that the last blob specified in the sequence. Blobs are defined in the input configuration.

- First, under the **Inputs** section on the ribbon, select **Reference ADLS Gen2**. To see details about each field, see the section about Azure Blob Storage in [Use reference data for lookups in Stream Analytics](stream-analytics-use-reference-data.md#azure-blob-storage).
+ First, under the **Inputs** section on the ribbon, select **Reference ADLS Gen2**. To see details about each field, see the section about Azure Blob Storage in [Use reference data for lookups in Stream Analytics](stream-analytics-use-reference-data.md#azure-blob-storage-or-azure-data-lake-storage-gen-2).

![Screenshot that shows fields for configuring Azure Data Lake Storage Gen2 as input in the no-code editor.](./media/no-code-stream-processing/blob-referencedata-nocode.png)

articles/stream-analytics/repartition.md

Lines changed: 25 additions & 11 deletions
@@ -4,7 +4,7 @@ description: This article describes how to use repartitioning to optimize Azure
ms.service: stream-analytics
author: ahartoon
ms.author: anboisve
- ms.date: 12/21/2022
+ ms.date: 02/26/2024
ms.topic: conceptual
ms.custom: mvc
---
@@ -27,6 +27,7 @@ You can repartition your input in two ways:

### Creating a separate Stream Analytics job to repartition input
You can create a job that reads input and writes to an event hub output using a partition key. This event hub can then serve as input for another Stream Analytics job where you implement your analytics logic. When configuring this event hub output in your job, you must specify the partition key by which Stream Analytics will repartition your data.
+
```sql
-- For compat level 1.2 or higher
SELECT *
@@ -40,12 +41,13 @@ FROM input PARTITION BY PartitionId
```

### Repartition input within a single Stream Analytics job
- You can also introduce a step in your query that first repartitions the input and this can then be used by other steps in your query. For example, if you want to repartition input based on **DeviceId**, your query would be:
+ You can also introduce a step in your query that first repartitions the input, which can then be used by other steps in your query. For example, if you want to repartition input based on **DeviceId**, your query would be:
+
```sql
WITH RepartitionedInput AS
(
- SELECT *
- FROM input PARTITION BY DeviceID
+     SELECT *
+     FROM input PARTITION BY DeviceID
)

SELECT DeviceID, AVG(Reading) as AvgNormalReading
@@ -54,13 +56,23 @@ FROM RepartitionedInput
GROUP BY DeviceId, TumblingWindow(minute, 1)
```

- The following example query joins two streams of repartitioned data. When joining two streams of repartitioned data, the streams must have the same partition key and count. The outcome is a stream that has the same partition scheme.
+ The following example query joins two streams of repartitioned data. When you join two streams of repartitioned data, the streams must have the same partition key and count. The outcome is a stream that has the same partition scheme.

```sql
- WITH step1 AS (SELECT * FROM input1 PARTITION BY DeviceID),
- step2 AS (SELECT * FROM input2 PARTITION BY DeviceID)
+ WITH step1 AS
+ (
+     SELECT * FROM input1
+     PARTITION BY DeviceID
+ ),
+ step2 AS
+ (
+     SELECT * FROM input2
+     PARTITION BY DeviceID
+ )

- SELECT * INTO output FROM step1 PARTITION BY DeviceID UNION step2 PARTITION BY DeviceID
+ SELECT * INTO output
+ FROM step1 PARTITION BY DeviceID
+ UNION step2 PARTITION BY DeviceID
```

The output scheme should match the stream scheme key and count so that each substream can be flushed independently. The stream could also be merged and repartitioned again by a different scheme before flushing, but you should avoid that method because it adds to the general latency of the processing and increases resource utilization.
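The same-key-and-same-count requirement in the passage above can be sketched outside ASA. The following Python sketch (hypothetical event dictionaries, not ASA runtime code) hashes both streams by `DeviceID` into the same number of partitions, so every matching pair of events lands in one partition and the join never crosses substreams:

```python
import zlib
from collections import defaultdict

def repartition(events, key, count):
    """Group events into `count` substreams by a stable hash of the key."""
    parts = defaultdict(list)
    for e in events:
        parts[zlib.crc32(str(e[key]).encode()) % count].append(e)
    return parts

stream1 = [{"DeviceID": d, "Reading": r} for d, r in [("a", 1), ("b", 2), ("a", 3)]]
stream2 = [{"DeviceID": d, "Status": s} for d, s in [("a", "ok"), ("b", "warn")]]

COUNT = 4  # both streams must use the same partition count
p1 = repartition(stream1, "DeviceID", COUNT)
p2 = repartition(stream2, "DeviceID", COUNT)

# Matching keys always share a partition index, so the join is partition-local.
joined = [
    (e1, e2)
    for i in range(COUNT)
    for e1 in p1.get(i, [])
    for e2 in p2.get(i, [])
    if e1["DeviceID"] == e2["DeviceID"]
]
print(len(joined))  # 3: the partition-local join finds every matching pair
```

If the two streams used different partition counts, the same `DeviceID` could map to different partition indexes in each stream, and a partition-local join would miss pairs.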
@@ -71,14 +83,16 @@ Experiment and observe the resource usage of your job to determine the exact num

## Repartitions for SQL output

- When your job uses SQL database for output, use explicit repartitioning to match the optimal partition count to maximize throughput. Since SQL works best with eight writers, repartitioning the flow to eight before flushing, or somewhere further upstream, may benefit job performance.
+ When your job uses SQL database for output, use explicit repartitioning to match the optimal partition count to maximize throughput. Since SQL works best with eight writers, repartitioning the flow to eight before flushing, or somewhere further upstream, might benefit job performance.

When there are more than eight input partitions, inheriting the input partitioning scheme might not be an appropriate choice. Consider using [INTO](/stream-analytics-query/into-azure-stream-analytics#into-shard-count) in your query to explicitly specify the number of output writers.

The following example reads from the input, regardless of it being naturally partitioned, and repartitions the stream tenfold according to the DeviceID dimension and flushes the data to output.

```sql
- SELECT * INTO [output] FROM [input] PARTITION BY DeviceID INTO 10
+ SELECT * INTO [output]
+ FROM [input]
+ PARTITION BY DeviceID INTO 10
```

For more information, see [Azure Stream Analytics output to Azure SQL Database](stream-analytics-sql-output-perf.md).
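The writer-count guidance above can be illustrated with a short Python sketch (a hypothetical model of the output stage, not the ASA runtime): events are assigned to a fixed number of writer buckets by a stable hash of `DeviceID`, so each writer can bulk-flush its own batch independently.

```python
import zlib
from collections import defaultdict

WRITERS = 8  # match the writer count SQL output handles best

def assign_writer(device_id, writers=WRITERS):
    """Stable key-to-bucket assignment, analogous to PARTITION BY DeviceID INTO n."""
    return zlib.crc32(str(device_id).encode()) % writers

batches = defaultdict(list)
for i in range(1000):
    event = {"DeviceID": f"dev-{i % 50}", "Reading": i}
    batches[assign_writer(event["DeviceID"])].append(event)

# Every event lands in exactly one batch, and a given DeviceID never spans writers.
print(sum(len(b) for b in batches.values()))  # 1000
```

Raising the bucket count (for example, `INTO 10` in the query above) trades fewer events per batch for more parallel writers; the right number depends on the target database.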
@@ -87,4 +101,4 @@ For more information, see [Azure Stream Analytics output to Azure SQL Database](

## Next steps

* [Get started with Azure Stream Analytics](stream-analytics-introduction.md)
- * [Leverage query parallelization in Azure Stream Analytics](stream-analytics-parallelization.md)
+ * [Use query parallelization in Azure Stream Analytics](stream-analytics-parallelization.md)

articles/stream-analytics/sql-database-upsert.md

Lines changed: 17 additions & 17 deletions
@@ -3,12 +3,12 @@ title: Update or merge records in Azure SQL Database with Azure Functions
description: This article describes how to use Azure Functions to update or merge records from Azure Stream Analytics to Azure SQL Database
ms.service: stream-analytics
ms.topic: how-to
- ms.date: 12/03/2021
+ ms.date: 02/27/2024
---

# Update or merge records in Azure SQL Database with Azure Functions

- Currently, [Azure Stream Analytics](./index.yml) (ASA) only supports inserting (appending) rows to SQL outputs ([Azure SQL Databases](./sql-database-output.md), and [Azure Synapse Analytics](./azure-synapse-analytics-output.md)). This article discusses workarounds to enable UPDATE, UPSERT, or MERGE on SQL databases, with Azure Functions as the intermediary layer.
+ Currently, [Azure Stream Analytics](./index.yml) (ASA) supports only inserting (appending) rows to SQL outputs ([Azure SQL Databases](./sql-database-output.md), and [Azure Synapse Analytics](./azure-synapse-analytics-output.md)). This article discusses workarounds to enable UPDATE, UPSERT, or MERGE on SQL databases, with Azure Functions as the intermediary layer.

Alternative options to Azure Functions are presented at the end.
@@ -22,14 +22,14 @@ Writing data in a table can generally be done in the following manner:
|Replace|[MERGE](/sql/t-sql/statements/merge-transact-sql) (UPSERT)|Unique key|
|Accumulate|MERGE (UPSERT) with compound assignment [operator](/sql/t-sql/queries/update-transact-sql#arguments) (`+=`, `-=`...)|Unique key and accumulator|

- To illustrate the differences, we can look at what happens when ingesting the following two records:
+ To illustrate the differences, look at what happens when ingesting the following two records:

|Arrival_Time|Device_Id|Measure_Value|
|-|-|-|
|10:00|A|1|
|10:05|A|20|

- In **append** mode, we insert the two records. The equivalent T-SQL statement is:
+ In the **append** mode, we insert two records. The equivalent T-SQL statement is:

```SQL
INSERT INTO [target] VALUES (...);
@@ -42,7 +42,7 @@ Resulting in:
|10:00|A|1|
|10:05|A|20|

- In **replace** mode, we get only the last value by key. Here we will use **Device_Id as the key.** The equivalent T-SQL statement is:
+ In **replace** mode, we get only the last value by key. Here we use **Device_Id as the key.** The equivalent T-SQL statement is:

```SQL
MERGE INTO [target] t
@@ -65,7 +65,7 @@ Resulting in:
|-|-|-|
|10:05|A|20|

- Finally, in **accumulate** mode we sum `Value` with a compound assignment operator (`+=`). Here also we will use Device_Id as the key:
+ Finally, in **accumulate** mode we sum `Value` with a compound assignment operator (`+=`). Here also we use Device_Id as the key:

```SQL
MERGE INTO [target] t
@@ -90,15 +90,15 @@ Resulting in:

For **performance** considerations, the ASA SQL database output adapters currently only support append mode natively. These adapters use bulk insert to maximize throughput and limit back pressure.

- This article shows how to use Azure Functions to implement Replace and Accumulate modes for ASA. By using a function as an intermediary layer, the potential write performance won't affect the streaming job. In this regard, using Azure Functions will work best with Azure SQL. With Synapse SQL, switching from bulk to row-by-row statements may create greater performance issues.
+ This article shows how to use Azure Functions to implement Replace and Accumulate modes for ASA. When you use a function as an intermediary layer, the potential write performance won't affect the streaming job. In this regard, using Azure Functions works best with Azure SQL. With Synapse SQL, switching from bulk to row-by-row statements might create greater performance issues.

## Azure Functions Output

- In our job, we'll replace the ASA SQL output by the [ASA Azure Functions output](./azure-functions-output.md). The UPDATE, UPSERT, or MERGE capabilities will be implemented in the function.
+ In our job, we replace the ASA SQL output by the [ASA Azure Functions output](./azure-functions-output.md). The UPDATE, UPSERT, or MERGE capabilities are implemented in the function.

There are currently two options to access a SQL Database in a function. First is the [Azure SQL output binding](../azure-functions/functions-bindings-azure-sql.md). It's currently limited to C#, and only offers replace mode. Second is to compose a SQL query to be submitted via the appropriate [SQL driver](/sql/connect/sql-connection-libraries) ([Microsoft.Data.SqlClient](https://github.com/dotnet/SqlClient) for .NET).

- For both samples below, we'll assume the following table schema. The binding option requires **a primary key** to be set on the target table. It's not necessary, but recommended, when using a SQL driver.
+ For both the following samples, we assume the following table schema. The binding option requires **a primary key** to be set on the target table. It's not necessary, but recommended, when using a SQL driver.

```SQL
CREATE TABLE [dbo].[device_updated](
@@ -130,7 +130,7 @@ This sample was built on:

To better understand the binding approach, it's recommended to follow [this tutorial](https://github.com/Azure/azure-functions-sql-extension#quick-start).

- First, create a default HttpTrigger function app by following this [tutorial](../azure-functions/create-first-function-vs-code-csharp.md?tabs=in-process). The following information will be used:
+ First, create a default HttpTrigger function app by following this [tutorial](../azure-functions/create-first-function-vs-code-csharp.md?tabs=in-process). The following information is used:

- Language: `C#`
- Runtime: `.NET 6` (under function/runtime v4)
@@ -233,7 +233,7 @@ Update the `Device` class and mapping section to match your own schema:
public DateTime Timestamp { get; set; }
```

- You can now test the wiring between the local function and the database by debugging (F5 in VS Code). The SQL database needs to be reachable from your machine. [SSMS](/sql/ssms/sql-server-management-studio-ssms) can be used to check connectivity. Then a tool like [Postman](https://www.postman.com/) can be used to issue POST requests to the local endpoint. A request with an empty body should return http 204. A request with an actual payload should be persisted in the destination table (in replace / update mode). Here's a sample payload corresponding to the schema used in this sample:
+ You can now test the wiring between the local function and the database by debugging (F5 in Visual Studio Code). The SQL database needs to be reachable from your machine. [SSMS](/sql/ssms/sql-server-management-studio-ssms) can be used to check connectivity. Then a tool like [Postman](https://www.postman.com/) can be used to issue POST requests to the local endpoint. A request with an empty body should return http 204. A request with an actual payload should be persisted in the destination table (in replace / update mode). Here's a sample payload corresponding to the schema used in this sample:

```JSON
[{"DeviceId":3,"Value":13.4,"Timestamp":"2021-11-30T03:22:12.991Z"},{"DeviceId":4,"Value":41.4,"Timestamp":"2021-11-30T03:22:12.991Z"}]
@@ -256,7 +256,7 @@ This sample was built on:
- [.NET 6.0](/dotnet/core/whats-new/dotnet-6)
- Microsoft.Data.SqlClient [4.0.0](https://www.nuget.org/packages/Microsoft.Data.SqlClient/)

- First, create a default HttpTrigger function app by following this [tutorial](../azure-functions/create-first-function-vs-code-csharp.md?tabs=in-process). The following information will be used:
+ First, create a default HttpTrigger function app by following this [tutorial](../azure-functions/create-first-function-vs-code-csharp.md?tabs=in-process). The following information is used:

- Language: `C#`
- Runtime: `.NET 6` (under function/runtime v4)
@@ -371,11 +371,11 @@ The function can then be defined as an output in the ASA job, and used to replac

## Alternatives

- Outside of Azure Functions, there are multiple ways to achieve the expected result. We'll mention the most likely solutions below.
+ Outside of Azure Functions, there are multiple ways to achieve the expected result. This section provides some of them.

### Post-processing in the target SQL Database

- A background task will operate once the data is inserted in the database via the standard ASA SQL outputs.
+ A background task operates once the data is inserted in the database via the standard ASA SQL outputs.

For Azure SQL, `INSTEAD OF` [DML triggers](/sql/relational-databases/triggers/dml-triggers?view=azuresqldb-current&preserve-view=true) can be used to intercept the INSERT commands issued by ASA:
@@ -402,13 +402,13 @@ END;

For Synapse SQL, ASA can insert into a [staging table](../synapse-analytics/sql/data-loading-best-practices.md#load-to-a-staging-table). A recurring task can then transform the data as needed into an intermediary table. Finally the [data is moved](../synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition.md#partition-switching) to the production table.

- ### Pre-processing in Azure Cosmos DB
+ ### Preprocessing in Azure Cosmos DB

Azure Cosmos DB [supports UPSERT natively](./stream-analytics-documentdb-output.md#upserts-from-stream-analytics). Here only append/replace is possible. Accumulations must be managed client-side in Azure Cosmos DB.

If the requirements match, an option is to replace the target SQL database by an Azure Cosmos DB instance. Doing so requires an important change in the overall solution architecture.

- For Synapse SQL, Azure Cosmos DB can be used as an intermediary layer via [Azure Synapse Link for Azure Cosmos DB](../cosmos-db/synapse-link.md). Synapse Link can be used to create an [analytical store](../cosmos-db/analytical-store-introduction.md). This data store can then be queried directly in Synapse SQL.
+ For Synapse SQL, Azure Cosmos DB can be used as an intermediary layer via [Azure Synapse Link for Azure Cosmos DB](../cosmos-db/synapse-link.md). Azure Synapse Link can be used to create an [analytical store](../cosmos-db/analytical-store-introduction.md). This data store can then be queried directly in Synapse SQL.

### Comparison of the alternatives

@@ -422,7 +422,7 @@ Each approach offers different value proposition and capabilities:
|Pre-Processing|||||
||Azure Functions|Replace, Accumulate|+|- (row-by-row performance)|
||Azure Cosmos DB replacement|Replace|N/A|N/A|
- ||Azure Cosmos DB Synapse Link|Replace|N/A|+|
+ ||Azure Cosmos DB Azure Synapse Link|Replace|N/A|+|

## Get support

articles/stream-analytics/stream-analytics-add-inputs.md

Lines changed: 4 additions & 4 deletions
@@ -5,7 +5,7 @@ ms.service: stream-analytics
author: enkrumah
ms.author: ebnkruma
ms.topic: conceptual
- ms.date: 02/28/2023
+ ms.date: 02/26/2024
---
# Understand inputs for Azure Stream Analytics

@@ -31,14 +31,14 @@ As data is pushed to a data source, it's consumed by the Stream Analytics job an
- Reference data inputs.

### Data stream input
- A data stream is an unbounded sequence of events over time. Stream Analytics jobs must include at least one data stream input. Event Hubs, IoT Hub, Azure Data Lake Storage Gen2 and Blob storage are supported as data stream input sources. Event Hubs is used to collect event streams from multiple devices and services. These streams might include social media activity feeds, stock trade information, or data from sensors. IoT Hubs are optimized to collect data from connected devices in Internet of Things (IoT) scenarios. Blob storage can be used as an input source for ingesting bulk data as a stream, such as log files.
+ A data stream is an unbounded sequence of events over time. Stream Analytics jobs must include at least one data stream input. Event Hubs, IoT Hub, Azure Data Lake Storage Gen2, and Blob storage are supported as data stream input sources. Event Hubs is used to collect event streams from multiple devices and services. These streams might include social media activity feeds, stock trade information, or data from sensors. IoT Hubs are optimized to collect data from connected devices in Internet of Things (IoT) scenarios. Blob storage can be used as an input source for ingesting bulk data as a stream, such as log files.

- For more information about streaming data inputs, see [Stream data as input into Stream Analytics](stream-analytics-define-inputs.md)
+ For more information about streaming data inputs, see [Stream data as input into Stream Analytics](stream-analytics-define-inputs.md).

### Reference data input
Stream Analytics also supports input known as *reference data*. Reference data is either completely static or changes slowly. It's typically used to perform correlation and lookups. For example, you might join data in the data stream input to data in the reference data, much as you would perform a SQL join to look up static values. Azure Blob storage, Azure Data Lake Storage Gen2, and Azure SQL Database are currently supported as input sources for reference data. Reference data source blobs have a limit of up to 300 MB in size, depending on the query complexity and allocated Streaming Units. For more information, see the [Size limitation](stream-analytics-use-reference-data.md#size-limitation) section of the reference data documentation.

- For more information about reference data inputs, see [Using reference data for lookups in Stream Analytics](stream-analytics-use-reference-data.md)
+ For more information about reference data inputs, see [Using reference data for lookups in Stream Analytics](stream-analytics-use-reference-data.md).

## Next steps
> [!div class="nextstepaction"]
