articles/azure-relay/relay-what-is-it.md (1 addition, 1 deletion)
@@ -2,7 +2,7 @@
 title: What is Azure Relay? | Microsoft Docs
 description: This article provides an overview of the Azure Relay service, which allows you to develop cloud applications that consume on-premises services running in your corporate network without opening a firewall connection or making intrusive changes to your network infrastructure.
articles/stream-analytics/stream-analytics-introduction.md (2 additions, 2 deletions)
@@ -4,7 +4,7 @@ description: Learn about Azure Stream Analytics, a managed service that helps yo
 ms.service: azure-stream-analytics
 ms.topic: overview
 ms.custom: mvc
-ms.date: 01/25/2024
+ms.date: 12/17/2024
 #Customer intent: What is Azure Stream Analytics and why should I care? As an IT Pro or developer, how do I use Stream Analytics to perform analytics on data streams?
 ---
@@ -75,7 +75,7 @@ Azure Stream Analytics guarantees exactly once event processing and at-least-onc
 Azure Stream Analytics has built-in recovery capabilities in case the delivery of an event fails. Stream Analytics also provides built-in checkpoints to maintain the state of your job and provides repeatable results.

-Azure Stream Analytics supports Availability Zones for all jobs. Any new dedicated cluster or new job will automatically benefit from Availability Zones, and, in case of disaster in a zone, will continue to run seamlessly by failing over to the other zones without the need of any user action. Availability Zones provide customers with the ability to withstand datacenter failures through redundancy and logical isolation of services. This will significantly reduce the risk of outage for your streaming pipelines. Note that Azure Stream Analytics jobs integrated with VNET don't currently support Availability Zones.
+Azure Stream Analytics supports Availability Zones for all jobs. Any new dedicated cluster or new job will automatically benefit from Availability Zones, and, in case of disaster in a zone, will continue to run seamlessly by failing over to the other zones without the need of any user action. Availability Zones provide customers with the ability to withstand datacenter failures through redundancy and logical isolation of services. This will significantly reduce the risk of outage for your streaming pipelines. Note that Azure Stream Analytics jobs integrated with virtual network don't currently support Availability Zones.

 As a managed service, Stream Analytics guarantees event processing with a 99.9% availability at a minute level of granularity.
# Use query parallelization in Azure Stream Analytics

 This article shows you how to take advantage of parallelization in Azure Stream Analytics. You learn how to scale Stream Analytics jobs by configuring input partitions and tuning the analytics query definition.
@@ -50,7 +50,7 @@ For more information about partitions, see the following articles:
 ### Query

-For a job to be parallel, partition keys need to be aligned between all inputs, all query logic steps, and all outputs. The query logic partitioning is determined by the keys used for joins and aggregations (GROUP BY). This last requirement can be ignored if the query logic isn't keyed (projection, filters, referential joins...).
+For a job to be parallel, partition keys need to be aligned between all inputs, all query logic steps, and all outputs. The query logic partitioning is determined by the keys used for joins and aggregations (GROUP BY). The last requirement can be ignored if the query logic isn't keyed (projection, filters, referential joins...).

 * If an input and an output are partitioned by `WarehouseId`, and the query groups by `ProductId` without `WarehouseId`, then the job isn't parallel.
 * If two inputs to be joined are partitioned by different partition keys (`WarehouseId` and `ProductId`), then the job isn't parallel.
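To make the alignment requirement above concrete, here is a minimal Stream Analytics query sketch in which the input, the query logic, and the output all share `WarehouseId` as the partition key; the stream aliases `Input` and `Output` and the timestamp column `EventTime` are illustrative assumptions, not names from the article:

```sql
-- Minimal sketch: input and output are both partitioned by WarehouseId,
-- and the query groups by that same key, so partitioning stays aligned
-- and the job can run in parallel. All names here are hypothetical.
SELECT
    WarehouseId,
    COUNT(*) AS EventCount
INTO Output
FROM Input TIMESTAMP BY EventTime
GROUP BY WarehouseId, TumblingWindow(minute, 5)
```

Had the GROUP BY used `ProductId` instead, the query-logic key would no longer match the input and output keys, which is exactly the first non-parallel case in the list above.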
@@ -63,9 +63,9 @@ Only when all inputs, outputs and query steps are using the same key, the job is
 An *embarrassingly parallel* job is the most scalable scenario in Azure Stream Analytics. It connects one partition of the input to one instance of the query to one partition of the output. This parallelism has the following requirements:

-- If your query logic depends on the same key being processed by the same query instance, you must make sure that the events go to the same partition of your input. For Event Hubs or IoT Hub, it means that the event data must have the **PartitionKey** value set. Alternatively, you can use partitioned senders. For blob storage, this means that the events are sent to the same partition folder. An example would be a query instance that aggregates data per userID where input event hub is partitioned using userID as partition key. However, if your query logic doesn't require the same key to be processed by the same query instance, you can ignore this requirement. An example of this logic would be a simple select-project-filter query.
+- If your query logic depends on the same key being processed by the same query instance, you must make sure that the events go to the same partition of your input. For Event Hubs or IoT Hub, it means that the event data must have the **PartitionKey** value set. Alternatively, you can use partitioned senders. For blob storage, which means that the events are sent to the same partition folder. An example would be a query instance that aggregates data per userID where input event hub is partitioned using userID as partition key. However, if your query logic doesn't require the same key to be processed by the same query instance, you can ignore this requirement. An example of this logic would be a simple select-project-filter query.
 - The next step is to make your query be partitioned. For jobs with compatibility level 1.2 or higher (recommended), custom column can be specified as Partition Key in the input settings and the job will be parallel automatically. Jobs with compatibility level 1.0 or 1.1, requires you to use **PARTITION BY PartitionId** in all the steps of your query. Multiple steps are allowed, but they all must be partitioned by the same key.
-- Most of the outputs supported in Stream Analytics can take advantage of partitioning. If you use an output type that doesn't support partitioning your job won't be *embarrassingly parallel*. For Event Hubs output, ensure **Partition key column** is set to the same partition key used in the query. For more information, see [output section](#outputs).
+- Most of the outputs supported in Stream Analytics can take advantage of partitioning. If you use an output type that doesn't support partitioning your job won't be *embarrassingly parallel*. For Event Hubs outputs, ensure **Partition key column** is set to the same partition key used in the query. For more information, see [output section](#outputs).
 - The number of input partitions must equal the number of output partitions. Blob storage output can support partitions and inherits the partitioning scheme of the upstream query. When a partition key for Blob storage is specified, data is partitioned per input partition thus the result is still fully parallel. Here are examples of partition values that allow a fully parallel job:
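The **PARTITION BY PartitionId** requirement in the list above can be sketched as follows for a job at compatibility level 1.0 or 1.1, where every step must carry the clause; the step and stream names (`Step1`, `Input1`, `Output1`) are hypothetical placeholders:

```sql
-- Minimal sketch, assuming compatibility level 1.0/1.1: each step,
-- including the intermediate WITH step, is partitioned by PartitionId,
-- so one input partition maps to one query instance and one output
-- partition. Names are illustrative, not from the article.
WITH Step1 AS (
    SELECT PartitionId, COUNT(*) AS EventCount
    FROM Input1 PARTITION BY PartitionId
    GROUP BY PartitionId, TumblingWindow(minute, 3)
)
SELECT PartitionId, EventCount
INTO Output1
FROM Step1 PARTITION BY PartitionId
```

At compatibility level 1.2 or higher, the same effect is achieved declaratively by setting the partition key in the input settings, so the explicit clause is no longer needed.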
@@ -266,7 +266,7 @@ This query can be scaled to 4 SU V2s.
 An [embarrassingly parallel](#embarrassingly-parallel-jobs) job is necessary but not sufficient to sustain a higher throughput at scale. Every storage system, and its corresponding Stream Analytics output, has variations on how to achieve the best possible write throughput. As with any at-scale scenario, there are some challenges that can be solved by using the right configurations. This section discusses configurations for a few common outputs and provides samples for sustaining ingestion rates of 1 K, 5 K, and 10 K events per second.

-The following observations use a Stream Analytics job with stateless (passthrough) query, a basic JavaScript UDF that writes to Event Hubs, Azure SQL, or Azure Cosmos DB.
+The following observations use a Stream Analytics job with stateless (passthrough) query, a basic JavaScript user defined function (UDF) that writes to Event Hubs, Azure SQL, or Azure Cosmos DB.
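As a rough illustration of what such a stateless (passthrough) query with a basic UDF call can look like, consider the sketch below; the aliases `Input` and `Output`, the column `DeviceId`, and the UDF name `udf.toUpperCase` are assumptions for illustration, not the job used in the observations:

```sql
-- Minimal passthrough sketch: no joins, aggregations, or windows, so the
-- query keeps no state and events flow straight from input to output.
-- udf.toUpperCase is a hypothetical JavaScript UDF; DeviceId is a
-- placeholder column.
SELECT
    *,
    udf.toUpperCase(DeviceId) AS DeviceIdUpper
INTO Output
FROM Input
```

Because no state is kept, throughput for a query of this shape is bounded mainly by the output sink's write characteristics, which is why the configurations discussed here vary per output type.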