In this solution, you learn how to use Azure Stream Analytics to get real-time insights from your data. Developers can easily combine streams of data, such as click-streams, logs, and device-generated events, with historical records or reference data to derive business insights. As a fully managed, real-time stream computation service that's hosted in Microsoft Azure, Azure Stream Analytics provides built-in resiliency, low latency, and scalability to get you up and running in minutes.
After completing this solution, you're able to:
* Familiarize yourself with the Azure Stream Analytics portal.
* Configure and deploy a streaming job.

You need the following prerequisites to complete this solution:
* An [Azure subscription](https://azure.microsoft.com/pricing/free-trial/)
## Scenario introduction: "Hello, Toll!"
A toll station is a common phenomenon. You encounter them on many expressways, bridges, and tunnels across the world. Each toll station has multiple toll booths. At manual booths, you stop to pay the toll to an attendant. At automated booths, a sensor on top of each booth scans an RFID card that's affixed to the windshield of your vehicle as you pass the toll booth. It's easy to visualize the passage of vehicles through these toll stations as an event stream over which interesting operations can be performed.

## Incoming data
This solution works with two streams of data. Sensors installed at the entrance and exit of the toll stations produce the first stream. The second stream is a static lookup dataset that has vehicle registration data.
### Entry data stream
The entry data stream contains information about cars as they enter toll stations. The entry data events are live streamed into an event hub from a Web App included in the sample app.
```
| TollID | EntryTime | LicensePlate | State | Make | Model | VehicleType | VehicleWeight | Toll | Tag |
```
Here's a short description of the columns:
| Tag |The e-Tag on the automobile that automates payment; blank where the payment was done manually |
### Exit data stream
The exit data stream contains information about cars leaving the toll station. The exit data events are live streamed into an event hub from a Web App included in the sample app.
|**TollId**|**ExitTime**|**LicensePlate**|
| --- | --- | --- |
| 1 |2014-09-10T12:08:00.0000000Z |BNJ 1007 |
| 2 |2014-09-10T12:07:00.0000000Z |CDE 1007 |
Here's a short description of the columns:
| Column | Description |
| --- | --- |

The solution uses a static snapshot of a commercial vehicle registration database.
| SNY 7188 |592133890 |0 |
| ELH 9896 |678427724 |1 |
Here's a short description of the columns:
| Column | Description |
| --- | --- |
| Expired |The registration status of the vehicle: 0 if vehicle registration is active, 1 if registration is expired |
## Set up the environment for Azure Stream Analytics
To complete this solution, you need a Microsoft Azure subscription. If you don't have an Azure account, you can [request a free trial version](https://azure.microsoft.com/pricing/free-trial/).
Be sure to follow the steps in the "Clean up your Azure account" section at the end of this article so that you can make the best use of your Azure credit.

There are several resources that can easily be deployed in a resource group together.
5. Select an Azure location.
6. Specify an **Interval** as a number of seconds. This value determines how frequently the sample web app sends data into the event hub.
7. **Check** to agree to the terms and conditions.
- One Azure Cosmos DB Account
- One Azure Stream Analytics Job
- One Azure Storage Account
- One Azure event hub
- Two Web Apps
## Examine the sample TollApp job
1. Starting from the resource group in the previous section, select the Stream Analytics streaming job starting with the name `tollapp` (name contains random characters for uniqueness).
2. On the **Overview** page of the job, notice the **Query** box to view the query syntax.
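
   The query text itself isn't reproduced in this excerpt, but a minimal sketch of the kind of tumbling-window count the TollApp job runs might look like the following (the input, output, and column names here are assumptions based on the inputs, outputs, and result fields described later in this article):

```sql
-- Hedged sketch, not the exact TollApp query text: count the cars that pass
-- each toll booth during a repeating three-minute window.
SELECT
    TollId,
    System.Timestamp() AS WindowEnd,
    COUNT(*) AS Count
INTO CosmosDB
FROM EntryStream TIMESTAMP BY EntryTime
GROUP BY TUMBLINGWINDOW(minute, 3), TollId
```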
As you can see, Azure Stream Analytics uses a query language that's like SQL and adds a few extensions to specify time-related aspects of the query. For more details, read about [Time Management](/stream-analytics-query/time-management-azure-stream-analytics) and [Windowing](/stream-analytics-query/windowing-azure-stream-analytics) constructs used in the query.
3. Examine the Inputs of the TollApp sample job. Only the EntryStream input is used in the current query.
- **EntryStream** input is an event hub connection that queues data representing each time a car enters a tollbooth on the highway. A web app that's part of the sample creates the events, and that data is queued in this event hub. This input is queried in the FROM clause of the streaming query.
- **ExitStream** input is an event hub connection that queues data representing each time a car exits a tollbooth on the highway. This streaming input is used in later variations of the query syntax.
- **Registration** input is an Azure Blob storage connection, pointing to a static registration.json file, used for lookups as needed. This reference data input is used in later variations of the query syntax.
4. Examine the Outputs of the TollApp sample job.

Follow these steps to start the streaming job:
4. Expand the **tollAppDatabase** > **tollAppCollection** > **Documents**.
5. In the list of IDs, several documents are shown once the output is available.
6. Select each ID to review the JSON document. Notice each `tollid`, the `windowend` time, and the count of cars from that window.
7. After an additional three minutes, another set of four documents is available, one document per `tollid`.
## Report total time for each car

To report the total time for each car, the query joins the entry stream with the exit stream and uses DATEDIFF to compute how long each car spent at the toll station, limiting the join to exits that occur within 15 minutes of the matching entry (see the sketch below).
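
A sketch of that join, reconstructed from the fields discussed in this section (the exact column list and output name are assumptions), might look like the following:

```sql
-- Hedged sketch: match each entry event with its exit event and use DATEDIFF
-- to compute how many minutes the car spent between the entry and exit sensors.
SELECT
    EntryStream.TollId,
    EntryStream.EntryTime,
    ExitStream.ExitTime,
    EntryStream.LicensePlate,
    DATEDIFF(minute, EntryStream, ExitStream) AS DurationInMinutes
INTO CosmosDB
FROM EntryStream TIMESTAMP BY EntryTime
JOIN ExitStream TIMESTAMP BY ExitTime
    ON EntryStream.TollId = ExitStream.TollId
    AND EntryStream.LicensePlate = ExitStream.LicensePlate
    AND DATEDIFF(minute, EntryStream, ExitStream) BETWEEN 0 AND 15
```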
### Review the total time in the output
Repeat the steps in the preceding section to review the Azure Cosmos DB output data from the streaming job. Review the latest JSON documents.
For example, this document shows an example car with a certain license plate, the `entrytime` and `exittime`, and the DATEDIFF-calculated `durationinminutes` field showing the toll booth duration as two minutes:
```JSON
{
    "tollid": 4,
    ...
}
```

To scale up the streaming job to more streaming units:
4. Slide the **Streaming units** slider from 1 to 6. Streaming units define the amount of compute power that the job can receive. Select **Save**.
5. **Start** the streaming job to demonstrate the additional scale. Azure Stream Analytics distributes work across more compute resources and achieves better throughput, partitioning the work across resources using the column designated in the PARTITION BY clause. A sketch of such a partitioned query follows these steps.
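
A minimal sketch of what such a partitioned variation might look like (the `PartitionId` column and the other names are assumptions, not the exact TollApp query text):

```sql
-- Hedged sketch: partition the input so each partition is processed on separate
-- compute resources, then aggregate per partition and per toll booth.
SELECT
    TollId,
    System.Timestamp() AS WindowEnd,
    COUNT(*) AS Count
INTO CosmosDB
FROM EntryStream TIMESTAMP BY EntryTime PARTITION BY PartitionId
GROUP BY TUMBLINGWINDOW(minute, 3), TollId, PartitionId
```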
## Monitor the job
The **MONITOR** area contains statistics about the running job. First-time configuration is needed to use the storage account in the same region (named toll like the rest of this document).

You can access **Activity Logs** from the job dashboard **Settings** area as well.
3. Select **Delete resource group**. Type the name of the resource group to confirm deletion.
## Conclusion
This solution introduced you to the Azure Stream Analytics service. It demonstrated how to configure inputs and outputs for the Stream Analytics job. By using the Toll Data scenario, the solution explained common types of problems that arise in the space of data in motion and how they can be solved with simple SQL-like queries in Azure Stream Analytics. The solution described SQL extension constructs for working with temporal data. It showed how to join data streams, how to enrich the data stream with static reference data, and how to scale out a query to achieve higher throughput.
Although this solution provides a good introduction, it isn't complete by any means. You can find more query patterns using the SAQL language at [Query examples for common Stream Analytics usage patterns](stream-analytics-stream-analytics-query-patterns.md).
---
description: This article describes the custom DateTime path patterns and the custom field or attributes features for blob storage output from Azure Stream Analytics jobs.
author: an-emma
ms.author: raan
ms.service: stream-analytics
ms.topic: conceptual
ms.date: 02/15/2023
ms.custom: seodec18
---

Custom field or input attributes improve downstream data-processing and reporting.
### Partition key options
The partition key, or column name, used to partition input data may contain any character that is accepted for [blob names](/rest/api/storageservices/Naming-and-Referencing-Containers--Blobs--and-Metadata). It isn't possible to use nested fields as a partition key unless used in conjunction with aliases, but you can use certain characters to create a hierarchy of files. For example, you can use the following query to create a column that combines data from two other columns to make a unique partition key.
```sql
SELECT name, id, CONCAT(name, "/", id) AS nameid
```
The partition key must be NVARCHAR(MAX), BIGINT, FLOAT, or BIT (1.2 compatibility level or higher). DateTime, Array, and Records types aren't supported, but could be used as partition keys if they're converted to Strings. For more information, see [Azure Stream Analytics Data types](/stream-analytics-query/data-types-azure-stream-analytics).
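
For example, here's a minimal sketch of one way to derive a string partition key from a DateTime value (the input, output, and column names are placeholders, not taken from any particular sample):

```sql
-- Hedged sketch: DateTime values can't be partition keys directly,
-- so cast the value to NVARCHAR(MAX) before using it in the path pattern.
SELECT
    deviceId,
    reading,
    CAST(EventEnqueuedUtcTime AS NVARCHAR(MAX)) AS eventTimeKey
INTO blobOutput
FROM sensorInput
```

A column such as `eventTimeKey` could then be referenced as `{eventTimeKey}` in the blob Path Pattern.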
### Example

Suppose a job takes input data from live user sessions connected to an external service, where each record contains a **client_id** column that identifies the session. To partition the data by **client_id**, the Path Pattern would be **{client_id}**.
Similarly, if the job input was sensor data from millions of sensors where each sensor had a **sensor_id**, the Path Pattern would be **{sensor_id}** to partition each sensor's data into different folders.
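
The column referenced by a token such as **{sensor_id}** must be present in the events the query writes to the blob output. A minimal sketch under that assumption (the input, output, and other column names are hypothetical):

```sql
-- Hedged sketch: project the partitioning column (sensor_id) into the blob output
-- so the {sensor_id} token in the Path Pattern can resolve for every record.
SELECT
    sensor_id,
    reading,
    EventEnqueuedUtcTime
INTO blobOutput
FROM sensorInput
```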
When you use the REST API, the output section of a JSON file used for that request may look like the following image:

Once the job starts running, the `clients` container may look like the following image:
Each folder may contain multiple blobs where each blob contains one or more records. In the above example, there's a single blob in a folder labeled "06000000" with the following contents:

Notice that each record in the blob has a **client_id** column matching the folder name.
2. If customers want to use more than one input field, they can create a composite key in the query for a custom path partition in blob output by using **CONCAT**. For example: **select concat (col1, col2) as compositeColumn into blobOutput from input**. Then they can specify **compositeColumn** as the custom path in blob storage.
3. Partition keys are case insensitive, so partition keys like `John` and `john` are equivalent. Also, expressions can't be used as partition keys. For example, **{columnA + columnB}** doesn't work.
4. When an input stream consists of records with a partition key cardinality under 8000, the records are appended to existing blobs, and new blobs are created only when necessary. If the cardinality is over 8000, there's no guarantee that existing blobs will be written to, and new blobs won't be created for an arbitrary number of records with the same partition key.
5. If the blob output is [configured as immutable](../storage/blobs/immutable-storage-overview.md), Stream Analytics creates a new blob each time data is sent.
## Custom DateTime path patterns

The following format specifier tokens can be used alone or in combination to achieve custom DateTime formats:
|{datetime:m}|Minutes from 0 to 59|6|
|{datetime:ss}|Seconds from 00 to 59|08|
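
For example, a custom path prefix along the lines of `logs/{datetime:yyyy}/{datetime:MM}/{datetime:dd}` (a hypothetical prefix, not one taken from this article) would combine these tokens to produce one folder per day.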
If you don't wish to use custom DateTime patterns, you can add the {date} and/or {time} token to the Path Prefix to generate a dropdown with built-in DateTime formats.
