articles/time-series-insights/time-series-insights-update-storage-ingress.md
24 additions & 21 deletions
Original file line number
Diff line number
Diff line change
@@ -8,25 +8,25 @@ ms.workload: big-data
8
8
ms.service: time-series-insights
9
9
services: time-series-insights
10
10
ms.topic: conceptual
11
-
ms.date: 02/07/2020
11
+
ms.date: 02/10/2020
12
12
ms.custom: seodec18
13
13
---
14
14
15
15
# Data storage and ingress in Azure Time Series Insights Preview
16
16
17
-
This article describes updates to data storage and ingress for Azure Time Series Insights Preview. It describes the underlying storage structure, file format, and Time Series ID property. It also discusses the underlying ingress process, best practices, and current preview limitations.
17
+
This article describes updates to data storage and ingress for Azure Time Series Insights Preview. It describes the underlying storage structure, file format, and Time Series ID property. The underlying ingress process, best practices, and current preview limitations are also described.
18
18
19
19
## Data ingress
20
20
21
21
Your Azure Time Series Insights environment contains an *ingestion engine* to collect, process, and store time-series data.
22
22
23
-
There are some considerations to take into account to ensure all incoming data is processed, to achieve high ingress scale, and minimize ingestion latency (the time taken by Time Series Insights to read and process data from the event source) when [planning your environment](time-series-insights-update-plan.md).
23
+
When [planning your environment](time-series-insights-update-plan.md), keep a few considerations in mind to ensure that all incoming data is processed, to achieve high ingress scale, and to minimize *ingestion latency* (the time taken by Time Series Insights to read and process data from the event source).
24
24
25
25
Time Series Insights Preview data ingress policies determine where data can be sourced from and what format the data should have.
26
26
27
27
### Ingress policies
28
28
29
-
Data ingress involves how data is sent to an Azure Time Series Insights Preview environment.
29
+
*Data ingress* involves how data is sent to an Azure Time Series Insights Preview environment.
30
30
31
31
Key configuration, formatting, and best practices are summarized below.
32
32
@@ -42,13 +42,13 @@ Azure Time Series Insights Preview supports a maximum of two event sources per i
42
42
> [!IMPORTANT]
43
43
> * You may experience high initial latency when attaching an event source to your Preview environment.
44
44
> Event source latency depends on the number of events currently in your IoT Hub or Event Hub.
45
-
> * High latency will subside after event source data is first ingested. Contact us by submitting a support ticket through the Azure portal if you experience continued high latency.
45
+
> * High latency will subside after event source data is first ingested. Submit a support ticket through the Azure portal if you experience ongoing high latency.
46
46
47
47
#### Supported data format and types
48
48
49
49
Azure Time Series Insights supports UTF-8 encoded JSON sent from Azure IoT Hub or Azure Event Hubs.
50
50
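As a minimal sketch of what "UTF-8 encoded JSON" means in practice, the following Python snippet serializes an event and encodes it before it would be handed to an IoT Hub or Event Hubs client. The event and its property names are hypothetical and chosen only for illustration; sending the payload is out of scope here.

```python
import json

# Hypothetical telemetry event; the property names are illustrative only.
event = {
    "deviceId": "sensor-01",
    "timestamp": "2020-02-10T12:00:00Z",
    "temperature": 21.5,
}

# Serialize to JSON text, then encode as UTF-8 bytes for the message body.
payload = json.dumps(event).encode("utf-8")
```

The resulting `payload` is a `bytes` object that decodes back to the same JSON document.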
51
-
Below is the list of supported data types.
51
+
The supported data types are:
52
52
53
53
| Data type | Description |
54
54
|---|---|
@@ -59,9 +59,9 @@ Below is the list of supported data types.
59
59
60
60
#### Objects and arrays
61
61
62
-
You can send complex types such as objects and arrays as part of your event payload, but your data will undergo a flattening process when stored.
62
+
You may send complex types such as objects and arrays as part of your event payload, but your data will undergo a flattening process when stored.
63
63
64
-
Detailed information describing how to shape your JSON events, sending complex type, and nested object flattening is available in [How to shape JSON for ingress and query](./time-series-insights-update-how-to-shape-events.md).
64
+
To assist with planning and optimization, detailed information about how to shape your JSON events, send complex types, and flatten nested objects is available in [How to shape JSON for ingress and query](./time-series-insights-update-how-to-shape-events.md).
65
65
66
66
### Ingress best practices
67
67
@@ -77,6 +77,9 @@ We recommend that you employ the following best practices:
77
77
78
78
Azure Time Series Insights Preview ingress limitations are described below.
79
79
80
+
> [!TIP]
81
+
> Read [Plan your Preview environment](https://docs.microsoft.com/azure/time-series-insights/time-series-insights-update-plan#review-preview-limits) for a comprehensive list of all Preview limits.
82
+
80
83
#### Per environment limitations
81
84
82
85
In general, the ingress rate is the product of the number of devices in your organization, the event emission frequency, and the size of each event:
@@ -87,23 +90,23 @@ By default, Time Series Insights preview can ingest incoming data at a rate of *
87
90
88
91
> [!TIP]
89
92
> * Environment support for ingestion speeds of up to 16 MBps can be provided on request.
90
-
> * Contact us if you require higher throughput by submitting a support ticket in the Azure portal.
93
+
> * If you require higher throughput, contact us by submitting a support ticket through the Azure portal.
91
94
92
95
* **Example 1:**
93
96
94
97
Contoso Shipping has 100,000 devices that emit an event three times per minute. The size of an event is 200 bytes. They’re using an Event Hub with four partitions as the Time Series Insights event source.
95
98
96
-
* The ingestion rate for their Time Series Insights environment would be: 100,000 devices * 200 bytes/event * (3/60 event/sec) = 1 MBps.
99
+
* The ingestion rate for their Time Series Insights environment would be: **100,000 devices * 200 bytes/event * (3/60 event/sec) = 1 MBps**.
97
100
* The ingestion rate per partition would be 0.25 MBps.
98
101
* Contoso Shipping’s ingestion rate would be within the preview scale limitation.
99
102
100
103
* **Example 2:**
101
104
102
105
Contoso Fleet Analytics has 20,000 devices that emit an event every second. They are using an IoT Hub with a partition count of 4 as the Time Series Insights event source. The size of an event is 200 bytes.
103
106
104
-
* The environment ingestion rate would be: 20,000 devices * 200 bytes/event * 1 event/sec = 4 MBps.
107
+
* The environment ingestion rate would be: **20,000 devices * 200 bytes/event * 1 event/sec = 4 MBps**.
105
108
* The per partition rate would be 1 MBps.
106
-
* Contoso Fleet Analytics would need to submit a request to Time Series Insights via the Azure portal for a dedicated environment to achieve this scale.
109
+
* Contoso Fleet Analytics can submit a request to Time Series Insights through the Azure portal to increase the ingestion rate for their environment.
107
110
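The arithmetic in the two examples above can be sketched in a few lines of Python. This is only an illustration, not part of any Time Series Insights tooling: the function name `ingestion_rate_mbps` is hypothetical, and the sketch assumes 1 MB = 1,000,000 bytes and an even spread of events across partitions, which is what the example figures imply.

```python
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes, as the example figures imply


def ingestion_rate_mbps(devices, event_bytes, events_per_sec, partitions):
    """Return (environment rate, per-partition rate) in MBps.

    Hypothetical helper; assumes events are spread evenly across partitions.
    """
    environment = devices * event_bytes * events_per_sec / BYTES_PER_MB
    return environment, environment / partitions


# Example 1: 100,000 devices, 200-byte events, three events per minute, four partitions.
env_rate, partition_rate = ingestion_rate_mbps(100_000, 200, 3 / 60, 4)
print(f"{env_rate:.2f} MBps total, {partition_rate:.2f} MBps per partition")
```

With the inputs from Example 1, this reproduces the 1 MBps environment rate and the 0.25 MBps per-partition rate.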
108
111
#### Hub partitions and per partition limits
109
112
@@ -122,20 +125,18 @@ Azure Time Series Insights Preview currently has a general **per partition limit
122
125
123
126
#### IoT Hub-specific considerations
124
127
125
-
When a device is created in IoT Hub, it is permanently assigned to a partition. In doing so, IoT Hub is able to guarantee event ordering (since the assignment never changes).
126
-
127
-
This has implications for Time Series Insights instances that are ingesting data sent from IoT Hub downstream.
128
+
When a device is created in IoT Hub, it's permanently assigned to a partition. In doing so, IoT Hub is able to guarantee event ordering (since the assignment never changes).
128
129
129
-
When messages from multiple devices are forwarded to the hub using the same gateway device ID, they may arrive in the same partition at the same time potentially exceeding the per partition scale limits.
130
+
A fixed partition assignment also impacts Time Series Insights instances that are ingesting data sent from IoT Hub downstream. When messages from multiple devices are forwarded to the hub using the same gateway device ID, they may arrive in the same partition at the same time, potentially exceeding the per-partition scale limits.
130
131
131
132
**Impact**:
132
133
133
-
* If a single partition experiences a sustained rate of ingestion over the Preview limit, there is the potential that the Time Series Insights reader will not ever catch up before the IoT Hub data retention period has been exceeded. This would cause a loss of data.
134
+
* If a single partition experiences a sustained rate of ingestion over the Preview limit, it's possible that Time Series Insights will not sync all device telemetry before the IoT Hub data retention period has been exceeded. As a result, sent data can be lost if the ingestion limits are consistently exceeded.
134
135
135
-
We recommend the following:
136
+
To mitigate that circumstance, we recommend the following best practices:
136
137
137
-
* Calculate your per environment and per partition ingestion rate before deploying your solution.
138
-
* Ensure that your IoT Hub devices (and thus partitions) are load-balanced to the furthest extend possible.
138
+
* Calculate your per environment and per partition ingestion rates before deploying your solution.
139
+
* Ensure that your IoT Hub devices are load-balanced to the furthest extent possible.
139
140
140
141
> [!IMPORTANT]
141
142
> For environments using IoT Hub as an event source, calculate the ingestion rate using the number of hub devices in use to be sure that the rate falls below the 0.5 MBps per partition limitation in preview.
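One way to apply the check described in the note above is sketched below. It assumes you already know, or can estimate, how many devices send through each partition. The mapping passed in is hypothetical input (IoT Hub performs the actual assignment), the function name is illustrative, and 1 MB is taken to be 1,000,000 bytes.

```python
PARTITION_LIMIT_MBPS = 0.5  # per-partition Preview limit quoted in the note above
BYTES_PER_MB = 1_000_000    # assumption: decimal megabytes


def partitions_over_limit(devices_per_partition, event_bytes, events_per_sec):
    """Return {partition_id: rate in MBps} for partitions above the limit.

    `devices_per_partition` maps a partition id to the number of devices
    whose messages land in it (hypothetical input; IoT Hub does the assignment).
    """
    over = {}
    for partition_id, device_count in devices_per_partition.items():
        rate = device_count * event_bytes * events_per_sec / BYTES_PER_MB
        if rate > PARTITION_LIMIT_MBPS:
            over[partition_id] = rate
    return over


# Hypothetical layout: partition 0 receives traffic for 4,000 devices behind one gateway.
layout = {0: 4_000, 1: 1_000, 2: 1_000, 3: 1_000}
print(partitions_over_limit(layout, event_bytes=200, events_per_sec=1))
```

In this made-up layout only partition 0 is flagged, which mirrors the gateway scenario: the environment total stays modest while a single partition exceeds its limit.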
@@ -185,7 +186,9 @@ Azure Time Series Insights Preview publishes up to two copies of each event in y
185
186
> [!NOTE]
186
187
> You can also use Spark, Hadoop, and other familiar tools to process the raw Parquet files.
187
188
188
-
Time Series Insights Preview also re-partitions the Parquet files to optimize for the Time Series Insights query. This repartitioned copy of the data is also saved. During public review, data is stored indefinitely in your Azure Storage account.
189
+
Time Series Insights Preview also repartitions the Parquet files to optimize for the Time Series Insights query. This repartitioned copy of the data is also saved.
190
+
191
+
During public Preview, data is stored indefinitely in your Azure Storage account.
189
192
190
193
#### Writing and editing Time Series Insights blobs