
Commit 03e41c3

edit pass: stream-analytics-troubleshoot-output
1 parent 6306d11 commit 03e41c3

File tree

1 file changed: +51 −49 lines


articles/stream-analytics/stream-analytics-troubleshoot-output.md

# Troubleshoot Azure Stream Analytics outputs

This article describes common issues with Azure Stream Analytics output connections and how to troubleshoot them. Many troubleshooting steps require that diagnostic logs be enabled for your Stream Analytics job. If you don't have diagnostic logs enabled, see [Troubleshoot Azure Stream Analytics by using diagnostic logs](stream-analytics-job-diagnostic-logs.md).

## The job doesn't produce output

If the job doesn't produce output, follow these steps:

1. Verify connectivity to outputs by using the **Test Connection** button for each output.

1. Look at [Monitoring metrics](stream-analytics-monitoring.md) on the **Monitor** tab. Because the values are aggregated, the metrics are delayed by a few minutes.

    * If the **Input Events** value is greater than zero, the job can read the input data. If it isn't greater than zero, there's an issue with the job's input. See [Troubleshoot input connections](stream-analytics-troubleshoot-input.md) for more information.
    * If the **Data Conversion Errors** value is greater than zero and climbing, see [Azure Stream Analytics data errors](data-errors.md) for detailed information about data conversion errors.
    * If the **Runtime Errors** value is greater than zero, your job can receive data, but it generates errors while processing the query. To find the errors, go to the [audit logs](../azure-resource-manager/management/view-activity-logs.md), and then filter on the *Failed* status.
    * If the **InputEvents** value is greater than zero and the **OutputEvents** value equals zero, one of the following is true:

        * The query processing resulted in zero output events.
        * Events or fields might be malformed, resulting in zero output after query processing.
        * The job wasn't able to push data to the output sink, for connectivity or authentication reasons.

In all the previous error cases, operations log messages explain additional details, including what's happening, except when the query logic filters out all events. If the processing of multiple events generates errors, the errors are aggregated every 10 minutes.

## The first output is delayed

When a Stream Analytics job starts, the input events are read, but in certain circumstances there can be a delay in producing the output.

Large time values in temporal query elements can contribute to the output delay. To produce correct output over large time windows, the streaming job starts by reading data from the latest time possible, up to seven days in the past, to fill the time window. No output is produced until the outstanding input events are read. This problem can surface when the system upgrades the streaming jobs, which restarts them. Such upgrades generally occur once every couple of months.

Use discretion when designing your Stream Analytics query. If you use a large time window for temporal elements in the job's query syntax, it can lead to a delay in the first output when the job starts or restarts. More than several hours, up to seven days, is considered a large time window.

One mitigation for this kind of first output delay is to use query parallelization techniques, such as partitioning the data. Or, you can add more Streaming Units to improve the throughput until the job catches up. For more information, see [Considerations when creating Stream Analytics jobs](stream-analytics-concepts-checkpoint-replay.md).
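
As a rough sketch, a query can be parallelized by partitioning the input. The stream aliases (`input`, `output`), the `PartitionId` field, and the `EventTime` column below are hypothetical, and the exact behavior depends on your input type and compatibility level:

```sql
-- Hypothetical aliases: 'input' (for example, an event hub) and 'output'.
-- PARTITION BY lets each input partition be processed independently,
-- which also parallelizes the catch-up read after a job restart.
SELECT
    PartitionId,
    COUNT(*) AS EventCount
INTO output
FROM input TIMESTAMP BY EventTime
PARTITION BY PartitionId
GROUP BY PartitionId, TumblingWindow(minute, 1)
```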

These factors affect the timeliness of the first output:

* The use of windowed aggregates, such as a GROUP BY clause with tumbling, hopping, or sliding windows:

  * For tumbling or hopping window aggregates, the results are generated at the end of the window timeframe.
  * For a sliding window, the results are generated when an event enters or exits the sliding window.
  * If you plan to use a large window size, such as more than one hour, it's best to choose a hopping or sliding window, so that you can see the output more frequently.

* The use of temporal joins, such as JOIN with DATEDIFF:

  * Matches are generated as soon as both sides of the matched events arrive.
  * Data that lacks a match, like a LEFT OUTER JOIN, is generated at the end of the DATEDIFF window, with respect to each event on the left side.

* The use of temporal analytic functions, such as ISFIRST, LAST, and LAG with LIMIT DURATION:

  * For analytic functions, the output is generated for every event. There is no delay.
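
To make the window and join behaviors concrete, here's a hedged sketch in the Stream Analytics query language. The stream aliases (`input`, `impressions`, `clicks`, `output1`, `output2`) and the columns (`DeviceId`, `EventTime`) are hypothetical:

```sql
-- Hopping window: a 60-minute window that emits results every 5 minutes,
-- so output appears far more often than a 60-minute tumbling window allows.
SELECT DeviceId, COUNT(*) AS EventCount
INTO output1
FROM input TIMESTAMP BY EventTime
GROUP BY DeviceId, HoppingWindow(minute, 60, 5)

-- Temporal join: a match is emitted as soon as both sides arrive. With a
-- LEFT OUTER JOIN, an unmatched left-side event is emitted only after the
-- 10-minute DATEDIFF window closes for that event.
SELECT I.DeviceId, C.EventTime AS ClickTime
INTO output2
FROM impressions I TIMESTAMP BY EventTime
LEFT OUTER JOIN clicks C TIMESTAMP BY EventTime
    ON I.DeviceId = C.DeviceId
    AND DATEDIFF(minute, I, C) BETWEEN 0 AND 10
```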

## The output falls behind

During normal operation of a job, the output might have longer and longer latency. If the output falls behind, you can pinpoint the root causes by examining the following factors:

* Whether the downstream sink is throttled
* Whether the upstream source is throttled
* Whether the processing logic in the query is compute-intensive

To see those details, select the streaming job in the Azure portal, and then select the **Job diagram**. For each input, there's a per-partition backlog event metric. If the metric keeps increasing, it's an indicator that the system resources are constrained, potentially because of output sink throttling or high CPU usage. For more information, see [Data-driven debugging by using the job diagram](stream-analytics-job-diagram-with-metrics.md).

## Key violation warning in Azure SQL Database output

When you configure an Azure SQL Database output for a Stream Analytics job, the job bulk inserts records into the destination table. In general, Azure Stream Analytics guarantees [at-least-once delivery](https://docs.microsoft.com/stream-analytics-query/event-delivery-guarantees-azure-stream-analytics) to the output sink. You can still [achieve exactly-once delivery](https://blogs.msdn.microsoft.com/streamanalytics/2017/01/13/how-to-achieve-exactly-once-delivery-for-sql-output/) to a SQL output when the SQL table has a unique constraint defined.

When unique key constraints are set up on the SQL table and duplicate records are inserted, Stream Analytics removes the duplicates. It splits the data into batches and recursively inserts the batches until a single duplicate record is found. The split-and-insert process ignores the duplicates one at a time. For a streaming job that has a considerable number of duplicate rows, the process is inefficient and time-consuming. If you see multiple key violation warning messages in your activity log for the previous hour, it's likely that your SQL output is slowing down the entire job.

To resolve this issue, [configure the index](https://docs.microsoft.com/sql/t-sql/statements/create-index-transact-sql) that's causing the key violation by enabling the IGNORE_DUP_KEY option. This option makes SQL ignore duplicate values during bulk inserts: Azure SQL Database simply produces a warning message instead of an error, and Stream Analytics no longer produces primary key violation errors.
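
For example, assuming a hypothetical destination table `dbo.Events` with a unique key on an `EventId` column, the unique index could be re-created with the option enabled:

```sql
-- Re-create the unique index with IGNORE_DUP_KEY = ON so that duplicate rows
-- produced by at-least-once delivery are skipped with a warning instead of
-- failing the bulk insert. Table, index, and column names are illustrative.
CREATE UNIQUE NONCLUSTERED INDEX IX_Events_EventId
    ON dbo.Events (EventId)
    WITH (IGNORE_DUP_KEY = ON, DROP_EXISTING = ON);
```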

Note the following observations when configuring IGNORE_DUP_KEY for several types of indexes:

* You can't set IGNORE_DUP_KEY on a primary key or a unique constraint by using ALTER INDEX. You need to drop and re-create the index.
* You can set the IGNORE_DUP_KEY option by using ALTER INDEX for a unique index. A unique index is different from a PRIMARY KEY/UNIQUE constraint and is created by using a CREATE INDEX or INDEX definition.
* IGNORE_DUP_KEY doesn't apply to column store indexes because you can't enforce uniqueness on them.

## Column names are lowercase in Stream Analytics (1.0)

When you use the original compatibility level (1.0), Azure Stream Analytics changes column names to lowercase. This behavior was fixed in later compatibility levels. To preserve the case, move to compatibility level 1.1 or later. For more information, see [Compatibility level for Azure Stream Analytics jobs](https://docs.microsoft.com/azure/stream-analytics/stream-analytics-compatibility-level).

## Get help

For further assistance, try our [Stream Analytics forum](https://social.msdn.microsoft.com/Forums/azure/home?forum=AzureStreamAnalytics).

## Next steps

* [Introduction to Stream Analytics](stream-analytics-introduction.md)
* [Get started using Stream Analytics](stream-analytics-real-time-fraud-detection.md)
* [Scale Stream Analytics jobs](stream-analytics-scale-jobs.md)
* [Stream Analytics Query Language Reference](https://docs.microsoft.com/stream-analytics-query/stream-analytics-query-language-reference)
* [Stream Analytics Management REST API Reference](https://msdn.microsoft.com/library/azure/dn835031.aspx)
