Commit 5345f96

Merge pull request #79549 from djpmsft/updateADFDataFlow
Editing Debug Mode Docs
2 parents 5b031f0 + 62f73af commit 5345f96

3 files changed

+14
-19
lines changed

articles/data-factory/concepts-data-flow-debug-mode.md

Lines changed: 14 additions & 19 deletions
@@ -13,48 +13,43 @@ ms.date: 10/04/2018

 [!INCLUDE [notes](../../includes/data-factory-data-flow-preview.md)]

-Azure Data Factory Mapping Data Flow has a debug mode, which can be switched on with the Data Flow Debug button at the top of the design surface. When designing data flows, setting debug mode on will allow you to interactively watch the data shape transform while you build and debug your data flows. The Debug session can be used both in Data Flow design sessions as well as during pipeline debug execution of data flows.
+Azure Data Factory Mapping Data Flow's debug mode can be switched on with the "Data Flow Debug" button at the top of the design surface. When designing data flows, turning debug mode on will allow you to interactively watch the data shape transform while you build and debug your data flows. The Debug session can be used both in Data Flow design sessions as well as during pipeline debug execution of data flows.

 ![Debug button](media/data-flow/debugbutton.png "Debug button")

 ## Overview
-When Debug mode is on, you will interactively build your data flow with an active Spark cluster. The session will close once you turn debug off in Azure Data Factory. You should be aware of the hourly charges incurred by Azure Databricks during the time that you have the debug session turned on.
+When Debug mode is on, you'll interactively build your data flow with an active Spark cluster. The session will close once you turn debug off in Azure Data Factory. You should be aware of the hourly charges incurred by Azure Databricks during the time that you have the debug session turned on.

-In most cases, it is a good practice to build your Data Flows in debug mode so that you can validate your business logic and view your data transformations before publishing your work in Azure Data Factory. You should also use the "Debug" button on the pipeline panel to test your data flow inside of a pipeline.
+In most cases, it's a good practice to build your Data Flows in debug mode so that you can validate your business logic and view your data transformations before publishing your work in Azure Data Factory. Use the "Debug" button on the pipeline panel to test your data flow inside of a pipeline.

 > [!NOTE]
-> While the debug mode light is green on the Data Factory toolbar, you will be charged at the Data Flow debug rate of 8 cores/hr of general compute with a 60 minute time-to-live
-
-## Debug mode on
-When you switch on debug mode, you will be prompted with a side-panel form that will request you to point to your interactive Azure Databricks cluster and select options for the source sampling. You must use an interactive cluster from Azure Databricks and select either a sampling size from each your Source transforms, or pick a text file to use for your test data.
-
-<img src="media/data-flow/upload.png" width="400">
+> While the debug mode light is green on the Data Factory toolbar, you'll be charged at the Data Flow debug rate of 8 cores/hr of general compute with a 60 minute time-to-live

 > [!NOTE]
->When running in Debug Mode in Data Flow, your data will not be written to the Sink transform. A Debug session is intended to serve as a test >harness for your transformations. Sinks are not required during debug and are ignored in your data flow. If you wish to test writing the data >in your Sink, execute the Data Flow from an Azure Data Factory Pipeline and use the Debug execution from a pipeline.
+>When running in Debug Mode in Data Flow, your data will not be written to the Sink transform. A Debug session is intended to serve as a test harness for your transformations. Sinks are not required during debug and are ignored in your data flow. If you wish to test writing the data in your Sink, execute the Data Flow from an Azure Data Factory Pipeline and use the Debug execution from a pipeline.

 ## Debug settings
-Debug settings can be Each Source from your Data Flow will appear in the side panel and can also be edited by selecting "source settings" on the Data Flow designer toolbar. You can select the limits and/or file source to use for each your Source transformation here. The row limits in this setting are only for the current debug session. You can also use the Sampling setting in the source for limiting rows into the Source transformation.
+Debug settings can be edited by clicking "Debug Settings" on the Data Flow canvas toolbar. You can select the limits and/or file source to use for each of your Source transformations here. The row limits in this setting are only for the current debug session. You can also select the staging linked service to be used for a SQL DW source.

-## Cluster status
-There is a cluster status indicator at the top of the design surface that will turn green when the cluster is ready for debug. If your cluster is already warm, then the green indicator will appear almost instantly. If your cluster was not already running when you entered debug mode, then you will have to wait 5-7 minutes for the cluster to spin up. The indicator light will be yellow until it is ready. Once your cluster is ready for Data Flow debug, the indicator light will turn green.
+![Debug settings](media/data-flow/debug-settings.png "Debug settings")

-When you are finished with your debugging, turn the Debug switch off so that your Azure Databricks cluster can terminate and you will no longer be billed for debug activity.
+## Cluster status
+There's a cluster status indicator at the top of the design surface that will turn green when the cluster is ready for debug. If your cluster is already warm, then the green indicator will appear almost instantly. If your cluster wasn't already running when you entered debug mode, then you'll have to wait 5-7 minutes for the cluster to spin up. The indicator will spin until its ready.

-<img src="media/data-flow/datapreview.png" width="400">
+When you are finished with your debugging, turn the Debug switch off so that your Azure Databricks cluster can terminate and you'll no longer be billed for debug activity.

 ## Data preview
 With debug on, the Data Preview tab will light-up on the bottom panel. Without debug mode on, Data Flow will show you only the current metadata in and out of each of your transformations in the Inspect tab. The data preview will only query the number of rows that you have set as your limit in your debug settings. You may need to click "Fetch data" to refresh the data preview.

-<img src="media/data-flow/stats.png" width="400">
+![Data preview](media/data-flow/datapreview.png "Data preview")

 ## Data profiles
-Selecting individual columns in your data preview tab will pop-up a chart on the far-right of your data grid with detailed statistics about each field. Azure Data Factory will make a determination based upon the data sampling of which type of chart to display. High-cardinality fields will default to NULL / NOT NULL charts while categorical and numeric data that has low cardinality will display bar charts showing data value frequency. You will also see max / len length of string fields, min / max values in numeric fields, standard dev, percentiles, counts and average.
+Selecting individual columns in your data preview tab will pop up a chart on the far-right of your data grid with detailed statistics about each field. Azure Data Factory will make a determination based upon the data sampling of which type of chart to display. High-cardinality fields will default to NULL/NOT NULL charts while categorical and numeric data that has low cardinality will display bar charts showing data value frequency. You'll also see max/len length of string fields, min/max values in numeric fields, standard dev, percentiles, counts, and average.

-<img src="media/data-flow/chart.png" width="400">
+![Column statistics](media/data-flow/stats.png "Column statistics")

 ## Next steps

-Once you are finished building and debugging your data flow, [execute it from a pipeline.](control-flow-execute-data-flow-activity.md)
+Once you're finished building and debugging your data flow, [execute it from a pipeline.](control-flow-execute-data-flow-activity.md)

 When testing your pipeline with a data flow, use the pipeline [Debug run execution option.](iterative-development-debugging.md)
13.1 KB
-23.9 KB
Binary file not shown.
