| 1 | +--- |
| 2 | +title: Monitor copy activity |
| 3 | +description: Learn how to monitor copy activity execution in Azure Data Factory. |
| 4 | +services: data-factory |
| 5 | +documentationcenter: '' |
| 6 | +author: linda33wj |
| 7 | +manager: shwang |
| 8 | +ms.reviewer: douglasl |
| 9 | + |
| 10 | +ms.service: data-factory |
| 11 | +ms.workload: data-services |
| 12 | +ms.topic: conceptual |
| 13 | +ms.date: 03/11/2020 |
| 14 | +ms.author: jingwang |
| 15 | + |
| 16 | +--- |
| 17 | +# Monitor copy activity |
| 18 | + |
| 19 | +This article outlines how to monitor the copy activity execution in Azure Data Factory. It builds on the [copy activity overview](copy-activity-overview.md) article that presents a general overview of copy activity. |
| 20 | + |
| 21 | +## Monitor visually |
| 22 | + |
| 23 | +Once you've created and published a pipeline in Azure Data Factory, you can associate it with a trigger or manually kick off an ad hoc run. You can monitor all of your pipeline runs natively in the Azure Data Factory user experience. Learn about Azure Data Factory monitoring in general from [Visually monitor Azure Data Factory](monitor-visually.md). |
| 24 | + |
| 25 | +To monitor a Copy activity run, go to your data factory **Author & Monitor** UI. On the **Monitor** tab, you see a list of pipeline runs. Click the **pipeline name** link to access the list of activity runs in that pipeline run. |
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | +At this level, you can see links to the copy activity input, output, and errors (if the Copy activity run fails), as well as statistics such as duration and status. Click the **Details** button (eyeglasses) next to the copy activity name to see detailed information about the copy activity execution. |
| 30 | + |
| 31 | + |
| 32 | + |
| 33 | +In this graphical monitoring view, Azure Data Factory presents the copy activity execution information, including the volume of data read and written, the number of files or rows copied from source to sink, throughput, the configurations applied for your copy scenario, the steps the copy activity goes through with their durations and details, and more. For each possible metric and its detailed description, refer to [this table](#monitor-programmatically). |
| 34 | + |
| 35 | +In some scenarios, when you run a Copy activity in Data Factory, you'll see **"Performance tuning tips"** at the top of the copy activity monitoring view, as shown in the example. The tips tell you the bottleneck that Data Factory identified for the specific copy run, along with suggestions on what to change to boost copy throughput. Learn more about [auto performance tuning tips](copy-activity-performance-troubleshooting.md#performance-tuning-tips). |
| 36 | + |
| 37 | +The **execution details and durations** section at the bottom describes the key steps your copy activity goes through, which is especially useful for troubleshooting copy performance. The bottleneck of your copy run is the step with the longest duration. Refer to [Troubleshoot copy activity performance](copy-activity-performance-troubleshooting.md) for what each stage represents and for detailed troubleshooting guidance. |
| 38 | + |
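
The "longest duration" rule above can be sketched in a few lines of Python. This is only an illustration: the stage names and working durations are copied from the example output later in this article, while the flat dictionary layout and the `find_bottleneck` helper are hypothetical and not part of any Data Factory API. Because the article advises against parsing `executionDetails` programmatically, treat this as a way to reason about the numbers rather than as a monitoring implementation.

```python
# Stage names and working durations (in seconds) copied from the example
# output shown under "Monitor programmatically". The flat dict layout is a
# simplification chosen for illustration, not the service's schema.
stage_durations = {
    "listingSource": 0,
    "readingFromSource": 301,
    "writingToSink": 335,
}


def find_bottleneck(durations):
    """Return the (stage, seconds) pair with the longest duration."""
    if not durations:
        raise ValueError("no stage durations supplied")
    return max(durations.items(), key=lambda item: item[1])


stage, seconds = find_bottleneck(stage_durations)
print(f"Longest stage: {stage} ({seconds} s)")
```

For the example run, the longest stage is `writingToSink`, which is consistent with the throttling tip reported for the sink in that run.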
| 39 | +**Example: Copy from Amazon S3 to Azure Data Lake Storage Gen2** |
| 40 | + |
| 41 | + |
| 42 | + |
| 43 | +## Monitor programmatically |
| 44 | + |
| 45 | +Copy activity execution details and performance characteristics are also returned in the **Copy Activity run result** > **Output** section, which is used to render the UI monitoring view. The following table lists all the properties that might be returned. You'll see only the properties that are applicable to your copy scenario. For information about how to monitor activity runs programmatically in general, see [Programmatically monitor an Azure data factory](monitor-programmatically.md). |
| 46 | + |
| 47 | +| Property name | Description | Unit in output | |
| 48 | +|:--- |:--- |:--- | |
| 49 | +| dataRead | The actual amount of data read from the source. | Int64 value, in bytes | |
| 50 | +| dataWritten | The actual amount of data written/committed to the sink. The size may differ from the `dataRead` size, because it depends on how each data store stores the data. | Int64 value, in bytes | |
| 51 | +| filesRead | The number of files read from the file-based source. | Int64 value (no unit) | |
| 52 | +| filesWritten | The number of files written/committed to the file-based sink. | Int64 value (no unit) | |
| 53 | +| sourcePeakConnections | Peak number of concurrent connections established to the source data store during the Copy activity run. | Int64 value (no unit) | |
| 54 | +| sinkPeakConnections | Peak number of concurrent connections established to the sink data store during the Copy activity run. | Int64 value (no unit) | |
| 55 | +| rowsRead | Number of rows read from the source (not applicable for binary copy). | Int64 value (no unit) | |
| 56 | +| rowsCopied | Number of rows copied to sink (not applicable for binary copy). | Int64 value (no unit) | |
| 57 | +| rowsSkipped | Number of incompatible rows that were skipped. You can enable incompatible rows to be skipped by setting `enableSkipIncompatibleRow` to true. | Int64 value (no unit) | |
| 58 | +| copyDuration | Duration of the copy run. | Int32 value, in seconds | |
| 59 | +| throughput | Rate of data transfer. | Floating point number, in KBps | |
| 62 | +| sqlDwPolyBase | Whether PolyBase is used when data is copied into SQL Data Warehouse. | Boolean | |
| 63 | +| redshiftUnload | Whether UNLOAD is used when data is copied from Redshift. | Boolean | |
| 64 | +| hdfsDistcp | Whether DistCp is used when data is copied from HDFS. | Boolean | |
| 65 | +| effectiveIntegrationRuntime | The integration runtime (IR) or runtimes used to power the activity run, in the format `<IR name> (<region if it's Azure IR>)`. | Text (string) | |
| 66 | +| usedDataIntegrationUnits | The effective Data Integration Units during copy. | Int32 value | |
| 67 | +| usedParallelCopies | The effective parallelCopies during copy. | Int32 value | |
| 68 | +| redirectRowPath | Path to the log of skipped incompatible rows in the blob storage you configure in the `redirectIncompatibleRowSettings` property. See [Fault tolerance](copy-activity-overview.md#fault-tolerance). | Text (string) | |
| 69 | +| executionDetails | More details on the stages the Copy activity goes through and the corresponding steps, durations, configurations, and so on. We don't recommend that you parse this section because it might change. To learn how it helps you understand and troubleshoot copy performance, refer to the [Monitor visually](#monitor-visually) section. | Array | |
| 70 | +| perfRecommendation | Copy performance tuning tips. See [Performance tuning tips](copy-activity-performance-troubleshooting.md#performance-tuning-tips) for details. | Array | |
| 71 | + |
| 72 | +**Example:** |
| 73 | + |
| 74 | +```json |
| 75 | +"output": { |
| 76 | + "dataRead": 1180089300500, |
| 77 | + "dataWritten": 1180089300500, |
| 78 | + "filesRead": 110, |
| 79 | + "filesWritten": 110, |
| 80 | + "sourcePeakConnections": 640, |
| 81 | + "sinkPeakConnections": 1024, |
| 82 | + "copyDuration": 388, |
| 83 | + "throughput": 2970183, |
| 84 | + "errors": [], |
| 85 | + "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US)", |
| 86 | + "usedDataIntegrationUnits": 128, |
| 87 | + "billingReference": "{\"activityType\":\"DataMovement\",\"billableDuration\":[{\"Managed\":11.733333333333336}]}", |
| 88 | + "usedParallelCopies": 64, |
| 89 | + "executionDetails": [ |
| 90 | + { |
| 91 | + "source": { |
| 92 | + "type": "AmazonS3" |
| 93 | + }, |
| 94 | + "sink": { |
| 95 | + "type": "AzureBlobFS", |
| 96 | + "region": "East US", |
| 97 | + "throttlingErrors": 6 |
| 98 | + }, |
| 99 | + "status": "Succeeded", |
| 100 | + "start": "2020-03-04T02:13:25.1454206Z", |
| 101 | + "duration": 388, |
| 102 | + "usedDataIntegrationUnits": 128, |
| 103 | + "usedParallelCopies": 64, |
| 104 | + "profile": { |
| 105 | + "queue": { |
| 106 | + "status": "Completed", |
| 107 | + "duration": 2 |
| 108 | + }, |
| 109 | + "transfer": { |
| 110 | + "status": "Completed", |
| 111 | + "duration": 386, |
| 112 | + "details": { |
| 113 | + "listingSource": { |
| 114 | + "type": "AmazonS3", |
| 115 | + "workingDuration": 0 |
| 116 | + }, |
| 117 | + "readingFromSource": { |
| 118 | + "type": "AmazonS3", |
| 119 | + "workingDuration": 301 |
| 120 | + }, |
| 121 | + "writingToSink": { |
| 122 | + "type": "AzureBlobFS", |
| 123 | + "workingDuration": 335 |
| 124 | + } |
| 125 | + } |
| 126 | + } |
| 127 | + }, |
| 128 | + "detailedDurations": { |
| 129 | + "queuingDuration": 2, |
| 130 | + "transferDuration": 386 |
| 131 | + } |
| 132 | + } |
| 133 | + ], |
| 134 | + "perfRecommendation": [ |
| 135 | + { |
| 136 | + "Tip": "6 write operations were throttled by the sink data store. To achieve better performance, you are suggested to check and increase the allowed request rate for Azure Data Lake Storage Gen2, or reduce the number of concurrent copy runs and other data access, or reduce the DIU or parallel copy.", |
| 137 | + "ReferUrl": "https://go.microsoft.com/fwlink/?linkid=2102534 ", |
| 138 | + "RuleName": "ReduceThrottlingErrorPerfRecommendationRule" |
| 139 | + } |
| 140 | + ], |
| 141 | + "durationInQueue": { |
| 142 | + "integrationRuntimeQueue": 0 |
| 143 | + } |
| 144 | +} |
| 145 | +``` |
| 146 | + |
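
As a minimal sketch of consuming this output, the following Python snippet reads a few top-level properties and recomputes the throughput from `dataRead` and `copyDuration`. The `summarize_copy_output` helper and its key names are hypothetical. Treating 1 KB as 1,024 bytes is an assumption inferred from the example numbers rather than something this article states. Properties are read with `dict.get` because only the properties applicable to a given copy scenario are returned.

```python
import json

# A trimmed copy of the example output above, wrapped in braces so that it
# parses as a standalone JSON document.
raw = """
{
  "output": {
    "dataRead": 1180089300500,
    "dataWritten": 1180089300500,
    "filesRead": 110,
    "filesWritten": 110,
    "copyDuration": 388,
    "throughput": 2970183
  }
}
"""


def summarize_copy_output(output):
    """Collect a few metrics, tolerating properties that were not returned."""
    summary = {
        "bytes_read": output.get("dataRead"),
        "bytes_written": output.get("dataWritten"),
        "rows_copied": output.get("rowsCopied"),  # absent for this file copy
        "duration_s": output.get("copyDuration"),
        "throughput_kbps": output.get("throughput"),
    }
    # Recompute throughput from dataRead and copyDuration, assuming
    # 1 KB = 1024 bytes (an assumption that matches the example numbers).
    if summary["bytes_read"] is not None and summary["duration_s"]:
        summary["recomputed_kbps"] = round(
            summary["bytes_read"] / summary["duration_s"] / 1024
        )
    return summary


summary = summarize_copy_output(json.loads(raw)["output"])
print(summary)
```

For the example values, the recomputed figure rounds to 2970183, the same as the reported `throughput`, which suggests that the reported value is `dataRead` divided by `copyDuration`, expressed in units of 1,024 bytes per second.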
| 147 | +## Next steps |
| 148 | +See the other Copy Activity articles: |
| 149 | + |
| 150 | +- [Copy activity overview](copy-activity-overview.md) |
| 151 | + |
| 152 | +- [Copy activity performance](copy-activity-performance.md) |