2020-02-26 06:22:56.3190846, Warning, FileSkip, "sample1.csv", "File is skipped after read 548000000 bytes: ErrorCode=DataConsistencySourceDataChanged,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Source file 'sample1.csv' is changed by other clients during the copy activity run.,Source=,'."
```
From the log file above, you can see that sample1.csv was skipped because it could not be verified to be consistent between the source and destination store. You can also see that sample1.csv became inconsistent because it was being changed by other applications while the ADF copy activity was copying it.
When you copy data from a source to a destination store, the Azure Data Factory copy activity provides a certain level of fault tolerance to prevent interruption from failures in the middle of data movement. For example, suppose you are copying millions of rows from a source to a destination store, where a primary key has been created in the destination database, but the source database does not have any primary keys. If you happen to copy duplicated rows to the destination, you will hit a PK violation failure on the destination database. At this moment, the copy activity offers you two ways to handle such errors:
- You can abort the copy activity once any failure is encountered.
- You can continue to copy the rest by enabling fault tolerance to skip the incompatible data, for example, the duplicated row in this case. In addition, you can log the skipped data by enabling the session log within the copy activity.
## Copying binary files
ADF supports fault tolerance when copying binary files. You can choose to abort the copy activity or continue to copy the rest in the following scenarios:
1. The files to be copied by ADF are being deleted by other applications at the same time.
2. ADF is not allowed to access some particular folders or files, because the ACLs of those files or folders require a higher permission level than that of the connection configured in ADF.
3. One or more files are not verified to be consistent between the source and destination store when you enable the data consistency verification setting in ADF.
### Configuration
When you copy binary files between storage stores, you can enable fault tolerance as follows:
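For illustration, a minimal copy activity JSON sketch with these settings enabled might look like the following. The activity name, the store settings types, and the linked service name `BlobSessionLogLinkedService` are illustrative assumptions, not values from this article:

```json
{
    "name": "CopyWithFaultTolerance",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "BinarySource",
            "storeSettings": {
                "type": "AzureBlobStorageReadSettings",
                "recursive": true
            }
        },
        "sink": {
            "type": "BinarySink",
            "storeSettings": {
                "type": "AzureBlobFSWriteSettings"
            }
        },
        "validateDataConsistency": true,
        "skipErrorFile": {
            "fileMissing": true,
            "fileForbidden": true,
            "dataInconsistency": true
        },
        "logStorageSettings": {
            "linkedServiceName": {
                "referenceName": "BlobSessionLogLinkedService",
                "type": "LinkedServiceReference"
            },
            "path": "sessionlog/"
        }
    }
}
```

With this configuration, files that are deleted mid-copy, files that ADF is not permitted to access, and files that fail the consistency check are skipped rather than failing the run, and their names are written to the session log under the `sessionlog/` path.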
skipErrorFile | A group of properties to specify the types of failures you want to skip during the data movement. | | No
fileMissing | One of the key-value pairs within the skipErrorFile property bag that determines whether you want to skip files that are being deleted by other applications while ADF is copying them. <br/> - True: you want to copy the rest by skipping the files being deleted by other applications. <br/> - False: you want to abort the copy activity once any files are deleted from the source store in the middle of data movement. <br/>Be aware this property is set to true by default. | True (default) <br/>False | No
fileForbidden | One of the key-value pairs within the skipErrorFile property bag that determines whether you want to skip the particular files that you do not have permission to access. <br/> - True: you want to copy the rest by skipping the files that you have permission issues accessing. <br/> - False: you want to abort the copy activity once you meet a permission issue when accessing files. | True <br/>False (default) | No
dataInconsistency | One of the key-value pairs within the skipErrorFile property bag that determines whether you want to skip the inconsistent data between the source and destination store. <br/> - True: you want to copy the rest by skipping the inconsistent data. <br/> - False: you want to abort the copy activity once inconsistent data is found. <br/>Be aware this property is only valid when you set validateDataConsistency to True. | True <br/>False (default) | No
logStorageSettings | A group of properties that can be specified when you want to log the skipped object names. | | No
linkedServiceName | The linked service of [Azure Blob Storage](connector-azure-blob-storage.md#linked-service-properties) or [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md#linked-service-properties) to store the session log file. | The name of an `AzureBlobStorage` or `AzureBlobFS` type linked service, which refers to the instance that you want to use to store the log file. | No
path | The path of the log file. | Specify the path that you want to use to store the log file. | No
### Monitoring
#### Output from copy activity
You can get the number of files being read, written, and skipped via the output of each copy activity run.
```json
"output": {
    "filesRead": 3,
    "filesWritten": 2,
    "filesSkipped": 1,
    ...
}
```
## Copying tabular data (Legacy)
The following is the legacy way to enable fault tolerance, for copying non-binary data only. If you are creating a new pipeline or activity, you are encouraged to start from [here](#copying-tabular-data) instead.
### Configuration
The following example provides a JSON definition to configure skipping the incompatible rows in Copy Activity:
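A sketch of such a definition is shown below, assuming Azure SQL source and sink datasets and an Azure Storage linked service named `AzureStorageLinkedService` for redirecting the incompatible rows; these names and the redirect path are illustrative:

```json
"typeProperties": {
    "source": {
        "type": "AzureSqlSource"
    },
    "sink": {
        "type": "AzureSqlSink"
    },
    "enableSkipIncompatibleRow": true,
    "redirectIncompatibleRowSettings": {
        "linkedServiceName": {
            "referenceName": "AzureStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "path": "redirectcontainer/erroroutput"
    }
}
```

Here `enableSkipIncompatibleRow` turns on the legacy fault tolerance, and `redirectIncompatibleRowSettings` writes the skipped rows to the specified storage path instead of silently dropping them.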