Commit 9f16b79

edit pass: troubleshoot-manifest-ingestion
1 parent 82f2a25 commit 9f16b79

1 file changed

+57
-59
lines changed

articles/energy-data-services/troubleshoot-manifest-ingestion.md

Lines changed: 57 additions & 59 deletions
Original file line number | Diff line number | Diff line change
@@ -8,7 +8,7 @@ ms.topic: troubleshooting-general
88
ms.date: 02/06/2023
99
---
1010

11-
# Troubleshoot manifest ingestion problems by Airflow task logs
11+
# Troubleshoot manifest ingestion problems by using Airflow task logs
1212

1313
This article helps you troubleshoot workflow problems with manifest ingestion in Azure Data Manager for Energy Preview by using Airflow task logs.
1414

@@ -22,60 +22,60 @@ One single manifest file is used to trigger the manifest ingestion workflow.
2222

2323
|DagTaskName value |Description |
2424
|---------|---------|
25-
|`Update_status_running_task` | Calls Workflow service and marks the status of DAG as running in the database. |
26-
|`Check_payload_type` | Validates whether the ingestion is of batch type or single manifest.|
27-
|`Validate_manifest_schema_task` | Ensures all the schema kinds mentioned in the manifest are present and there's referential schema integrity. All invalid values will be evicted from the manifest. |
28-
|`Provide_manifest_intergrity_task` | Validates references inside the OSDU™ R3 manifest and removes invalid entities. This operator is responsible for parent-child validation. All orphan-like entities will be logged and excluded from the validated manifest. Any external referenced records will be searched and in case not found, the manifest entity will be dropped. All surrogate key references are also resolved. |
29-
|`Process_single_manifest_file_task` | Performs ingestion of the final obtained manifest entities from the previous step, data records will be ingested via the storage service. |
30-
|`Update_status_finished_task` | Calls workflow service and marks the status of DAG as `finished` or `failed` in the database. |
25+
|`update_status_running_task` | Calls the workflow service and marks the status of the DAG as `running` in the database. |
26+
|`check_payload_type` | Validates whether the type of ingestion is batch or single manifest.|
27+
|`validate_manifest_schema_task` | Ensures that all the schema types mentioned in the manifest are present and there's referential schema integrity. All invalid values are evicted from the manifest. |
28+
|`provide_manifest_intergrity_task` | Validates references inside the OSDU™ R3 manifest and removes invalid entities. This operator is responsible for parent/child validation. All orphan-like entities are logged and excluded from the validated manifest. Any external referenced records are searched. If none are found, the manifest entity is dropped. All surrogate key references are also resolved. |
29+
|`process_single_manifest_file_task` | Performs ingestion of the final manifest entities obtained from the previous step. Data records are ingested via the storage service. |
30+
|`update_status_finished_task` | Calls the workflow service and marks the status of the DAG as `finished` or `failed` in the database. |
3131

3232
### Batch upload
3333

34-
Multiple manifest files are part of the same workflow service request, that is, the manifest section in the request payload is a list instead of a dictionary of items.
34+
Multiple manifest files are part of the same workflow service request. The manifest section in the request payload is a list instead of a dictionary of items.
3535

3636
|DagTaskName value |Description |
3737
|---------|---------|
38-
|`Update_status_running_task` | Calls Workflow service and marks the status of DAG as running in the database. |
39-
|`Check_payload_type` | Validates whether the ingestion is of batch type or single manifest.|
40-
|`Batch_upload` | List of manifests are divided into three batches to be processed in parallel (no task logs are emitted). |
41-
|`Process_manifest_task_(1 / 2 / 3)` | List of manifests is divided into groups of three and processed by these tasks. All the steps performed in Validate_manifest_schema_task, Provide_manifest_intergrity_task, Process_single_manifest_file_task are condensed and performed sequentially in these tasks. |
42-
|`Update_status_finished_task` | Calls workflow service and marks the status of DAG as `finished` or `failed` in the database. |
38+
|`update_status_running_task` | Calls the workflow service and marks the status of the DAG as `running` in the database. |
39+
|`check_payload_type` | Validates whether the type of ingestion is batch or single manifest.|
40+
|`batch_upload` | Divides the list of manifests into three batches to be processed in parallel. (No task logs are emitted.) |
41+
|`process_manifest_task_(1 / 2 / 3)` | Divides the list of manifests into groups of three and processes them. All the steps performed in `validate_manifest_schema_task`, `provide_manifest_intergrity_task`, and `process_single_manifest_file_task` are condensed and performed sequentially in these tasks. |
42+
|`update_status_finished_task` | Calls the workflow service and marks the status of the DAG as `finished` or `failed` in the database. |
4343

44-
Based on the payload type (single or batch), `check_payload_type` task will pick the appropriate branch and the tasks in the other branch will be skipped.
44+
Based on the payload type (single or batch), the `check_payload_type` task chooses the appropriate branch and skips the tasks in the other branch.
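As a rough illustration of the branching described above, the following sketch classifies a payload as single or batch depending on whether its manifest section is a dictionary or a list. The function name `choose_branch` and the returned labels are hypothetical and are not taken from the actual DAG code.

```python
from typing import Any


def choose_branch(manifest: Any) -> str:
    """Return a hypothetical branch label for the manifest section of a request."""
    # Batch upload: the manifest section is a list of manifests.
    if isinstance(manifest, list):
        return "batch"
    # Single manifest: the manifest section is a dictionary of items.
    if isinstance(manifest, dict):
        return "single"
    # Anything else is treated as an invalid payload in this sketch.
    raise TypeError(f"Unsupported manifest section type: {type(manifest).__name__}")
```

In this sketch, a list always selects the batch branch, even when it holds only one manifest. Whether the real task behaves the same way is an assumption.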
4545

4646
## Prerequisites
4747

48-
You should have integrated airflow task logs with Azure monitor. See [Integrate airflow logs with Azure Monitor](how-to-integrate-airflow-logs-with-azure-monitor.md)
48+
You should have integrated Airflow task logs with Azure Monitor. See [Integrate Airflow logs with Azure Monitor](how-to-integrate-airflow-logs-with-azure-monitor.md).
4949

50-
Following columns are exposed in Airflow Task Logs for you to debug the issue:
50+
The following columns are exposed in Airflow task logs for you to debug the problem:
5151

5252
|Parameter name |Description |
5353
|---------|---------|
54-
|`Run Id` | Unique run ID of the DAG run, which was triggered |
55-
|`Correlation ID` | Unique correlation ID of the DAG run (same as run ID) |
56-
|`DagName` | DAG workflow name. For instance, `Osdu_ingest` for manifest ingestion. |
57-
|`DagTaskName` | DAG workflow task name. For instance, `Update_status_running_task` for manifest ingestion. |
58-
|`Content` | Contains error log messages (errors/exceptions) emitted by Airflow during the task execution.|
59-
|`LogTimeStamp` | Captures the time interval of DAG runs. |
60-
|`LogLevel` | DEBUG/INFO/WARNING/ERROR. Mostly all exception and error messages can be seen by filtering at ERROR level. |
54+
|`RunID` | Unique run ID of the triggered DAG run. |
55+
|`CorrelationID` | Unique correlation ID of the DAG run (same as run ID). |
56+
|`DagName` | DAG workflow name. For instance, `Osdu_ingest` is the workflow name for manifest ingestion. |
57+
|`DagTaskName` | Task name for the DAG workflow. For instance, `update_status_running_task` is the task name for manifest ingestion. |
58+
|`Content` | Error log messages (errors or exceptions) that Airflow emits during the task execution.|
59+
|`LogTimeStamp` | Time interval of DAG runs. |
60+
|`LogLevel` | Level of the error. Values are `DEBUG`, `INFO`, `WARNING`, and `ERROR`. You can see most exception and error messages by filtering at the `ERROR` level. |
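The columns listed above can also be used to narrow down exported log rows outside Azure Monitor. The following is a minimal sketch, assuming the rows are already available as Python dictionaries keyed by the column names in the table. The function name `filter_task_logs` and the export step itself are hypothetical.

```python
from typing import Dict, Iterable, List


def filter_task_logs(
    rows: Iterable[Dict[str, str]],
    run_id: str,
    task_names: Iterable[str],
    level: str = "ERROR",
) -> List[Dict[str, str]]:
    """Keep rows for one DAG run, a set of task names, and one log level."""
    wanted_tasks = set(task_names)
    return [
        row
        for row in rows
        # Substring match on the run ID, then exact match on task and level.
        if run_id in row.get("RunID", "")
        and row.get("DagTaskName") in wanted_tasks
        and row.get("LogLevel") == level
    ]
```

Filtering at the `ERROR` level first and widening to `DEBUG` only when nothing turns up keeps the result set small.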
6161

62-
## A DAG run has failed in Update_status_running_task or Update_status_finished_task
62+
## Failed DAG run
6363

64-
The workflow run has failed and the data records weren't ingested.
64+
The workflow run has failed in `Update_status_running_task` or `Update_status_finished_task`, and the data records weren't ingested.
6565

6666
### Possible reasons
6767

68-
* Provided incorrect data partition ID.
69-
* Provided incorrect key name in the execution context of the request body.
70-
* Workflow service isn't running or throwing 5xx errors.
68+
* The data partition ID is incorrect.
69+
* A key name in the execution context of the request body is incorrect.
70+
* The workflow service isn't running or is throwing 5xx errors.
7171

7272
### Workflow status
7373

74-
Workflow status is marked as `failed`.
74+
The workflow status is marked as `failed`.
7575

7676
### Solution
7777

78-
Check the airflow task logs for `update_status_running_task` or `update_status_finished_task`. Fix the payload (pass the correct data partition ID or key name)
78+
Check the Airflow task logs for `update_status_running_task` or `update_status_finished_task`. Fix the payload by passing the correct data partition ID or key name.
7979

8080
Sample Kusto query:
8181

@@ -99,24 +99,24 @@ Sample trace output:
9999
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://contoso.energy.azure.com/api/workflow/v1/workflow/Osdu_ingest/workflowRun/e9a815f2-84f5-4513-9825-4d37ab291264
100100
```
101101

102-
## Schema validation failures
102+
## Failed schema validation
103103

104-
Records weren't ingested due to schema validation failures.
104+
Records weren't ingested because schema validation failed.
105105

106106
### Possible reasons
107107

108-
* Schema not found errors.
109-
* Manifest body not conforming to the schema kind.
110-
* Incorrect schema references.
111-
* Schema service throwing 5xx errors.
108+
* The schema service is throwing "Schema not found" errors.
109+
* The manifest body doesn't conform to the schema type.
110+
* The schema references are incorrect.
111+
* The schema service is throwing 5xx errors.
112112

113113
### Workflow status
114114

115-
Workflow status is marked as `finished`. No failure in the workflow status will be observed because the invalid entities are skipped and the ingestion is continued.
115+
The workflow status is marked as `finished`. You don't observe a failure in the workflow status because the invalid entities are skipped and the ingestion is continued.
116116

117117
### Solution
118118

119-
Check the airflow task logs for `validate_manifest_schema_task` or `process_manifest_task`. Fix the payload (pass the correct data partition ID or key name).
119+
Check the Airflow task logs for `validate_manifest_schema_task` or `process_manifest_task`. Fix the payload by passing the correct data partition ID or key name.
120120

121121
Sample Kusto query:
122122

@@ -156,21 +156,21 @@ Sample trace output:
156156

157157
## Failed reference checks
158158

159-
Records weren't ingested due to failed reference checks.
159+
Records weren't ingested because reference checks failed.
160160

161161
### Possible reasons
162162

163-
* Failed to find referenced records.
164-
* Parent records not found.
165-
* Search service throwing 5xx errors.
163+
* Referenced records weren't found.
164+
* Parent records weren't found.
165+
* The search service is throwing 5xx errors.
166166

167167
### Workflow status
168168

169-
Workflow status is marked as `finished`. No failure in the workflow status will be observed because the invalid entities are skipped and the ingestion is continued.
169+
The workflow status is marked as `finished`. You don't observe a failure in the workflow status because the invalid entities are skipped and the ingestion is continued.
170170

171171
### Solution
172172

173-
Check the airflow task logs for `provide_manifest_integrity_task` or `process_manifest_task`.
173+
Check the Airflow task logs for `provide_manifest_integrity_task` or `process_manifest_task`.
174174

175175
Sample Kusto query:
176176

@@ -182,41 +182,39 @@ Sample Kusto query:
182182
| where RunID has "<run_id>"
183183
```
184184

185-
Sample trace output:
186-
187-
Because there are no such error logs specifically for referential integrity tasks, you should watch out for the debug log statements to see whether all external records were fetched using the search service.
185+
Because there are no error logs specifically for referential integrity tasks, check the debug log statements to see whether all external records were fetched via the search service.
188186

189-
For instance, the output shows record queried using the Search service for referential integrity.
187+
For instance, the following sample trace output shows a record queried via the search service for referential integrity:
190188

191189
```md
192190
[2023-02-05, 19:14:40 IST] {search_record_ids.py:75} DEBUG - Search query "contoso-dp1:work-product-component--WellLog:5ab388ae0e140838c297f0e6559" OR "contoso-dp1:work-product-component--WellLog:5ab388ae0e1b40838c297f0e6559" OR "contoso-dp1:work-product-component--WellLog:5ab388ae0e1b40838c297f0e6559758a"
193191
```
194192

195-
The records that were retrieved and were in the system are shown in the output. The related manifest object that referenced a record would be dropped and no longer be ingested if we noticed that some of the records weren't present.
193+
The output shows the records that were retrieved and were in the system. The related manifest object that referenced a record would be dropped and no longer be ingested if you noticed that some of the records weren't present.
196194

197195
```md
198196
[2023-02-05, 19:14:40 IST] {search_record_ids.py:141} DEBUG - response ids: ['contoso-dp1:work-product-component--WellLog:5ab388ae0e1b40838c297f0e6559758a:1675590506723615', 'contoso-dp1:work-product-component--WellLog:5ab388ae0e1b40838c297f0e6559758a ']
199197
```
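Comparing the two debug lines by hand is error prone, so a small script can help. The following sketch assumes the log lines follow the `Search query ...` and `response ids: [...]` shapes shown in the samples above. It also assumes that a queried ID counts as found when it is returned either as is or with a `:<version>` suffix. The function names are hypothetical.

```python
import ast
import re
from typing import Set


def queried_ids(search_line: str) -> Set[str]:
    """Extract the double-quoted record IDs that follow 'Search query'."""
    _, _, query = search_line.partition("Search query")
    return set(re.findall(r'"([^"]+)"', query))


def returned_ids(response_line: str) -> Set[str]:
    """Parse the Python-style list that follows 'response ids:'."""
    _, _, raw = response_line.partition("response ids:")
    # Strip stray whitespace around each ID, as seen in the sample output.
    return {item.strip() for item in ast.literal_eval(raw.strip())}


def missing_ids(search_line: str, response_line: str) -> Set[str]:
    """Return queried IDs that appear in the response neither bare nor versioned."""
    found = returned_ids(response_line)
    return {
        q
        for q in queried_ids(search_line)
        if not any(r == q or r.startswith(q + ":") for r in found)
    }
```

Any ID that `missing_ids` returns points to a manifest entity that is likely to be dropped during the reference check.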
200198

201199
In the coming release, we plan to enhance the logs by appropriately logging skipped records with reasons.
202200

203-
## Invalid legal tags/ACLs in manifest
201+
## Invalid legal tags or ACLs in the manifest
204202

205-
Records weren't ingested due to invalid legal tags or ACLs present in the manifest.
203+
Records weren't ingested because the manifest contains invalid legal tags or access control lists (ACLs).
206204

207205
### Possible reasons
208206

209-
* Incorrect ACLs.
210-
* Incorrect legal tags.
211-
* Storage service throws 5xx errors.
207+
* ACLs are incorrect.
208+
* Legal tags are incorrect.
209+
* The storage service is throwing 5xx errors.
212210

213211
### Workflow status
214212

215-
Workflow status is marked as `finished`. No failure in the workflow status will be observed.
213+
The workflow status is marked as `finished`. You don't observe a failure in the workflow status.
216214

217215
### Solution
218216

219-
Check the airflow task logs for `process_single_manifest_file_task` or `process_manifest_task`.
217+
Check the Airflow task logs for `process_single_manifest_file_task` or `process_manifest_task`.
220218

221219
Sample Kusto query:
222220

@@ -237,7 +235,7 @@ Sample trace output:
237235

238236
```
239237

240-
The output indicates records that were retrieved. Manifest entity records corresponding to missing search records will get dropped and not ingested.
238+
The output indicates records that were retrieved. Manifest entity records that correspond to missing search records are dropped and not ingested.
241239

242240
```md
243241
"PUT /api/storage/v2/records HTTP/1.1" 400 None
@@ -247,12 +245,12 @@ The output indicates records that were retrieved. Manifest entity records corres
247245

248246
## Known issues
249247

250-
- Exception traces weren't exporting with Airflow Task Logs due to a known problem in the logs; the patch has been submitted and will be included in the February release.
248+
- Exception traces weren't exporting with Airflow task logs because of a known problem in the logs. The patch has been submitted and will be included in the February release.
251249
- Because there are no specific error logs for referential integrity tasks, you must manually search for the debug log statements to see whether all external records were retrieved via the search service. We intend to improve the logs in the upcoming release by properly logging skipped data with justifications.
252250

253251
## Next steps
254252

255-
Advance to the manifest ingestion tutorial and learn how to perform a manifest-based file ingestion:
253+
Advance to the following tutorial and learn how to perform a manifest-based file ingestion:
256254

257255
> [!div class="nextstepaction"]
258256
> [Tutorial: Sample steps to perform a manifest-based file ingestion](tutorial-manifest-ingestion.md)