Commit 9a985f6 (1 parent: 0b873e2)

Changed csv ingestion file

1 file changed: articles/energy-data-services/tutorial-csv-ingestion.md (+180 additions, −72 deletions)

Comma-separated values (CSV) parser ingestion provides the capability to ingest CSV files into an Azure Data Manager for Energy instance.

In this tutorial, you learn how to:

> * Ingest a sample wellbore data CSV file into an Azure Data Manager for Energy instance by using `curl`.
> * Search for storage metadata records created during CSV ingestion by using `curl`.

## Prerequisites

* An Azure subscription
* An instance of [Azure Data Manager for Energy](quickstart-create-microsoft-energy-data-services-instance.md) created in your Azure subscription
* The `curl` command-line tool installed on your machine
* A service principal access token for calling the APIs. See [How to generate auth token](how-to-generate-auth-token.md).

### Get details for the Azure Data Manager for Energy instance

For this tutorial, you need the following parameters:

| Parameter | Value to use | Example | Where to find this value |
|----|----|----|----|
| `DNS` | URI | `<instance>.energy.azure.com` | Find this value on the overview page of the Azure Data Manager for Energy instance. |
| `data-partition-id` | Data partitions | `<data-partition-id>` | Find this value on the Data Partitions page of the Azure Data Manager for Energy instance. |
| `access_token` | Access token value | `0.ATcA01-XWHdJ0ES-qDevC6r...........` | Follow [How to generate auth token](how-to-generate-auth-token.md) to create an access token and save it. |

Follow the [Manage users](how-to-manage-users.md) guide to add appropriate entitlements for the user who's running this tutorial.

### Set up your environment

Ensure that `curl` is installed on your machine. You use it to make all of the API calls in this tutorial.
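
The commands in the following sections all take the same instance parameters. As a convenience (a sketch, not part of the official tutorial), you can keep those parameters in shell variables. The values below are hypothetical placeholders; substitute your own instance details:

```bash
# Hypothetical placeholder values for illustration only -- substitute
# the details of your own Azure Data Manager for Energy instance.
DNS="contoso.energy.azure.com"        # from the instance overview page
DATA_PARTITION_ID="opendes"           # from the Data Partitions page
ACCESS_TOKEN="<access_token>"         # from how-to-generate-auth-token.md

# The variables can then be expanded in each request, for example:
# curl -H "Authorization: Bearer $ACCESS_TOKEN" \
#      -H "data-partition-id: $DATA_PARTITION_ID" \
#      "https://$DNS/api/..."
```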

## Ingest wellbore data by using `curl`

To ingest a sample wellbore data CSV file into the Azure Data Manager for Energy instance, complete the following steps.

### 1. Create a schema

Run the following `curl` command to create a schema that matches the columns in the CSV file:

```bash
curl -X POST "https://<DNS>/api/schema-service/v1/schema" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>" \
  -d '{
    "schemaInfo": {
      "schemaIdentity": {
        "authority": "<data-partition-id>",
        "source": "shapeFiletest",
        "entityType": "testEntity",
        "schemaVersionPatch": 1,
        "schemaVersionMinor": 0,
        "schemaVersionMajor": 0
      },
      "status": "DEVELOPMENT"
    },
    "schema": {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "Wellbore",
      "type": "object",
      "properties": {
        "UWI": {
          "type": "string",
          "description": "Unique Wellbore Identifier"
        }
      }
    }
  }'
```

Replace the placeholders (`<DNS>`, `<access_token>`, and so on) with the values that you gathered earlier. Save the `id` from the response for use in subsequent steps.
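
The response is JSON. One way to capture an `id` value in a shell variable is sketched below; the sample response is hypothetical, and the `sed` one-liner assumes the field appears as a simple `"id":"..."` pair, so prefer a proper JSON tool such as `jq` if you have one available:

```bash
# Sketch: pull the first "id" value out of a saved JSON response.
# The sample response below is hypothetical; the real response shape
# may differ.
RESPONSE='{"schemaIdentity":{"id":"opendes:shapeFiletest:testEntity:0.0.1"},"status":"DEVELOPMENT"}'
SCHEMA_ID=$(printf '%s' "$RESPONSE" | sed -n 's/.*"id":"\([^"]*\)".*/\1/p' | head -n 1)
echo "$SCHEMA_ID"
```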

### 2. Create a legal tag

Run the following `curl` command to create a legal tag, which is attached to the CSV data for compliance purposes:

```bash
curl -X POST "https://<DNS>/api/legal/v1/legaltags" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>" \
  -d '{
    "name": "LegalTagName",
    "description": "Legal Tag added for Well",
    "properties": {
      "contractId": "123456",
      "countryOfOrigin": ["US", "CA"],
      "dataType": "Third Party Data",
      "exportClassification": "EAR99",
      "originator": "Schlumberger",
      "personalData": "No Personal Data",
      "securityClassification": "Private",
      "expirationDate": "2025-12-25"
    }
  }'
```

### 3. Get a signed URL for uploading a CSV file

Run the following `curl` command to get a signed upload URL:

```bash
curl -X GET "https://<DNS>/api/file/v2/files/uploadURL" \
  -H "Authorization: Bearer <access_token>" \
  -H "data-partition-id: <data-partition-id>"
```

Save the `SignedURL` and `FileSource` values from the response for use in the next steps.
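
As with the earlier responses, you can lift both values into variables. The response below is a hypothetical stand-in for illustration, and the `sed` extraction assumes simple one-line string fields; adjust the field names if your instance returns a different shape:

```bash
# Sketch: extract SignedURL and FileSource from a saved response.
# The response literal is a made-up example, not real output.
RESPONSE='{"FileID":"abc123","Location":{"SignedURL":"https://storage.example/container/file?sig=xyz","FileSource":"/osdu-user/source-file"}}'
SIGNED_URL=$(printf '%s' "$RESPONSE" | sed -n 's/.*"SignedURL":"\([^"]*\)".*/\1/p')
FILE_SOURCE=$(printf '%s' "$RESPONSE" | sed -n 's/.*"FileSource":"\([^"]*\)".*/\1/p')
echo "$SIGNED_URL"
echo "$FILE_SOURCE"
```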

### 4. Upload a CSV file

Download the [Wellbore.csv](https://github.com/microsoft/meds-samples/blob/main/test-data/wellbore.csv) sample to your local machine. Then run the following `curl` command to upload the file to the signed URL:

```bash
curl -X PUT -T "Wellbore.csv" "<SignedURL>" -H "x-ms-blob-type: BlockBlob"
```

### 5. Upload CSV file metadata

Run the following `curl` command to upload metadata for the CSV file, such as its location:

```bash
curl -X POST "https://<DNS>/api/file/v2/files/metadata" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>" \
  -d '{
    "kind": "osdu:wks:dataset--File.Generic:1.0.0",
    "acl": {
      "viewers": ["data.default.viewers@<data-partition-id>.dataservices.energy"],
      "owners": ["data.default.owners@<data-partition-id>.dataservices.energy"]
    },
    "legal": {
      "legaltags": ["<data-partition-id>-LegalTagName"],
      "otherRelevantDataCountries": ["US"],
      "status": "compliant"
    },
    "data": {
      "DatasetProperties": {
        "FileSourceInfo": {
          "FileSource": "<FileSource>"
        }
      }
    }
  }'
```

Save the `id` value from the response. It's the ID of the uploaded file, which you use in the next step.

### 6. Trigger a CSV parser ingestion workflow

Run the following `curl` command to trigger the CSV parser ingestion workflow:

```bash
curl -X POST "https://<DNS>/api/workflow/v1/workflow/csv-parser/workflowRun" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>" \
  -d '{
    "executionContext": {
      "id": "<uploadedFileId>",
      "dataPartitionId": "<data-partition-id>"
    }
  }'
```

Save the `runId` from the response for use in the next step.

### 7. Check the workflow status

Run the following `curl` command to check the status of the workflow run:

```bash
curl -X GET "https://<DNS>/api/workflow/v1/workflow/csv-parser/workflowRun/<runId>" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>"
```

Repeat the call every few seconds until the response indicates that the run completed successfully.
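
The wait-and-retry cycle can be scripted. The sketch below is illustrative only: `get_run_status` is a hypothetical stand-in for the `workflowRun` GET call above (here it returns a canned value so the sketch is self-contained), and the status strings should be checked against what your instance actually returns:

```bash
# Sketch of a polling loop. get_run_status stands in for the curl GET
# call to /workflowRun/<runId>; replace its body with the real call
# plus a JSON extraction of the status field.
get_run_status() {
  echo "finished"
}

STATUS=$(get_run_status)
while [ "$STATUS" = "running" ] || [ "$STATUS" = "submitted" ]; do
  sleep 5                      # wait a few seconds between checks
  STATUS=$(get_run_status)
done
echo "Workflow run ended with status: $STATUS"
```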

### 8. Search for ingested CSV records

Run the following `curl` command to search for the ingested records:

```bash
curl -X POST "https://<DNS>/api/search/v2/query" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>" \
  -d '{
    "kind": "osdu:wks:dataset--File.Generic:1.0.0"
  }'
```

The ingested records appear in the search results.
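
The query above returns every record of that `kind`. To narrow the results, the OSDU search API also accepts fields such as `query` and `limit`. The snippet below writes such a request body to a file; the UWI value is a placeholder, and you should verify the field names against your instance's search API before relying on them:

```bash
# Sketch: a narrower search body. The data.UWI path and the limit
# field follow the OSDU search API; verify against your instance.
cat > query.json <<'EOF'
{
  "kind": "osdu:wks:dataset--File.Generic:1.0.0",
  "query": "data.UWI:\"<uwi-value>\"",
  "limit": 10
}
EOF
# Pass the file to the search call with: curl ... -d @query.json
cat query.json
```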

## Next step

Advance to the next tutorial:

> [Tutorial: Perform manifest-based file ingestion](tutorial-manifest-ingestion.md)