Commit f62fd4d

Merge pull request #298925 from Prakash496/csvdocfix
Removed postman references in csv ingestion file
2 parents a8da392 + 4db2148 commit f62fd4d

articles/energy-data-services/tutorial-csv-ingestion.md

Lines changed: 249 additions & 71 deletions
@@ -18,87 +18,265 @@ Comma-separated values (CSV) parser ingestion provides the capability to ingest

In this tutorial, you learn how to:

-> [!div class="checklist"]
->
-> * Ingest a sample wellbore data CSV file into an Azure Data Manager for Energy instance by using Postman.
-> * Search for storage metadata records created during CSV ingestion by using Postman.
+> * Ingest a sample wellbore data CSV file into an Azure Data Manager for Energy instance by using `cURL`.
+> * Search for storage metadata records created during CSV ingestion by using `cURL`.

## Prerequisites
-
-Before you start this tutorial, complete the following prerequisites.
+* An Azure subscription
+* An instance of [Azure Data Manager for Energy](quickstart-create-microsoft-energy-data-services-instance.md) created in your Azure subscription
+* cURL command-line tool installed on your machine
+* A service principal access token for calling the Azure Data Manager for Energy APIs, as shown in the sketch after this list. See [How to generate auth token](how-to-generate-auth-token.md).
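A minimal sketch of that token request, assuming the client-credentials flow from [How to generate auth token](how-to-generate-auth-token.md); `CLIENT_ID`, `CLIENT_SECRET`, and `TENANT_ID` come from your app registration and are placeholders here:

```bash
# Request an access token from Microsoft Entra ID (client-credentials flow).
curl -X POST "https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  -d "client_id=<CLIENT_ID>" \
  -d "client_secret=<CLIENT_SECRET>" \
  -d "scope=<CLIENT_ID>/.default"
```

The `access_token` field of the JSON response is the `<access_token>` value used throughout this tutorial.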

### Get details for the Azure Data Manager for Energy instance

-* You need an Azure Data Manager for Energy instance. If you don't already have one, create one by following the steps in [Quickstart: Create an Azure Data Manager for Energy instance](quickstart-create-microsoft-energy-data-services-instance.md).
* For this tutorial, you need the following parameters:

-| Parameter | Value to use | Example | Where to find this value |
-| ------------------ | ------------------------ |-------------------------------------- |-------------------------------------- |
-| `CLIENT_ID` | Application (client) ID | `00001111-aaaa-2222-bbbb-3333cccc4444` | You use this app or client ID when registering the application with the Microsoft identity platform. See [Register an application](../active-directory/develop/quickstart-register-app.md#register-an-application). |
-| `CLIENT_SECRET` | Client secrets | `_fl******************` | Sometimes called an *application password*, a client secret is a string value that your app can use in place of a certificate to identify itself. See [Add a client secret](../active-directory/develop/quickstart-register-app.md#add-a-client-secret). |
-| `TENANT_ID` | Directory (tenant) ID | `72f988bf-86f1-41af-91ab-xxxxxxxxxxxx` | Hover over your account name in the Azure portal to get the directory or tenant ID. Alternately, search for and select **Microsoft Entra ID** > **Properties** > **Tenant ID** in the Azure portal. |
-| `SCOPE` | Application (client) ID | `00001111-aaaa-2222-bbbb-3333cccc4444` | This value is the same as the app or client ID mentioned earlier. |
-| `refresh_token` | Refresh token value | `0.ATcA01-XWHdJ0ES-qDevC6r...........` | Follow [How to generate auth token](how-to-generate-auth-token.md) to create a refresh token and save it. You need this refresh token later to generate a user token. |
-| `DNS` | URI | `<instance>.energy.Azure.com` | Find this value on the overview page of the Azure Data Manager for Energy instance. |
-| `data-partition-id` | Data partitions | `<data-partition-id>` | Find this value on the Data Partitions page of the Azure Data Manager for Energy instance. |
+| Parameter | Value to use | Example | Where to find this value |
+|----|----|----|----|
+| `DNS` | URI | `<instance>.energy.azure.com` | Find this value on the overview page of the Azure Data Manager for Energy instance. |
+| `data-partition-id` | Data partitions | `<data-partition-id>` | Find this value in the Data Partitions section of the Azure Data Manager for Energy instance. |
+| `access_token` | Access token value | `0.ATcA01-XWHdJ0ES-qDevC6r...........` | Follow [How to generate auth token](how-to-generate-auth-token.md) to create an access token and save it. |

Follow the [Manage users](how-to-manage-users.md) guide to add appropriate entitlements for the user who's running this tutorial.

-### Set up Postman and execute requests
-
-1. Download and install the [Postman](https://www.postman.com/) desktop app.
-
-1. Import the following files into Postman:
-
-   * [CSV workflow Postman collection](https://raw.githubusercontent.com/microsoft/meds-samples/main/postman/IngestionWorkflows.postman_collection.json)
-   * [CSV workflow Postman environment](https://raw.githubusercontent.com/microsoft/meds-samples/main/postman/IngestionWorkflowEnvironment.postman_environment.json)
-
-   To import the Postman collection and environment variables, follow the steps in [Importing data into Postman](https://learning.postman.com/docs/getting-started/importing-and-exporting-data/#importing-data-into-postman).
-
-1. Update **CURRENT VALUE** for the Postman environment with the information that you obtained in the details of the Azure Data Manager for Energy instance.
-
-1. The Postman collection for CSV parser ingestion contains 10 requests that you must execute sequentially.
-
-   Be sure to choose **Ingestion Workflow Environment** before you trigger the Postman collection.
-
-   :::image type="content" source="media/tutorial-csv-ingestion/tutorial-postman-choose-environment.png" alt-text="Screenshot of the Postman environment." lightbox="media/tutorial-csv-ingestion/tutorial-postman-choose-environment.png":::
-
-1. Trigger each request by selecting the **Send** button.
-
-   On every request, Postman validates the actual API response code against the expected response code. If there's any mismatch, the test section indicates failures.
-
-   Here's an example of a successful Postman request:
-
-   :::image type="content" source="media/tutorial-csv-ingestion/tutorial-postman-test-success.png" alt-text="Screenshot of a successful Postman call." lightbox="media/tutorial-csv-ingestion/tutorial-postman-test-success.png":::
-
-   Here's an example of a failed Postman request:
-
-   :::image type="content" source="media/tutorial-csv-ingestion/tutorial-postman-test-failure.png" alt-text="Screenshot of a failed Postman call." lightbox="media/tutorial-csv-ingestion/tutorial-postman-test-failure.png":::
-
-## Ingest wellbore data by using Postman
-
-To ingest a sample wellbore data CSV file into the Azure Data Manager for Energy instance by using the Postman collection, complete the following steps:
-
-1. **Get a User Access Token**: Generate the user token, which will be used to authenticate further API calls.
-1. **Create a Schema**: Generate a schema that adheres to the columns present in the CSV file.
-1. **Get Schema details**: Get the schema created in the previous step and validate it.
-1. **Create a Legal Tag**: Create a legal tag that will be added to the CSV data for data compliance purposes.
-1. **Get a signed URL for uploading a CSV file**: Get the signed URL path to which the CSV file will be uploaded.
-1. **Upload a CSV file**: Download the [Wellbore.csv](https://github.com/microsoft/meds-samples/blob/main/test-data/wellbore.csv) sample to your local machine, and then select this file in Postman by clicking the **Select File** button.
-
-   :::image type="content" source="media/tutorial-csv-ingestion/tutorial-select-csv-file.png" alt-text="Screenshot of uploading a CSV file." lightbox="media/tutorial-csv-ingestion/tutorial-select-csv-file.png":::
-1. **Upload CSV file metadata**: Upload the file metadata information, such as file location and other relevant fields.
-1. **Create a CSV Parser Ingestion Workflow**: Create the directed acyclic graph (DAG) for the CSV parser ingestion workflow.
-1. **Trigger a CSV Parser Ingestion Workflow**: Trigger the DAG for the CSV parser ingestion workflow.
-1. **Search for ingested CSV Parser Ingestion Workflow status**: Get the status of the CSV parser's DAG run.
-
-## Search for ingested wellbore data by using Postman
-
-To search for the storage metadata records created during the CSV ingestion by using the Postman collection, complete the following step:
-
-* **Search for ingested CSV records**: Search for the CSV records created earlier.
-
-   :::image type="content" source="media/tutorial-csv-ingestion/tutorial-search-success.png" alt-text="Screenshot of searching ingested CSV records." lightbox="media/tutorial-csv-ingestion/tutorial-search-success.png":::
+### Set up your environment
+
+Ensure that `cURL` is installed on your system. You use it to make all of the API calls in this tutorial.
+
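Optionally, export the shared values once so the commands that follow are easier to paste; this is a convenience sketch, and the variable names are this tutorial's own rather than anything the service defines:

```bash
curl --version   # confirm cURL is available

# Values from the parameters table above.
export DNS="<instance>.energy.azure.com"
export ACCESS_TOKEN="<access_token>"
export DATA_PARTITION="<data-partition-id>"
```

The steps below show literal placeholders; substitute `$DNS`, `$ACCESS_TOKEN`, and `$DATA_PARTITION` if you export them.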
+## Ingest wellbore data by using `cURL`
+
+To ingest a sample wellbore data CSV file into the Azure Data Manager for Energy instance, complete the following steps. Replace the placeholders (`<DNS>`, `<access_token>`, and so on) with the appropriate values.
+
+### 1. Create a Schema
+
+Run the following `cURL` command to create a schema:
+
+```bash
+curl -X POST "https://<DNS>/api/schema-service/v1/schema" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "Content-Type: application/json" \
+  -H "data-partition-id: <data-partition-id>" \
+  -d '{
+    "schemaInfo": {
+      "schemaIdentity": {
+        "authority": "<data-partition-id>",
+        "source": "shapeFiletest",
+        "entityType": "testEntity",
+        "schemaVersionPatch": 1,
+        "schemaVersionMinor": 0,
+        "schemaVersionMajor": 0
+      },
+      "status": "DEVELOPMENT"
+    },
+    "schema": {
+      "$schema": "http://json-schema.org/draft-07/schema#",
+      "title": "Wellbore",
+      "type": "object",
+      "properties": {
+        "UWI": {
+          "type": "string",
+          "description": "Unique Wellbore Identifier"
+        }
+      }
+    }
+  }'
+```
+
+**Sample Response:**
+
+```json
+{
+  "id": "schema-12345",
+  "status": "DEVELOPMENT"
+}
+```
+
+Save the `id` from the response for use in subsequent steps.
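If `jq` is available on your machine (an assumption, not a stated prerequisite), you can pull individual fields out of any of these JSON responses; for example, against the sample response above:

```bash
# Extract the "id" field from a JSON response.
echo '{ "id": "schema-12345", "status": "DEVELOPMENT" }' | jq -r '.id'   # prints: schema-12345
```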
+
+### 2. Create a Legal Tag
+
+Run the following `cURL` command to create a legal tag:
+
+```bash
+curl -X POST "https://<DNS>/api/legal/v1/legaltags" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "Content-Type: application/json" \
+  -H "data-partition-id: <data-partition-id>" \
+  -d '{
+    "name": "LegalTagName",
+    "description": "Legal Tag added for Well",
+    "properties": {
+      "contractId": "123456",
+      "countryOfOrigin": ["US", "CA"],
+      "dataType": "Third Party Data",
+      "exportClassification": "EAR99",
+      "originator": "Schlumberger",
+      "personalData": "No Personal Data",
+      "securityClassification": "Private",
+      "expirationDate": "2025-12-25"
+    }
+  }'
+```
+
+**Sample Response:**
+
+```json
+{
+  "name": "LegalTagName",
+  "status": "Created"
+}
+```
+
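To confirm the tag was created, you can read it back; a hedged sketch, assuming the standard OSDU legal API `GET` endpoint and that the service stores the name prefixed with the data partition ID (the prefixed form also appears in step 5):

```bash
# Read back the legal tag created above.
curl -X GET "https://<DNS>/api/legal/v1/legaltags/<data-partition-id>-LegalTagName" \
  -H "Authorization: Bearer <access_token>" \
  -H "data-partition-id: <data-partition-id>"
```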
+### 3. Get a Signed URL for Uploading a CSV File
+
+Run the following `cURL` command to get a signed URL:
+
+```bash
+curl -X GET "https://<DNS>/api/file/v2/files/uploadURL" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "data-partition-id: <data-partition-id>"
+```
+
+**Sample Response:**
+
+```json
+{
+  "SignedURL": "https://storageaccount.blob.core.windows.net/container/file.csv?sv=...",
+  "FileSource": "file-source-12345"
+}
+```
+
+Save the `SignedURL` and `FileSource` from the response for use in the next steps.
+
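Because both values feed later steps, one convenient pattern is to capture them in shell variables; a sketch that again assumes `jq`:

```bash
# Request the upload URL once, then pull out both fields.
RESPONSE=$(curl -s -X GET "https://<DNS>/api/file/v2/files/uploadURL" \
  -H "Authorization: Bearer <access_token>" \
  -H "data-partition-id: <data-partition-id>")
SIGNED_URL=$(echo "$RESPONSE" | jq -r '.SignedURL')
FILE_SOURCE=$(echo "$RESPONSE" | jq -r '.FileSource')
```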
+### 4. Upload a CSV File
+
+Download the [Wellbore.csv](https://github.com/microsoft/meds-samples/blob/main/test-data/wellbore.csv) sample to your local machine. Then, run the following `cURL` command to upload the file:
+
+```bash
+curl -X PUT -T "Wellbore.csv" "<SignedURL>" -H "x-ms-blob-type: BlockBlob"
+```
+
+A successful upload to the signed Azure Blob Storage URL returns HTTP status `201 Created` with an empty body; there is no JSON response to inspect.
+
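To verify the upload from a script, ask `cURL` to print just the status code and expect `201`:

```bash
# -o discards the (empty) response body; -w prints the HTTP status code.
curl -s -o /dev/null -w "%{http_code}\n" -X PUT -T "Wellbore.csv" "<SignedURL>" \
  -H "x-ms-blob-type: BlockBlob"
```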
+### 5. Upload CSV File Metadata
+
+Run the following `cURL` command to upload metadata for the CSV file:
+
+```bash
+curl -X POST "https://<DNS>/api/file/v2/files/metadata" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "Content-Type: application/json" \
+  -H "data-partition-id: <data-partition-id>" \
+  -d '{
+    "kind": "osdu:wks:dataset--File.Generic:1.0.0",
+    "acl": {
+      "viewers": ["data.default.viewers@<data-partition-id>.dataservices.energy"],
+      "owners": ["data.default.owners@<data-partition-id>.dataservices.energy"]
+    },
+    "legal": {
+      "legaltags": ["<data-partition-id>-LegalTagName"],
+      "otherRelevantDataCountries": ["US"],
+      "status": "compliant"
+    },
+    "data": {
+      "DatasetProperties": {
+        "FileSourceInfo": {
+          "FileSource": "<FileSource>"
+        }
+      }
+    }
+  }'
+```
+
+**Sample Response:**
+
+```json
+{
+  "id": "metadata-12345",
+  "status": "Created"
+}
+```
+
+Save the `id` from the response. This ID identifies the uploaded file and is used in the next step.
+
+### 6. Trigger a CSV Parser Ingestion Workflow
+
+Run the following `cURL` command to trigger the ingestion workflow:
+
+```bash
+curl -X POST "https://<DNS>/api/workflow/v1/workflow/csv-parser/workflowRun" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "Content-Type: application/json" \
+  -H "data-partition-id: <data-partition-id>" \
+  -d '{
+    "executionContext": {
+      "id": "<uploadedFileId>",
+      "dataPartitionId": "<data-partition-id>"
+    }
+  }'
+```
+
+**Sample Response:**
+
+```json
+{
+  "runId": "workflow-12345",
+  "status": "Running"
+}
+```
+
+Save the `runId` from the response for use in the next step.
+
+### 7. Check the Status of the Workflow
+
+Run the following `cURL` command to check the status of the workflow run:
+
+```bash
+curl -X GET "https://<DNS>/api/workflow/v1/workflow/csv-parser/workflowRun/<runId>" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "Content-Type: application/json" \
+  -H "data-partition-id: <data-partition-id>"
+```
+
+**Sample Response:**
+
+```json
+{
+  "runId": "workflow-12345",
+  "status": "Completed"
+}
+```
+
+Check the status every few seconds until the response indicates successful completion, as in the polling sketch that follows.
+
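A minimal polling sketch, assuming `jq` and a POSIX shell; the run is treated as finished as soon as the status leaves `Running`, and the exact set of terminal statuses is an assumption to verify against your instance:

```bash
# Poll the workflow run until it leaves the "Running" state.
while true; do
  STATUS=$(curl -s "https://<DNS>/api/workflow/v1/workflow/csv-parser/workflowRun/<runId>" \
    -H "Authorization: Bearer <access_token>" \
    -H "data-partition-id: <data-partition-id>" | jq -r '.status')
  echo "Workflow status: $STATUS"
  if [ "$STATUS" != "Running" ]; then
    break   # "Completed" indicates success; investigate any other terminal status
  fi
  sleep 10
done
```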
+### 8. Search for Ingested CSV Records
+
+Run the following `cURL` command to search for ingested records:
+
+```bash
+curl -X POST "https://<DNS>/api/search/v2/query" \
+  -H "Authorization: Bearer <access_token>" \
+  -H "Content-Type: application/json" \
+  -H "data-partition-id: <data-partition-id>" \
+  -d '{
+    "kind": "osdu:wks:dataset--File.Generic:1.0.0"
+  }'
+```
+
+**Sample Response:**
+
+```json
+{
+  "results": [
+    {
+      "id": "dataset-12345",
+      "kind": "osdu:wks:dataset--File.Generic:1.0.0",
+      "status": "Available"
+    }
+  ]
+}
+```
+
+The records ingested earlier appear in the search results.
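The query above returns every record of the given kind. To narrow it to the file uploaded in this tutorial, the OSDU search service also accepts a `query` string and a `limit`; a hedged sketch, assuming standard OSDU search syntax and using the `FileSource` value from step 3:

```bash
# Search only for the record whose FileSource matches the uploaded file.
curl -X POST "https://<DNS>/api/search/v2/query" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -H "data-partition-id: <data-partition-id>" \
  -d '{
    "kind": "osdu:wks:dataset--File.Generic:1.0.0",
    "query": "data.DatasetProperties.FileSourceInfo.FileSource:\"<FileSource>\"",
    "limit": 10
  }'
```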

## Next step
