Commit 427c263

Workflow Endpoint: get job details, get failed files for job (#663)
1 parent 211f931 commit 427c263

2 files changed: 219 additions and 2 deletions


api-reference/workflow/jobs.mdx

Lines changed: 5 additions & 1 deletion
@@ -6,8 +6,12 @@ To use the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to

 - To get a list of available jobs, use the `UnstructuredClient` object's `jobs.list_jobs` function (for the Python SDK) or
   the `GET` method to call the `/jobs` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#list-jobs).
-- To get information about a job, use the `UnstructuredClient` object's `jobs.get_job` function (for the Python SDK) or
+- To get basic information about a job, use the `UnstructuredClient` object's `jobs.get_job` function (for the Python SDK) or
   the `GET` method to call the `/jobs/<job-id>` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-a-job).
+- To get information about a job's current processing status, use the `UnstructuredClient` object's `jobs.get_job_details` function (for the Python SDK) or
+  the `GET` method to call the `/jobs/<job-id>/details` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-processing-details-for-a-job).
+- To get the list of any failed files for a job and why those files failed, use the `UnstructuredClient` object's `jobs.get_job_failed_files` function (for the Python SDK) or
+  the `GET` method to call the `/jobs/<job-id>/failed-files` endpoint (for `curl` or Postman). [Learn more](/api-reference/workflow/overview#get-failed-file-details-for-a-job).
 - A job is created automatically whenever a workflow runs on a schedule; see [Create a workflow](/api-reference/workflow/workflows#create-a-workflow).
   A job is also created whenever you run a workflow manually; see [Run a workflow](/api-reference/workflow/overview#run-a-workflow).
 - To cancel a running job, use the `UnstructuredClient` object's `jobs.cancel_job` function (for the Python SDK) or

api-reference/workflow/overview.mdx

Lines changed: 214 additions & 1 deletion
@@ -2356,10 +2356,19 @@ For `curl` or Postman, you can specify multiple query parameters as `?workflow_i

 ### Get a job

-To get information about a job, use the `UnstructuredClient` object's `jobs.get_job` function (for the Python SDK) or
+To get basic information about a job, use the `UnstructuredClient` object's `jobs.get_job` function (for the Python SDK) or
 the `GET` method to call the `/jobs/<job-id>` endpoint (for `curl` or Postman), replacing
 `<job-id>` with the job's unique ID. To get this ID, see [List jobs](#list-jobs).

+This function/endpoint returns basic information about the job, such as:
+
+- The job's unique ID.
+- The unique ID and name of the workflow that created the job.
+- The job's current status.
+- When the job was created.
+
+To get details about a job's current processing status, see [Get processing details for a job](#get-processing-details-for-a-job).
+
 <AccordionGroup>
 <Accordion title="Python SDK">
 ```python
@@ -2440,6 +2449,210 @@ the `GET` method to call the `/jobs/<job-id>` endpoint (for `curl` or Postman),
 </Accordion>
 </AccordionGroup>

+### Get processing details for a job
+
+To get current processing information about a job, use the `UnstructuredClient` object's `jobs.get_job_details` function (for the Python SDK) or
+the `GET` method to call the `/jobs/<job-id>/details` endpoint (for `curl` or Postman), replacing
+`<job-id>` with the job's unique ID. To get this ID, see [List jobs](#list-jobs).
+
+To get basic information about a job, see [Get a job](#get-a-job).
+
+<AccordionGroup>
+<Accordion title="Python SDK">
+```python
+import os
+
+from unstructured_client import UnstructuredClient
+from unstructured_client.models.operations import GetJobDetailsRequest
+
+client = UnstructuredClient(
+    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
+)
+
+response = client.jobs.get_job_details(
+    request=GetJobDetailsRequest(
+        job_id="<job-id>"
+    )
+)
+
+info = response.job_details
+
+print(f"job id: {info.id}")
+print(f"processing status: {info.processing_status}")
+print(f"message: {info.message}")
+print("node stats:")
+
+for node_stat in info.node_stats:  # one entry per workflow node
+    print("---")
+    print(f"name: {node_stat.node_name}")
+    print(f"type: {node_stat.node_type}")
+    print(f"subtype: {node_stat.node_subtype}")
+    print(f"ready: {node_stat.ready}")
+    print(f"in progress: {node_stat.in_progress}")
+    print(f"success: {node_stat.success}")
+    print(f"failure: {node_stat.failure}")
+```
+</Accordion>
+<Accordion title="Python SDK (async)">
+```python
+import os
+import asyncio
+
+from unstructured_client import UnstructuredClient
+from unstructured_client.models.operations import GetJobDetailsRequest
+
+async def get_job_details():
+    client = UnstructuredClient(
+        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
+    )
+
+    response = await client.jobs.get_job_details_async(
+        request=GetJobDetailsRequest(
+            job_id="<job-id>"
+        )
+    )
+
+    info = response.job_details
+
+    print(f"job id: {info.id}")
+    print(f"processing status: {info.processing_status}")
+    print(f"message: {info.message}")
+    print("node stats:")
+
+    for node_stat in info.node_stats:  # one entry per workflow node
+        print("---")
+        print(f"name: {node_stat.node_name}")
+        print(f"type: {node_stat.node_type}")
+        print(f"subtype: {node_stat.node_subtype}")
+        print(f"ready: {node_stat.ready}")
+        print(f"in progress: {node_stat.in_progress}")
+        print(f"success: {node_stat.success}")
+        print(f"failure: {node_stat.failure}")
+
+asyncio.run(get_job_details())
+```
+</Accordion>
+<Accordion title="curl">
+```bash
+curl --request 'GET' --location \
+"$UNSTRUCTURED_API_URL/jobs/<job-id>/details" \
+--header 'accept: application/json' \
+--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"
+```
+</Accordion>
+<Accordion title="Postman">
+1. In the method drop-down list, select **GET**.
+2. In the address box, enter the following URL:
+
+    ```text
+    {{UNSTRUCTURED_API_URL}}/jobs/<job-id>/details
+    ```
+
+3. On the **Headers** tab, enter the following headers:
+
+    - **Key**: `unstructured-api-key`, **Value**: `{{UNSTRUCTURED_API_KEY}}`
+    - **Key**: `accept`, **Value**: `application/json`
+
+4. Click **Send**.
+</Accordion>
+</AccordionGroup>
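
The per-node counters printed in the SDK example can be rolled up into job-wide totals. The following is a minimal sketch, assuming the JSON body returned by the `/jobs/<job-id>/details` endpoint uses the same field names as the SDK attributes in the example (`node_stats`, `ready`, `in_progress`, `success`, `failure`). The helper name `summarize_node_stats` and every value in `sample` are hypothetical and are not taken from the API.

```python
from typing import Any


def summarize_node_stats(details: dict[str, Any]) -> dict[str, int]:
    """Add up the per-node counters in a job-details payload (assumed shape)."""
    totals = {"ready": 0, "in_progress": 0, "success": 0, "failure": 0}
    for node in details.get("node_stats") or []:
        for key in totals:
            # Treat a missing or null counter as zero.
            totals[key] += node.get(key) or 0
    return totals


# Hypothetical payload; the field names mirror the SDK example, the values are made up.
sample = {
    "id": "<job-id>",
    "processing_status": "IN_PROGRESS",  # illustrative value only
    "message": None,
    "node_stats": [
        {"node_name": "Partitioner", "node_type": "partition", "node_subtype": "vlm",
         "ready": 0, "in_progress": 1, "success": 3, "failure": 1},
        {"node_name": "Chunker", "node_type": "chunk", "node_subtype": "chunk_by_title",
         "ready": 2, "in_progress": 0, "success": 1, "failure": 0},
    ],
}

totals = summarize_node_stats(sample)
print(totals)
```

A non-zero `failure` total is a reasonable cue to call the failed-files endpoint for the same job.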
+
+### Get failed file details for a job
+
+To get the list of any failed files for a job and why those files failed, use the `UnstructuredClient` object's `jobs.get_job_failed_files` function (for the Python SDK) or
+the `GET` method to call the `/jobs/<job-id>/failed-files` endpoint (for `curl` or Postman), replacing
+`<job-id>` with the job's unique ID. To get this ID, see [List jobs](#list-jobs).
+
+<AccordionGroup>
+<Accordion title="Python SDK">
+```python
+import os
+
+from unstructured_client import UnstructuredClient
+from unstructured_client.models.operations import GetJobFailedFilesRequest
+
+client = UnstructuredClient(
+    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
+)
+
+response = client.jobs.get_job_failed_files(
+    request=GetJobFailedFilesRequest(
+        job_id="<job-id>"
+    )
+)
+
+info = response.job_failed_files
+
+if len(info.failed_files) > 0:
+    print(f"{len(info.failed_files)} failed file(s):")
+
+    for failed_file in info.failed_files:
+        print("---")
+        print(f"document: {failed_file.document}")
+        print(f"error: {failed_file.error}")
+else:
+    print("No failed files.")
+```
+</Accordion>
+<Accordion title="Python SDK (async)">
+```python
+import os
+import asyncio
+
+from unstructured_client import UnstructuredClient
+from unstructured_client.models.operations import GetJobFailedFilesRequest
+
+async def get_job_failed_files():
+    client = UnstructuredClient(
+        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
+    )
+
+    response = await client.jobs.get_job_failed_files_async(
+        request=GetJobFailedFilesRequest(
+            job_id="<job-id>"
+        )
+    )
+
+    info = response.job_failed_files
+
+    if len(info.failed_files) > 0:
+        print(f"{len(info.failed_files)} failed file(s):")
+
+        for failed_file in info.failed_files:
+            print("---")
+            print(f"document: {failed_file.document}")
+            print(f"error: {failed_file.error}")
+    else:
+        print("No failed files.")
+
+asyncio.run(get_job_failed_files())
+```
+</Accordion>
+<Accordion title="curl">
+```bash
+curl --request 'GET' --location \
+"$UNSTRUCTURED_API_URL/jobs/<job-id>/failed-files" \
+--header 'accept: application/json' \
+--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"
+```
+</Accordion>
+<Accordion title="Postman">
+1. In the method drop-down list, select **GET**.
+2. In the address box, enter the following URL:
+
+    ```text
+    {{UNSTRUCTURED_API_URL}}/jobs/<job-id>/failed-files
+    ```
+
+3. On the **Headers** tab, enter the following headers:
+
+    - **Key**: `unstructured-api-key`, **Value**: `{{UNSTRUCTURED_API_KEY}}`
+    - **Key**: `accept`, **Value**: `application/json`
+
+4. Click **Send**.
+</Accordion>
+</AccordionGroup>
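
When a job fails on many files, grouping them by error message can make the causes easier to scan. The following is a minimal sketch, assuming the JSON body returned by the `/jobs/<job-id>/failed-files` endpoint has a `failed_files` list whose items carry `document` and `error` fields, as the SDK example suggests. The helper name `group_failed_files_by_error`, the file names, and the error strings are all hypothetical.

```python
from collections import defaultdict
from typing import Any


def group_failed_files_by_error(payload: dict[str, Any]) -> dict[str, list[str]]:
    """Map each error message to the documents that failed with it (assumed shape)."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for item in payload.get("failed_files") or []:
        error = item.get("error") or "unknown error"
        document = item.get("document") or "unknown document"
        grouped[error].append(document)
    return dict(grouped)


# Hypothetical payload; the field names mirror the SDK example, the values are made up.
sample = {
    "failed_files": [
        {"document": "a.pdf", "error": "Unsupported file type"},
        {"document": "b.docx", "error": "Timed out"},
        {"document": "c.pdf", "error": "Unsupported file type"},
    ]
}

grouped = group_failed_files_by_error(sample)
for error, documents in grouped.items():
    print(f"{error}: {', '.join(documents)}")
```

An empty `failed_files` list produces an empty dictionary, which matches the "No failed files." branch in the SDK example.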
+
 ### Cancel a job

 To cancel a running job, use the `UnstructuredClient` object's `jobs.cancel_job` function (for the Python SDK) or
