Commit 1a2e77d

[Hold] UI/API: IBM watsonx.data destination connector (#565)
1 parent 5588df2 commit 1a2e77d

11 files changed, +155 -1 lines changed
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+---
+title: IBM watsonx.data
+---
+
+import FirstTimeAPIDestinationConnector from '/snippets/general-shared-text/first-time-api-destination-connector.mdx';
+
+<FirstTimeAPIDestinationConnector />
+
+Send processed data from Unstructured to IBM watsonx.data.
+
+The requirements are as follows.
+
+import IBMWatsonxdataPrerequisites from '/snippets/general-shared-text/ibm-watsonxdata.mdx';
+
+<IBMWatsonxdataPrerequisites />
+
+To create an IBM watsonx.data destination connector, see the following examples.
+
+import IBMWatsonxdataSDK from '/snippets/destination_connectors/ibm_watsonxdata_sdk.mdx';
+import IBMWatsonxdataAPIRESTCreate from '/snippets/destination_connectors/ibm_watsonxdata_rest_create.mdx';
+
+<CodeGroup>
+<IBMWatsonxdataSDK />
+<IBMWatsonxdataAPIRESTCreate />
+</CodeGroup>
+
+Replace the preceding placeholders as follows:
+
+import IBMWatsonxdataAPIPlaceholders from '/snippets/general-shared-text/ibm-watsonxdata-api-placeholders.mdx';
+
+<IBMWatsonxdataAPIPlaceholders />
+

api-reference/workflow/destinations/overview.mdx

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ For the list of specific settings, see:
 - [Delta Tables in Databricks](/api-reference/workflow/destinations/databricks-delta-table) (`DATABRICKS_VOLUME_DELTA_TABLES` for the Python SDK or `databricks_volume_delta_tables` for `curl` or Postman)
 - [Elasticsearch](/api-reference/workflow/destinations/elasticsearch) (`ELASTICSEARCH` for the Python SDK or `elasticsearch` for `curl` or Postman)
 - [Google Cloud Storage](/api-reference/workflow/destinations/google-cloud) (`GCS` for the Python SDK or `gcs` for `curl` or Postman)
+- [IBM watsonx.data](/api-reference/workflow/destinations/ibm-watsonxdata) (`IBM_WATSONX_S3` for the Python SDK or `ibm_watsonx_s3` for `curl` or Postman)
 - [Kafka](/api-reference/workflow/destinations/kafka) (`KAFKA_CLOUD` for the Python SDK or `kafka-cloud` for `curl` or Postman)
 - [Local](/api-reference/workflow/destinations/local) (Supported only for `curl` or Postman)
 - [Milvus](/api-reference/workflow/destinations/milvus) (`MILVUS` for the Python SDK or `milvus` for `curl` or Postman)

docs.json

Lines changed: 2 additions & 0 deletions
@@ -74,6 +74,7 @@
 "ui/destinations/databricks-delta-table",
 "ui/destinations/elasticsearch",
 "ui/destinations/google-cloud",
+"ui/destinations/ibm-watsonxdata",
 "ui/destinations/kafka",
 "ui/destinations/milvus",
 "ui/destinations/mongodb",
@@ -176,6 +177,7 @@
 "api-reference/workflow/destinations/databricks-delta-table",
 "api-reference/workflow/destinations/elasticsearch",
 "api-reference/workflow/destinations/google-cloud",
+"api-reference/workflow/destinations/ibm-watsonxdata",
 "api-reference/workflow/destinations/kafka",
 "api-reference/workflow/destinations/local",
 "api-reference/workflow/destinations/milvus",
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+```bash curl
+curl --request 'POST' --location \
+  "$UNSTRUCTURED_API_URL/destinations" \
+  --header 'accept: application/json' \
+  --header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
+  --header 'content-type: application/json' \
+  --data \
+'{
+    "name": "<name>",
+    "type": "ibm_watsonx_s3",
+    "config": {
+        "iceberg_endpoint": "<iceberg-endpoint>",
+        "object_storage_endpoint": "<object-storage-endpoint>",
+        "object_storage_region": "<object-storage-region>",
+        "iam_api_key": "<iam-api-key>",
+        "access_key_id": "<access-key-id>",
+        "secret_access_key": "<secret-access-key>",
+        "catalog": "<catalog>",
+        "namespace": "<namespace>",
+        "table": "<table>",
+        "max_retries": <max-retries>,
+        "record_id_key": "<record-id-key>"
+    }
+}'
+```
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+```python Python SDK
+import os
+
+from unstructured_client import UnstructuredClient
+from unstructured_client.models.operations import CreateDestinationRequest
+from unstructured_client.models.shared import (
+    CreateDestinationConnector,
+    DestinationConnectorType,
+    IbmWatsonxDestinationConnectorConfigInput
+)
+
+with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as client:
+    response = client.destinations.create_destination(
+        request=CreateDestinationRequest(
+            create_destination_connector=CreateDestinationConnector(
+                name="<name>",
+                type=DestinationConnectorType.IBM_WATSONX_S3,
+                config=IbmWatsonxDestinationConnectorConfigInput(
+                    iceberg_endpoint="<iceberg-endpoint>",
+                    object_storage_endpoint="<object-storage-endpoint>",
+                    object_storage_region="<object-storage-region>",
+                    iam_api_key="<iam-api-key>",
+                    access_key_id="<access-key-id>",
+                    secret_access_key="<secret-access-key>",
+                    catalog="<catalog>",
+                    namespace="<namespace>",
+                    table="<table>",
+                    max_retries=<max-retries>,
+                    record_id_key="<record-id-key>"
+                )
+            )
+        )
+    )
+
+    print(response.destination_connector_information)
+```
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+- `<name>` (_required_) - A unique name for this connector.
+- `<iceberg-endpoint>` (_required_): The metastore REST endpoint for the target Apache Iceberg-based catalog within the IBM watsonx.data data store instance. Do not include `https://` in this value.
+- `<object-storage-endpoint>` (_required_): The public endpoint for the target bucket within the IBM Cloud Object Storage (COS) instance that is associated with the catalog. Do not include `https://` in this value.
+- `<object-storage-region>` (_required_): The region short ID (such as us-east) for the bucket.
+- `<iam-api-key>` (_required_): A valid API key value for the IBM Cloud account.
+- `<access-key-id>` (_required_): A valid hash-based message authentication code (HMAC) access key ID for the COS instance.
+- `<secret-access-key>` (_required_): The HMAC secret access key for the access key ID.
+- `<catalog>` (_required_): The name of the target Apache Iceberg-based catalog within the IBM watsonx.data data store instance.
+- `<namespace>` (_required_): The name of the target namespace (also known as a schema) within the catalog.
+- `<table>` (_required_): The name of the target table within the namespace (schema).
+- `<max-retries>`: The maximum number of retries for the upload process. The default is `5`.
+- `<record-id-key>`: The name of the column that uniquely identifies each record in the target table. The default is `record_id`.
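
Reviewer note: the placeholder rules above (required fields, no `https://` in the two endpoints, defaults of `5` and `record_id`) could be checked before a request is sent. The following is only a minimal sketch under those stated rules; `strip_scheme` and `build_watsonx_config` are hypothetical helper names, not part of this commit or of the Unstructured SDK.

```python
# Hypothetical helpers (not part of the commit or the SDK): assemble the
# "config" object used in the REST and SDK examples, applying the
# placeholder rules listed above.

REQUIRED_FIELDS = (
    "iceberg_endpoint",
    "object_storage_endpoint",
    "object_storage_region",
    "iam_api_key",
    "access_key_id",
    "secret_access_key",
    "catalog",
    "namespace",
    "table",
)


def strip_scheme(endpoint: str) -> str:
    """Drop a leading scheme such as https:// if one is present."""
    endpoint = endpoint.strip()
    if "://" in endpoint:
        endpoint = endpoint.split("://", 1)[1]
    return endpoint


def build_watsonx_config(**fields) -> dict:
    """Return a config dict, failing early when a required field is missing."""
    missing = [key for key in REQUIRED_FIELDS if not fields.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    config = {key: fields[key] for key in REQUIRED_FIELDS}

    # The docs say these two values must not include https://.
    for key in ("iceberg_endpoint", "object_storage_endpoint"):
        config[key] = strip_scheme(config[key])

    # Optional settings, with the defaults stated in the placeholder list.
    config["max_retries"] = int(fields.get("max_retries", 5))
    config["record_id_key"] = fields.get("record_id_key", "record_id")
    return config
```

The resulting dictionary has the same keys as the `config` object in the `curl` example, so it could be serialized with `json.dumps` or unpacked into the SDK config class; whether the service itself rejects a scheme-prefixed endpoint is not confirmed by this diff.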

snippets/general-shared-text/ibm-watsonxdata-cli-api.mdx

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ The following environment variables:
 - `IBM_IAM_API_KEY` - An API key for the target IBM Cloud account, represented by `--iam-api-key` (CLI) or `iam_api_key` (Python).
 - `IBM_COS_ACCESS_KEY` - An HMAC access key ID for the target IBM Cloud Object Storage (COS) instance, represented by `--access-key-id` (CLI) or `access_key_id` (Python).
 - `IBM_COS_SECRET_ACCESS_KEY` - The associated HMAC secret access key ID for the target HMAC access key, represented by `--secret-access-key` (CLI) or `secret_access_key` (Python).
-- `IBM_ICEBERG_CATALOG_METASTORE_REST_ENDPOINT` - The metastore REST endpoint value for the target Apache Iceberg catalog in the target IBM watsonx.data data store instance, represented by `--iceberg_endpoint` (CLI) or `iceberg_endpoint` (Python). Do not include `https://` in this value.
+- `IBM_ICEBERG_CATALOG_METASTORE_REST_ENDPOINT` - The metastore REST endpoint value for the target Apache Iceberg catalog in the target IBM watsonx.data data store instance, represented by `--iceberg-endpoint` (CLI) or `iceberg_endpoint` (Python). Do not include `https://` in this value.
 - `IBM_COS_BUCKET_PUBLIC_ENDPOINT` - The target COS instance's endpoint value, represented by `--object-storage-endpoint` (CLI) or `object_storage_endpoint` (Python).
 - `IBM_COS_BUCKET_REGION` - The target COS instance's region short ID, represented by `--object-storage-region` (CLI) or `object_storage_region` (Python).
 - `IBM_ICEBERG_CATALOG` - The name of the target Iceberg catalog, represented by `--catalog` (CLI) or `catalog` (Python).
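
Reviewer note: the hunk above pairs each environment variable with a Python keyword name. A minimal sketch of that mapping follows, assuming only the seven variables visible in this hunk (the full list in the file may be longer); `ENV_TO_FIELD` and `fields_from_env` are hypothetical names used for illustration.

```python
import os
from typing import Mapping, Optional

# Environment variable -> Python keyword name, copied from the hunk above.
# Only the variables visible in this hunk are covered.
ENV_TO_FIELD = {
    "IBM_IAM_API_KEY": "iam_api_key",
    "IBM_COS_ACCESS_KEY": "access_key_id",
    "IBM_COS_SECRET_ACCESS_KEY": "secret_access_key",
    "IBM_ICEBERG_CATALOG_METASTORE_REST_ENDPOINT": "iceberg_endpoint",
    "IBM_COS_BUCKET_PUBLIC_ENDPOINT": "object_storage_endpoint",
    "IBM_COS_BUCKET_REGION": "object_storage_region",
    "IBM_ICEBERG_CATALOG": "catalog",
}


def fields_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect connector fields from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_FIELD if not env.get(name)]
    if missing:
        raise KeyError("unset environment variables: " + ", ".join(missing))
    return {field: env[name] for name, field in ENV_TO_FIELD.items()}
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in isolation.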
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+Fill in the following fields:
+
+- **Name** (_required_): A unique name for this connector.
+- **Iceberg Endpoint** (_required_): The metastore REST endpoint for the target Apache Iceberg-based catalog within the IBM watsonx.data data store instance. Do not include `https://` in this value.
+- **Object Storage Endpoint** (_required_): The public endpoint for the target bucket within the IBM Cloud Object Storage (COS) instance that is associated with the catalog. Do not include `https://` in this value.
+- **Object Storage Region** (_required_): The region short ID (such as us-east) for the bucket.
+- **IAM API Key** (_required_): A valid API key value for the IBM Cloud account.
+- **Access Key ID** (_required_): A valid hash-based message authentication code (HMAC) access key ID for the COS instance.
+- **Secret Access Key** (_required_): The HMAC secret access key for the access key ID.
+- **Catalog** (_required_): The name of the target Apache Iceberg-based catalog within the IBM watsonx.data data store instance.
+- **Namespace** (_required_): The name of the target namespace (also known as a schema) within the catalog.
+- **Table** (_required_): The name of the target table within the namespace (schema).
+- **Max Retries**: The maximum number of retries for the upload process. The default is `5`.
+- **Record ID Key**: The name of the column that uniquely identifies each record in the target table. The default is `record_id`.

ui/connectors.mdx

Lines changed: 1 addition & 0 deletions
@@ -47,6 +47,7 @@ If your source is not listed here, you might still be able to connect Unstructur
 - [Delta Tables in Databricks](/ui/destinations/databricks-delta-table)
 - [Elasticsearch](/ui/destinations/elasticsearch)
 - [Google Cloud Storage](/ui/destinations/google-cloud)
+- [IBM watsonx.data](/ui/destinations/ibm-watsonxdata)
 - [Kafka](/ui/destinations/kafka)
 - [Milvus](/ui/destinations/milvus)
 - [MotherDuck](/ui/destinations/motherduck)
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+---
+title: IBM watsonx.data
+---
+
+import FirstTimeUIDestinationConnector from '/snippets/general-shared-text/first-time-ui-destination-connector.mdx';
+
+<FirstTimeUIDestinationConnector />
+
+Send processed data from Unstructured to IBM watsonx.data.
+
+The requirements are as follows.
+
+import IBMWatsonxdataPrerequisites from '/snippets/general-shared-text/ibm-watsonxdata.mdx';
+
+<IBMWatsonxdataPrerequisites />
+
+To create the destination connector:
+
+1. On the sidebar, click **Connectors**.
+2. Click **Destinations**.
+3. Click **New** or **Create Connector**.
+4. Give the connector some unique **Name**.
+5. In the **Provider** area, click **IBM watsonx.data**.
+6. Click **Continue**.
+7. Follow the on-screen instructions to fill in the fields as described later on this page.
+8. Click **Save and Test**.
+
+import IBMWatsonxdataFields from '/snippets/general-shared-text/ibm-watsonxdata-platform.mdx';
+
+<IBMWatsonxdataFields />
