IBM embedding models; IBM watsonx Orchestrate tool example (#744)

Paul-Cornell · web-flow · commit e9d69cb9f73f · 2026-01-12T11:35:51.000-08:00
diff --git a/api-reference/workflow/workflows.mdx b/api-reference/workflow/workflows.mdx
@@ -2076,6 +2076,14 @@ Allowed values for `subtype` and `model_name` include the following:
   - `"model_name": "cohere.embed-english-v3"`
   - `"model_name": "cohere.embed-multilingual-v3"`
 
+- `"subtype": "ibm"`
+
+  - `"model_name": "sentence-transformers/all-minilm-l6-v2"`
+  - `"model_name": "ibm/granite-embedding-278m-multilingual"`
+  - `"model_name": "intfloat/multilingual-e5-large"`
+  - `"model_name": "ibm/slate-125m-english-rtrvr-v2"`
+  - `"model_name": "ibm/slate-125m-multilingual-rtrvr-v2"`
+
 - `"subtype": "togetherai"`
 
   - `"model_name": "togethercomputer/m2-bert-80M-32k-retrieval"`
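As a minimal sketch of how the new `ibm` values in the hunk above might be used, the following Python builds an embedder node definition as a plain dictionary and rejects model names outside the added list. The node shape (`name`, `type`, `subtype`, `settings`) and the helper name `build_ibm_embedder_node` are assumptions for illustration; only the `subtype` and `model_name` values come from this change.

```python
# Model names allowed for "subtype": "ibm", copied from the list added above.
IBM_EMBEDDING_MODELS = frozenset({
    "sentence-transformers/all-minilm-l6-v2",
    "ibm/granite-embedding-278m-multilingual",
    "intfloat/multilingual-e5-large",
    "ibm/slate-125m-english-rtrvr-v2",
    "ibm/slate-125m-multilingual-rtrvr-v2",
})


def build_ibm_embedder_node(model_name: str) -> dict:
    """Return a hypothetical embedder node definition for the IBM provider."""
    if model_name not in IBM_EMBEDDING_MODELS:
        raise ValueError(f"Unsupported IBM embedding model: {model_name!r}")
    # Assumed node shape; check the workflow API reference for the exact fields.
    return {
        "name": "Embedder",
        "type": "embed",
        "subtype": "ibm",
        "settings": {"model_name": model_name},
    }


if __name__ == "__main__":
    print(build_ibm_embedder_node("ibm/slate-125m-english-rtrvr-v2"))
```

Validating the model name up front is a design choice for this sketch: it surfaces a typo before any request is sent, and the same name can then be reused when selecting the embedding model in IBM watsonx Orchestrate.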
diff --git a/docs.json b/docs.json
@@ -304,6 +304,7 @@
               "examplecode/tools/vectorshift",
               "examplecode/tools/mcp",
               "examplecode/tools/mcp-partition",
+              "examplecode/tools/ibm-orchestrate",
               "examplecode/tools/snowflake-streamlit",
               "examplecode/tools/crewai",
               "examplecode/tools/neo4j-chatbot"
diff --git a/examplecode/tools/ibm-orchestrate.mdx b/examplecode/tools/ibm-orchestrate.mdx
@@ -0,0 +1,175 @@
+---
+title: IBM watsonx Orchestrate
+---
+
+[IBM watsonx Orchestrate](https://www.ibm.com/products/watsonx-orchestrate) helps you build, deploy and manage powerful 
+AI assistants and agents that automate workflows and processes with generative AI.
+
+This article provides a hands-on, step-by-step walkthrough that uses IBM watsonx Orchestrate to 
+build a simple AI chat app. This chat app relies on data that is stored in an Astra DB or Milvus vector database. This data 
+is generated by Unstructured and is based on your organization's source documents and semi-structured data.
+
+## Requirements
+
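+To use this example, you must meet the requirements for one of the following two options, depending on which vector database stores the data that Unstructured generates.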
+
+If you want to connect Unstructured to Astra DB for providing the source data to IBM watsonx Orchestrate, you will need the following:
+
+- An [IBM Cloud account](https://cloud.ibm.com/registration) or [DataStax account](https://astra.datastax.com/signup).
+- An [Astra DB database](https://accounts.datastax.com/session-service/v1/login).
+- To complete this example, you will need the following settings for the Astra DB database:
+
+  - The database's API endpoint.
+  - An application token for the database.
+  - The name of the target keyspace in the database.
+  - The name of the target collection in the keyspace.
+
+  To get these settings, see the [Astra DB destination connector](/ui/destinations/astradb) documentation.
+
+- Within your Unstructured account, a workflow that contains an Astra DB destination connector.
+
+  - Create an [Unstructured account](https://unstructured.io/?modal=try-for-free).
+  - Create an [Astra DB destination connector](/ui/destinations/astradb) in your Unstructured account.
+  - Create a [custom workflow](/ui/workflows#create-a-custom-workflow) that contains the Astra DB destination connector in your Unstructured account.
+
+    The workflow must generate [embeddings](/ui/embedding). The workflow's **Embedder** node's selected embedding model provider must be **IBM**. The 
+    node's selected embedding model must match the embedding model that you will specify later in **Step 2**.
+
+    Be sure to [run the workflow](/ui/workflows#edit%2C-delete%2C-or-run-a-workflow) to have Unstructured generate the data and store it in your Astra DB database.
+
+After you meet the preceding requirements, skip ahead to [Step 1](#step-1%3A-add-an-ibm-watsonx-orchestrate-subscription-to-your-ibm-cloud-account).
+
+If, however, you want to connect Unstructured to Milvus on IBM watsonx.data for providing the source data to IBM watsonx Orchestrate, you will need the following:
+
+- Within your IBM Cloud account, an IBM watsonx.data subscription that contains a Milvus service instance.
+
+  - Create an [IBM Cloud account](https://cloud.ibm.com/registration).
+
+    To complete this example, you must [create an API key](https://www.ibm.com/docs/en/masv-and-l/cd?topic=cli-creating-your-cloud-api-key) for your IBM Cloud account.
+
+  - Create an [IBM watsonx.data subscription](https://cloud.ibm.com/watsonxdata) in your IBM Cloud account.
+  - Create a [Milvus service instance](/ui/destinations/milvus) within your IBM watsonx.data subscription plan. This instance 
+    must contain a database, a collection, and an index to store and manage the data that is generated by Unstructured. 
+
+    To complete this example, you will need the following settings for the Milvus service instance:
+    
+    - The instance's GRPC host value.
+    - The instance's GRPC port value.
+    - The name of the target database on the instance.
+    - The name of the target collection in the database.
+    - The name of the target index in the collection.
+
+    To get these settings, see the [Milvus destination connector](/ui/destinations/milvus) documentation.
+
+- Within your Unstructured account, a workflow that contains a Milvus destination connector.
+
+  - Create an [Unstructured account](https://unstructured.io/signup).
+  - Create a [Milvus destination connector](/ui/destinations/milvus) in your Unstructured account.
+  - Create a [custom workflow](/ui/workflows#create-a-custom-workflow) that contains the Milvus destination connector in your Unstructured account.
+
+    The workflow must generate [embeddings](/ui/embedding). The workflow's **Embedder** node's selected embedding model provider must be **IBM**. The 
+    node's selected embedding model must match the embedding model that you will specify later in **Step 2**.
+
+    Be sure to [run the workflow](/ui/workflows#edit%2C-delete%2C-or-run-a-workflow) to have Unstructured generate the data and store it in your Milvus instance's database.
+
+## Step 1: Add an IBM watsonx Orchestrate subscription to your IBM Cloud account
+
+If you already have an IBM watsonx Orchestrate subscription, then skip ahead to [Step 2](#step-2%3A-create-the-chat-app-in-ibm-watsonx-orchestrate).
+
+1. [Log in to your IBM Cloud account](https://cloud.ibm.com/login).
+2. On the sidebar, click the **Resource list** icon. If the sidebar is not visible, click the **Navigation Menu** icon to the far left of the 
+   top navigation bar.
+3. Click **Create resource**.
+4. With **IBM Cloud catalog** selected, search for and select **watsonx Orchestrate**.
+5. Complete the on-screen instructions to finish creating the IBM watsonx Orchestrate subscription.
+
+## Step 2: Create the chat app in IBM watsonx Orchestrate
+
+In this step, you use IBM watsonx Orchestrate to create a chat app. This chat app allows you to ask questions about 
+your organization's documents and semi-structured data. This data is stored in your Astra DB or Milvus vector database and was 
+generated by Unstructured in a format that is well-suited for your chat app.
+
+1. Open your IBM watsonx Orchestrate subscription, if it is not already open. To do this:
+
+   a. [Log in to your IBM Cloud account](https://cloud.ibm.com/login).<br/>
+   b. On the sidebar, click the **Resource list** icon. If the sidebar is not visible, click the **Navigation Menu** icon to the far left of the 
+      top navigation bar.<br/>
+   c. In the list of resources, expand **AI / Machine Learning**, and then click the target watsonx Orchestrate subscription.<br/>
+   d. Click **Launch watsonx Orchestrate**.<br/>
+
+2. If the **Chat** page is not already open, click the **Open the main menu** icon to the far left of the 
+   top navigation bar, and then click **Chat**.
+3. Toward the bottom of the sidebar, click **Create new agent**.
+4. On the **Create an agent** page, with **Create from scratch** already selected, enter a **Name** and a **Description** for the agent, and then click **Create**.
+5. Under **Knowledge source**, in the **Start by adding knowledge** tile, click **Choose knowledge**.
+
+If you are using Astra DB for providing the source data to IBM watsonx Orchestrate, then do the following:
+
+1. On the **Select source** page, click **Astra DB**, and then click **Next**.
+2. On the **Connect Astra DB** page, specify the following settings:
+
+   a. For **URL**, enter the API endpoint for the Astra DB database.<br/>
+   b. Leave **Astra DB port** blank.<br/>
+   c. For **API key**, enter the application token for your Astra DB database.<br/>
+   d. Click **Next**.<br/>
+
+3. On the **Settings details** page, specify the following settings:
+
+   a. For **Keyspace**, select the name of the target keyspace in the Astra DB database.<br/>
+   b. For **Data type**, select **Collection**.<br/>
+   c. For **Collection**, select the name of the target collection in the database.<br/>
+   d. For **Embedding mode**, select **Client**.<br/>
+   e. For **Embedding model**, select the name of the embedding model that matches the one that you specified earlier in your Unstructured workflow.<br/>
+   f. For **Search mode**, select **Vector**.<br/>
+   g. For **Title**, enter `record_id`.<br/>
+   h. For **Body**, enter `content`.<br/>
+   i. Leave **URL** and **Advanced settings** blank.<br/>
+   j. Click **Next**.<br/>
+
+4. On the **Description** page, enter a description for the agent, and then click **Save**.
+5. In the list of pages, click **Behavior**.
+6. At the bottom of the **Behavior** section, turn on **Chat with documents**.
+7. In the upper-right corner of the page, click **Deploy**. 
+8. On the **Pre-deployment summary** page, click **Deploy**.
+
+Skip ahead to [Step 3](#step-3%3A-run-the-chat-app).
+
+If, however, you are using Milvus on IBM watsonx.data for providing the source data to IBM watsonx Orchestrate, then do the following:
+
+1. On the **Select source** page, click **Milvus**, and then click **Next**.
+2. On the **Connect Milvus** page, specify the following settings:
+
+   a. For **GRPC host**, enter the GRPC host value for the Milvus service instance within your IBM watsonx.data subscription.<br/>
+   b. For **GRPC port**, enter the GRPC port value for the instance.<br/>
+   c. For **Choose an authentication type**, select **Basic authentication**.<br/>
+   d. For **Username**, enter `ibmlhapikey`.<br/>
+   e. For **Password**, enter the API key for your IBM Cloud account.<br/>
+   f. Click **Next**.<br/>
+
+3. On the **Select index** page, specify the following settings:
+
+   a. For **Database**, select the name of the target database on the Milvus service instance within your IBM watsonx.data subscription.<br/>
+   b. For **Use Collection or Alias**, select **Collection**.<br/>
+   c. For **Collection**, select the name of the target collection in the database.<br/>
+   d. For **Index**, select the name of the target index in the collection.<br/>
+   e. For **Embedding model**, select the name of the embedding model that matches the one that you specified earlier in your Unstructured workflow.<br/>
+   f. For **Title**, select **element_id**.<br/>
+   g. For **Body**, select **text**.<br/>
+   h. Click **Next**.<br/>
+
+4. On the **Description** page, enter a description for the agent, and then click **Save**.
+5. At the bottom of the **Behavior** section, turn on **Chat with documents**.
+6. In the upper-right corner of the page, click **Deploy**. 
+7. On the **Pre-deployment summary** page, click **Deploy**.
+
+## Step 3: Run the chat app
+
+In this step, you ask questions about your organization's source documents and semi-structured data. The chat app then 
+attempts to answer your questions by searching the related data that Unstructured generated and stored in your Astra DB or Milvus vector database.
+
+1. If the **Chat** page is not already open in IBM watsonx Orchestrate, click the **Open the main menu** icon to the far left of the 
+   top navigation bar, and then click **Chat**.
+2. In the sidebar, in the **Agents** list, select the name of the agent that you created in the previous step.
+3. In the **Type something** box, enter a question, and then press `Enter`. 
+4. The agent will provide an answer.
+5. Continue asking as many questions as you want.
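The **Title** and **Body** values in Step 2 of the new article differ between the two databases, which is easy to mix up. The following Python sketch records that mapping as a small lookup and reads the two fields from a stored record. The store labels `astradb` and `milvus`, the helper name `title_and_body`, and the sample record are hypothetical; only the four field names come from the article.

```python
# Title and Body field names from Step 2, keyed by a hypothetical store label.
FIELD_MAPPINGS = {
    "astradb": {"title": "record_id", "body": "content"},
    "milvus": {"title": "element_id", "body": "text"},
}


def title_and_body(store: str, record: dict) -> tuple:
    """Return the (title, body) values that the agent would read from a record."""
    if store not in FIELD_MAPPINGS:
        raise ValueError(f"Unknown store: {store!r}")
    mapping = FIELD_MAPPINGS[store]
    return record[mapping["title"]], record[mapping["body"]]


if __name__ == "__main__":
    # Hypothetical record, shaped like a row in the Milvus collection.
    sample = {"element_id": "abc123", "text": "Quarterly revenue rose."}
    print(title_and_body("milvus", sample))
```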