@@ -0,0 +1,65 @@
# How to use LangFlow to ingest data into Astra DB to be used as the knowledge source of Agent Knowledge
This document explains how to use LangFlow to ingest data into Astra DB, to be used as the knowledge source of Agent Knowledge.

Suggested change
This document explains how to use LangFlow to ingest data into Astra DB, to be used as the knowledge source of Agent Knowledge.
This document explains how to use LangFlow to ingest data into Astra DB, which can then be used as a knowledge source in Agent Knowledge.


## Before you begin
1. Sign up for Astra DB

Before starting the ingestion process, ensure you have access to both Astra DB and LangFlow. This section provides links to sign up or install the necessary tools.

* To sign up for Astra DB, see [Sign up for Astra DB](https://astra.datastax.com/)
2. Get access to LangFlow
* To install LangFlow Desktop, see [Install LangFlow Desktop](https://www.langflow.org/desktop)

Suggested change
* To install LangFlow Desktop, see [Install LangFlow Desktop](https://www.langflow.org/desktop)
* To install the desktop version: [Install LangFlow Desktop](https://www.langflow.org/desktop)

* To sign up for managed LangFlow, see [Sign up for managed LangFlow](https://astra.datastax.com/langflow)

Suggested change
* To sign up for managed LangFlow, see [Sign up for managed LangFlow](https://astra.datastax.com/langflow)
* To use the managed version: [Sign up for managed LangFlow](https://astra.datastax.com/langflow)


## Table of contents
* [Step 1: Prepare Astra DB and collect the connection information](#step-1-prepare-astra-db-and-collect-the-connection-information)
  * [Prepare Astra DB](#prepare-astra-db)
  * [Collect the connection information](#collect-the-connection-information)
* [Step 2: Use LangFlow to ingest data into Astra DB](#step-2-use-langflow-to-ingest-data-into-astra-db)
  * [Add the Astra DB component](#add-the-astra-db-component)
  * [Add other components](#add-other-components)
  * [Connect the components and run the ingestion](#connect-the-components-and-run-the-ingestion)
* [Step 3: Connect to Agent Knowledge in watsonx Orchestrate](#step-3-connect-to-agent-knowledge-in-watsonx-orchestrate)

Step 1: Prepare Astra DB and collect connection information
Prepare Astra DB
Collect connection information
Step 2: Ingest Data into Astra DB using LangFlow
Add the Astra DB component
Add other components
Connect components and run ingestion
Step 3: Connect to Agent Knowledge in watsonx Orchestrate

## Step 1: Prepare Astra DB and collect the connection information

To ingest data into Astra DB, you first need to set up your database and collection, and gather the necessary credentials.

### Prepare Astra DB
1. Log in to Astra DB
2. Create a new database, or select an existing database
3. Go to the `Data Explorer` tab, open the `Collections and Tables` drop-down, and select `Create collection`

Go to the Data Explorer tab > Collections and Tables > Create Collection.

4. Enter a `Collection name`, toggle on `Vector-enabled collection`, select the `Embedding generation method`, `Embedding model`, `Dimension`, and `Similarity metric`, then click `Create collection`

Enter the Collection name, enable Vector-enabled collection, and configure...

<img src="./assets/create-astra-db-collection.png" width="1080" height="574" />
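
The `Dimension` and `Similarity metric` settings determine how stored vectors are compared at query time. As a rough illustration only, the sketch below computes three common metrics (cosine, dot product, and Euclidean distance) in plain Python. The function names and sample vectors are hypothetical, and this is not how Astra DB computes similarity internally.

```python
import math


def check_dimension(vector, dimension):
    """Raise if the vector length does not match the collection dimension."""
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")
    return vector


def dot(a, b):
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: compares direction only, ignoring magnitude."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    """Euclidean distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 3-dimensional vectors, for illustration only.
DIMENSION = 3
query = check_dimension([1.0, 0.0, 0.0], DIMENSION)
same_direction = check_dimension([2.0, 0.0, 0.0], DIMENSION)
orthogonal = check_dimension([0.0, 3.0, 0.0], DIMENSION)
```

With these sample values, `cosine(query, same_direction)` is `1.0` even though the vectors differ in length, while `dot` and `euclidean` both reflect the difference in magnitude. Whichever metric you pick, every vector written to the collection must have exactly the configured `Dimension`.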

### Collect the connection information
#### Token
1. Log in to Astra DB
2. In the upper-right corner of the portal, click `Settings` > `Tokens`

  1. Go to Settings > Tokens on the top-right corner.

3. Choose a `Role`, `Description`, and `Expiration`, then click `Generate token`
4. Take a note of the generated token

Suggested change
4. Take a note of the generated token
4. Save the generated token securely.


#### Database name and collection name
Make a note of the database name and the collection name that you used in the `Prepare Astra DB` section.
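
One way to keep these three values together without hard-coding the token is to read them from environment variables. The following is a minimal sketch under stated assumptions: the variable names and the `AstraConnection` class are hypothetical and are not required by Astra DB, LangFlow, or watsonx Orchestrate.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AstraConnection:
    """Holds the connection information collected in this step."""
    token: str
    database: str
    collection: str

    def masked_token(self) -> str:
        # Show only a short prefix so the token can appear in logs safely.
        return self.token[:8] + "..." if len(self.token) > 8 else "***"


def load_connection(env=None) -> AstraConnection:
    """Read the connection values from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Hypothetical variable names, chosen for this example only.
    names = {
        "token": "ASTRA_DB_APPLICATION_TOKEN",
        "database": "ASTRA_DB_DATABASE_NAME",
        "collection": "ASTRA_DB_COLLECTION_NAME",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return AstraConnection(**{field: env[var] for field, var in names.items()})
```

Calling `load_connection()` fails early with the names of any unset variables, which is usually easier to diagnose than an authentication error later in the flow.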

## Step 2: Use LangFlow to ingest data into Astra DB
In this step, you build a LangFlow flow that reads your files, splits them into chunks, and writes the chunks to the Astra DB collection that you prepared in Step 1.
### Add the `Astra DB` component
1. Launch LangFlow and create a project
2. In `Components` > `Vector Stores`, drag and drop `Astra DB` to the canvas
3. In the `Astra DB` component, fill in `Astra DB Application Token`
4. Select `Database` and `Collection` as used in `Step 1`

  1. Launch LangFlow and create a new project.
  2. Under Components > Vector Stores, drag and drop the Astra DB component onto the canvas.
  3. Enter your Astra DB Application Token.
  4. Select the Database and Collection from Step 1.

### Add other components
1. In `Components` > `Data`, drag and drop `File` to the canvas, and choose the file(s) to upload

Suggested change
1. In `Components` > `Data`, drag and drop `File` to the canvas, and choose the file(s) to upload
1. In `Components` > `Data`, drag and drop `File` component and upload your file.

2. In `Components` > `Processing`, drag and drop `Split text` to the canvas, and set `Chunk Overlap` and `Chunk Size`
3. In `Components` > `Outputs`, drag and drop `Chat Output` to the canvas

In Components > Outputs, add the Chat Output component.
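
To get a feel for how `Chunk Size` and `Chunk Overlap` interact, the sketch below splits a string into fixed-size character windows that share a number of characters with the previous window. It is a simplified illustration, not LangFlow's actual implementation, which may also split on separators. The function name `split_text` is hypothetical.

```python
def split_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so each new
    window starts (chunk_size - chunk_overlap) characters after the last.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

For example, `split_text("abcdefghij", 4, 1)` returns `["abcd", "defg", "ghij"]`: each chunk repeats the last character of the previous one. A larger overlap preserves more context across chunk boundaries at the cost of storing more duplicated text.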

### Connect the components and run the ingestion
1. Connect all the components as shown in the screen capture below
2. Click `Playground` > `Run Flow` to ingest the data
<img src="./assets/use-langflow-to-ingest-data.png" width="1080" height="574" />

**NOTE: By default, the Astra DB collection schema contains two main fields, `_id` and `$vectorize`. When you set up Astra DB as a content repository in Agent Knowledge, you must map the `Title` and `Body` fields to these two fields.**
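
As a rough picture of what an ingested document looks like, the sketch below builds dictionaries that contain only the two fields named in the note. It is illustrative only: the helper names are hypothetical, and the documents that LangFlow actually writes may carry additional fields such as metadata.

```python
import uuid


def to_documents(chunks):
    """Wrap each text chunk in a document with `_id` and `$vectorize` fields."""
    return [
        # `_id` identifies the document; `$vectorize` holds the chunk text.
        {"_id": str(uuid.uuid4()), "$vectorize": chunk}
        for chunk in chunks
    ]


def has_expected_fields(document):
    """Check that a document has the two fields referenced in the note."""
    return (
        isinstance(document.get("_id"), str)
        and isinstance(document.get("$vectorize"), str)
    )


# Hypothetical chunks, standing in for the output of the split step.
documents = to_documents(["First chunk of text.", "Second chunk of text."])
```

A check such as `all(has_expected_fields(d) for d in documents)` can help confirm that both fields are present before you configure the `Title` and `Body` fields in Agent Knowledge.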


## Step 3: Connect to Agent Knowledge in watsonx Orchestrate

This step integrates your Astra DB service with watsonx Orchestrate through the Agent Knowledge feature.

After ingestion, you can integrate Astra DB with Agent Knowledge in watsonx Orchestrate. This section points you to the official documentation for completing the connection setup.

For detailed instructions on setting up Astra DB through the Agent Knowledge feature of watsonx Orchestrate, see [Connecting to an Astra DB content repository](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=agents-connecting-astra-db-content-repository).