
Commit 4b940f0

[Docs] HCD and DSE with RAGStack (#582)
* initial-content
* add-langchain-hub
* dse-69-example
* typo
1 parent b1d2468 commit 4b940f0

File tree

3 files changed: +149 -0 lines changed

Lines changed: 36 additions & 0 deletions
= RAGStack and DataStax Enterprise (DSE) 6.9 example

. Pull the dse-server Docker image, start a container, and confirm it is in a running state (a minimal status check follows these steps).
+
[source,bash]
----
docker pull datastax/dse-server:6.9.0-rc.2
docker run -e DS_LICENSE=accept -p 9042:9042 -d datastax/dse-server:6.9.0-rc.2
----
+
. Install dependencies.
+
[source,bash]
----
pip install ragstack-ai-langchain python-dotenv langchainhub
----
+
. Create a `.env` file in the root directory of the project and add the following environment variable.
+
[source,bash]
----
OPENAI_API_KEY="sk-..."
----
+
. Create a Python script to embed and generate the results of a query.
+
include::examples:partial$hcd-quickstart.adoc[]
+
You should see output like this:
+
[source,plain]
----
Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable. Techniques like Chain of Thought and Tree of Thoughts help models decompose hard tasks and enhance performance by thinking step by step. This process allows for a better interpretation of the model's thinking process and can involve various methods such as simple prompting, task-specific instructions, or human inputs.
----
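Step 1 asks you to confirm the container is in a running state; a minimal check is to list the containers started from the image pulled above (the `--filter` value simply matches that image tag):

[source,bash]
----
# The dse-server container should be listed with a status of "Up"
docker ps --filter "ancestor=datastax/dse-server:6.9.0-rc.2"
----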

docs/modules/examples/pages/hcd.adoc

Lines changed: 44 additions & 0 deletions
= RAGStack and Hyper Converged Database (HCD) example

. Clone the HCD example repository.
+
[source,bash]
----
git clone git@github.com:datastax/astra-db-java.git
cd astra-db-java
----
+
. Build and start the containers with Docker Compose, and confirm they are in a running state (see the log check after these steps).
+
[source,bash]
----
docker compose up -d
docker compose ps
----
+
. Install dependencies.
+
[source,bash]
----
pip install ragstack-ai-langchain python-dotenv langchainhub
----
+
. Create a `.env` file in the root directory of the project and add the following environment variable.
+
[source,bash]
----
OPENAI_API_KEY="sk-..."
----
+
. Create a Python script to embed and generate the results of a query.
+
include::examples:partial$hcd-quickstart.adoc[]
+
You should see output like this:
+
[source,plain]
----
Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable. Techniques like Chain of Thought and Tree of Thoughts help models decompose hard tasks and enhance performance by thinking step by step. This process allows for a better interpretation of the model's thinking process and can involve various methods such as simple prompting, task-specific instructions, or human inputs.
----
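If `docker compose ps` shows the containers but you want to watch them finish initializing before connecting, you can also follow the logs (a minimal sketch; the service names come from the compose file in the cloned repository):

[source,bash]
----
# Stream logs from every service in the compose file; press Ctrl+C to stop
docker compose logs -f
----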
Lines changed: 69 additions & 0 deletions
.Python
[%collapsible%open]
====
[source,python]
----
import os
from dotenv import load_dotenv
import bs4
from langchain import hub
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
import cassio
from cassio.table import MetadataVectorCassandraTable
from langchain_community.vectorstores import Cassandra

# Load environment variables
load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")

# Initialize Cassandra
cassio.init(contact_points=['localhost'], username='cassandra', password='cassandra')
cassio.config.resolve_session().execute(
    "create keyspace if not exists my_vector_keyspace with replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};"
)

# Create metadata Vector Cassandra Table
mvct = MetadataVectorCassandraTable(table='my_vector_table', vector_dimension=1536, keyspace='my_vector_keyspace')

# Web loader configuration
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

# Document splitting
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

# Vector store setup
# vector_dimension must match the 1536-dimensional OpenAI embeddings and the table created above
vectorstore = Cassandra.from_documents(documents=splits, embedding=OpenAIEmbeddings(), table_name='my_vector_table', keyspace='my_vector_keyspace', vector_dimension=1536)
retriever = vectorstore.as_retriever()

# Language model setup
llm = ChatOpenAI()

# Chain components
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | hub.pull("rlm/rag-prompt")
    | llm
    | StrOutputParser()
)

# Invocation
result = rag_chain.invoke("What is Task Decomposition?")
print(result)
----
====
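To run the script, save it in the project root next to the `.env` file so `load_dotenv()` can find your API key (the filename below is only an example):

[source,bash]
----
# Run from the project root so load_dotenv() picks up the .env file
python rag_quickstart.py
----

If `import bs4` fails in your environment, install the `beautifulsoup4` package with pip; it may already be pulled in by `ragstack-ai-langchain`, so treat this as a fallback.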
