
Commit 62d17a9

Readme LCEL update and rag.py docstrings
1 parent 78c19ab commit 62d17a9

File tree

5 files changed, +27 -17 lines changed

assets/lcel_pipe_flow.png (96.4 KB)

assets/snap1.png (143 KB)

rag.py

Lines changed: 8 additions & 5 deletions
@@ -13,10 +13,9 @@
 from langchain.callbacks.base import BaseCallbackHandler
 from langchain_core.runnables import RunnablePassthrough
 from langchain_core.output_parsers import StrOutputParser
-
 from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

-# for streaming in Streamlit without LECL
+################### for streaming in Streamlit without LCEL ###################
 class StreamHandler(BaseCallbackHandler):
     def __init__(self, container, initial_text=""):
         self.container = container
@@ -25,7 +24,12 @@ def __init__(self, container, initial_text=""):
     def on_llm_new_token(self, token: str, **kwargs) -> None:
         self.text += token
         self.container.markdown(self.text)
-
+# stream_handler = StreamHandler(st.empty())
+""" If you want to use streaming in your Streamlit app, it is tricky to separate the model script
+and the Streamlit script without LCEL, because the llm has to be created with 'streaming=True'
+and 'callbacks=[stream_handler]', and StreamHandler uses the st.empty() placeholder here, which can't be the first Streamlit command.
+"""
+
 ####################### Data processing for vectorstore #################################
 pdf_folder_path = "./data_source"
 documents = []
@@ -91,7 +95,6 @@ def format_docs(docs):

 n_gpu_layers = 1
 n_batch = 512
-# stream_handler = StreamHandler(st.empty())

 llm = LlamaCpp(
     model_path="/Users/raunakanand/Documents/Work_R/llm_models/mistral-7b-v0.1.Q4_K_S.gguf",
@@ -105,7 +108,7 @@ def format_docs(docs):
     # callbacks=[StreamingStdOutCallbackHandler()]
 )

-########## When using RetrievalQA chain from llm's chain ##########
+########## use when using RetrievalQA chain from llm's chain ##########
 qa = RetrievalQA.from_chain_type(llm=llm, chain_type='stuff',
                                  retriever=retriever,
                                  # return_source_documents=True,
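
The docstring added above explains why streaming without LCEL couples the model script to the Streamlit script: the callback handler needs a live Streamlit placeholder when the LLM is constructed. As a rough sketch of that wiring (not part of this commit; the LlamaCpp import path and model path are assumptions):

```python
# Sketch of the non-LCEL streaming setup described in the docstring above.
# The LlamaCpp import path and model path are assumptions, not taken from this commit.
import streamlit as st
from langchain.callbacks.base import BaseCallbackHandler
from langchain_community.llms import LlamaCpp

class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # Append each generated token and re-render the placeholder.
        self.text += token
        self.container.markdown(self.text)

st.set_page_config(page_title="RAG demo")   # must be the first Streamlit command
stream_handler = StreamHandler(st.empty())  # so st.empty() cannot come first

llm = LlamaCpp(
    model_path="./models/mistral-7b-v0.1.Q4_K_S.gguf",  # placeholder path
    streaming=True,                # emit tokens as they are generated
    callbacks=[stream_handler],    # push each token into the Streamlit placeholder
)
# llm.invoke("...") would then stream its answer into the placeholder as it generates.
```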

readme.md

Lines changed: 17 additions & 8 deletions
@@ -1,13 +1,22 @@
-pypdf
-langchain
-transformers
-chromadb
-streamlit
-sentence-transformers
-# Example: METAL
-CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
+# About

+This project runs a local LLM agent-based RAG model on LangChain using both the new pipe syntax, [LCEL](https://python.langchain.com/docs/expression_language/get_started) (LangChain Expression Language), and the older LLM chains (RetrievalQA); see `rag.py`. <br> We use LCEL in `rag.py` for inference because its smooth streaming generator output can be consumed by Streamlit via the `write_stream` method.

+The model uses a persistent ChromaDB vector store, built from all the PDF files in the `data_source` directory (one PDF about the Titanic for the demo).
+
+The UI is built on Streamlit, where the RAG model's output is streamed token by token in a chat format; see `st_app.py`.
+
+![image info](./assets/snap1.png)
+
+### <u>LCEL - LangChain Expression Language</u>:
+LangChain composes a chain of components using Linux-pipe-style syntax:<br>
+`chain = retriever | prompt | llm | OutputParser` <br>
+See the implementation in `rag.py`.
+
+![image info](./assets/lcel_pipe_flow.png)
+
+
+For more: [Pinecone LCEL Article](https://www.pinecone.io/learn/series/langchain/langchain-expression-language/)

 # Environment Setup
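
For context, a minimal sketch of the LCEL composition the readme describes is below. The `retriever` and `llm` names are assumed to come from `rag.py` (the Chroma retriever and LlamaCpp instance shown in the diff above), and the prompt text is illustrative:

```python
# Illustrative LCEL pipeline; retriever and llm are assumed to be the objects built in rag.py.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

from rag import retriever, llm  # assumed import, for illustration only

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)

def format_docs(docs):
    # Collapse the retrieved Documents into one context string for the prompt.
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# chain.stream(...) yields tokens one by one, which Streamlit can render directly:
# st.write_stream(chain.stream("How many passengers survived the Titanic?"))
```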

st_app.py

Lines changed: 2 additions & 4 deletions
@@ -3,9 +3,7 @@

 st.set_page_config(page_title="LLM Search Titaninc", page_icon=':robot:')
 # st.header("Query PDF")
-st.title("Welcome")
-
-# prompt = st.chat_input("Enter your message...")
+st.title("Welcome to Langchain RAG")

 if ('messages' not in st.session_state):
     st.session_state.messages = []
@@ -14,7 +12,7 @@
     with st.chat_message(message["role"]):
         st.markdown(message["content"])

-prompt = st.chat_input("Enter your message...")
+prompt = st.chat_input("Enter your query about Titanic...")

 if (prompt):
     st.session_state.messages.append({'role':'user', 'content': prompt})
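
Putting the pieces together, the chat loop in `st_app.py` can consume the LCEL chain's token stream with `st.write_stream`. A condensed sketch is below; the `chain` import name is an assumption, not confirmed by this diff:

```python
# Condensed sketch of the chat flow in st_app.py; the `chain` export name is assumed.
import streamlit as st
from rag import chain  # assumed: the LCEL chain defined in rag.py

st.set_page_config(page_title="LLM Search Titanic", page_icon=':robot:')
st.title("Welcome to Langchain RAG")

if 'messages' not in st.session_state:
    st.session_state.messages = []

# Replay the conversation so far.
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

prompt = st.chat_input("Enter your query about Titanic...")

if prompt:
    st.session_state.messages.append({'role': 'user', 'content': prompt})
    with st.chat_message("user"):
        st.markdown(prompt)
    with st.chat_message("assistant"):
        # write_stream renders tokens as they arrive and returns the full answer text.
        answer = st.write_stream(chain.stream(prompt))
    st.session_state.messages.append({'role': 'assistant', 'content': answer})
```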
