Get different retrievers' context and just compress some of the retrievers' context #6226
gon-martinam asked this question in Q&A (unanswered)
To achieve your goal of creating a chain that returns the answer along with the sources (documents) from three different retrievers, while ensuring the final output schema contains the `filteredContext` that was actually used, here's how you can refactor your code:

```ts
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { pull } from "langchain/hub";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { formatDocumentsAsString } from "langchain/util/document";
import { RunnableSequence, RunnablePassthrough, RunnableMap } from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { CohereRerank } from "@langchain/cohere";
import type { Document } from "@langchain/core/documents";
// Load documents for parks, food, and beverages
const loaderParks = new CheerioWebBaseLoader("URL_FOR_PARKS");
const loaderFood = new CheerioWebBaseLoader("URL_FOR_FOOD");
const loaderBeverages = new CheerioWebBaseLoader("URL_FOR_BEVERAGES");
const docsParks = await loaderParks.load();
const docsFood = await loaderFood.load();
const docsBeverages = await loaderBeverages.load();
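// Split each topic's documents into 500-character chunks with no overlap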
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 500, chunkOverlap: 0 });
const splitsParks = await textSplitter.splitDocuments(docsParks);
const splitsFood = await textSplitter.splitDocuments(docsFood);
const splitsBeverages = await textSplitter.splitDocuments(docsBeverages);
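// Embed each set of chunks and index it in its own in-memory vector store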
const vectorStoreParks = await MemoryVectorStore.fromDocuments(splitsParks, new OpenAIEmbeddings());
const vectorStoreFood = await MemoryVectorStore.fromDocuments(splitsFood, new OpenAIEmbeddings());
const vectorStoreBeverages = await MemoryVectorStore.fromDocuments(splitsBeverages, new OpenAIEmbeddings());
const retrieverParks = vectorStoreParks.asRetriever();
const retrieverFood = vectorStoreFood.asRetriever();
const retrieverBeverages = vectorStoreBeverages.asRetriever();
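// Pull a standard RAG prompt from the LangChain Hub and set up the chat model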
const prompt = await pull<ChatPromptTemplate>("rlm/rag-prompt");
const llm = new ChatOpenAI({ model: "gpt-3.5-turbo", temperature: 0 });
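// Cohere reranker used as a document compressor; it keeps the top 3 documents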
const compressor = new CohereRerank({ topN: 3 });
// Compress the combined food and beverage documents against the query, then
// prepend the uncompressed park documents and format everything as one string.
async function buildContextString(
  query: string,
  documentsParks: Document[],
  ...documents: Document[][]
): Promise<string> {
  const finalDocuments = await compressor.compressDocuments(documents.flat(), query);
  return formatDocumentsAsString([...documentsParks, ...finalDocuments]);
}
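// Chain that builds the context string, fills the prompt, calls the LLM,
// and parses the result into a plain string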
const chainFromDocs = RunnableSequence.from([
  RunnablePassthrough.assign({
    context: async (input) => buildContextString(
      input.question as string,
      input.contextParks as Document[],
      input.contextFood as Document[],
      input.contextBeverages as Document[]
    ),
  }),
  prompt,
  llm,
  new StringOutputParser(),
]);
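// Run the three retrievers in parallel and pass the question through unchanged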
const chainMap = RunnableMap.from({
  contextParks: retrieverParks,
  contextFood: retrieverFood,
  contextBeverages: retrieverBeverages,
  question: new RunnablePassthrough(),
});
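// Attach the compressed context and the final answer to the parallel map's output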
const chainWithSource = chainMap.assign({
  filteredContext: async (input) => {
    const combinedDocs = [
      ...(input.contextFood as Document[]),
      ...(input.contextBeverages as Document[]),
    ];
    const compressedDocs = await compressor.compressDocuments(combinedDocs, input.question as string);
    return [...(input.contextParks as Document[]), ...compressedDocs];
  },
  answer: chainFromDocs,
});
await chainWithSource.invoke("Your question here");
```

This refactored code runs the three retrievers in parallel, compresses only the combined output of `retrieverFood` and `retrieverBeverages`, leaves the `retrieverParks` documents uncompressed, and returns the final `answer` together with the `filteredContext` that was actually passed to the prompt. This approach aligns with LangChain's guidelines and philosophy, ensuring modularity and reusability of components [1][2][3].
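For reference, here's a minimal sketch of what the final invocation returns with the setup above (the field values are illustrative; the keys come from the `RunnableMap` plus the two `assign` entries):

```ts
const result = await chainWithSource.invoke("Your question here");
// result is roughly:
// {
//   contextParks: Document[],      // raw output of retrieverParks
//   contextFood: Document[],       // raw output of retrieverFood
//   contextBeverages: Document[],  // raw output of retrieverBeverages
//   question: "Your question here",
//   filteredContext: Document[],   // park docs + compressed food/beverage docs
//   answer: string                 // the parsed LLM answer
// }
console.log(result.answer);
console.log(result.filteredContext.length);
```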
Description
In my use case I want a RAG chain that returns not only the answer but also the sources (documents). It is important to note that I have three different retrievers: `retrieverParks`, `retrieverFood`, and `retrieverBeverages`.

What I would like is a final output schema from the chain that contains the answer together with a `filteredContext` field. For this I need to add `filteredContext` to the code I currently have. `filteredContext` will be the documents that are actually used as the context, not everything that has been retrieved by the different retrievers.

As seen in the code, I currently execute the three retrievers in parallel using a `RunnableMap`, and I apply compression to the output of just two of them: `retrieverFood` and `retrieverBeverages`. That means the context I pass to the prompt and LLM is the compressed combination of the `retrieverFood` and `retrieverBeverages` outputs, together with the `retrieverParks` output without any compression, i.e. straight out of the retriever.

I included the compression step in the `formatDocumentsAsString()` function because, if I created a `ContextualCompressionRetriever` for each of these retrievers (`retrieverFood` and `retrieverBeverages`), I'd compress each of their outputs separately, whereas what I want is to compress the combination of their outputs (a sketch of the per-retriever setup I'm trying to avoid is shown below).

How can I accomplish what I've described? You can also suggest how I should refactor my code if I'm not following LangChain code guidelines or philosophy, or suggest modifying the code completely if my approach is wrong from the beginning.
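For illustration, here is a sketch of the per-retriever setup I'm trying to avoid; wrapping each retriever in its own `ContextualCompressionRetriever` reranks each source in isolation, so `topN` applies per retriever rather than across the combined food and beverage results:

```ts
import { ContextualCompressionRetriever } from "langchain/retrievers/contextual_compression";
import { CohereRerank } from "@langchain/cohere";

// Each wrapped retriever compresses only its own results, so food and
// beverage documents are never reranked against each other in one pass.
const compressedFoodRetriever = new ContextualCompressionRetriever({
  baseCompressor: new CohereRerank({ topN: 3 }),
  baseRetriever: retrieverFood,
});
const compressedBeveragesRetriever = new ContextualCompressionRetriever({
  baseCompressor: new CohereRerank({ topN: 3 }),
  baseRetriever: retrieverBeverages,
});
```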
System Info
LangChain (TypeScript)