Hi @jaelgu! I'm Dosu, a bot here to help with your questions and issues while you wait for a human maintainer. Let's solve this problem together! 🤖

Yes, it is possible to store different fields in the docstore and vectorstore for document nodes processed by a sentence splitter. Here is an example of how you can achieve this:
from llama_index.storage.docstore.mongodb import MongoDocumentStore
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import VectorStoreIndex, StorageContext
# Create parser and parse documents into nodes
# (`documents` is assumed to be a list of Document objects loaded beforehand)
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)
# Create (or load) docstore and add nodes
docstore = MongoDocumentStore.from_uri(uri="<mongodb+srv://...>")
docstore.add_documents(nodes)
# Create storage context
storage_context = StorageContext.from_defaults(docstore=docstore)
# Build index
index = VectorStoreIndex(nodes, storage_context=storage_context)
from llama_index.core import Document
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.extractors import TitleExtractor
from llama_index.core.ingestion import IngestionPipeline
# Create the pipeline with transformations
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=25, chunk_overlap=0),
        TitleExtractor(),
        OpenAIEmbedding(),
    ]
)
# Run the pipeline
nodes = pipeline.run(documents=[Document.example()])
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core.schema import TextNode
nodes = [
    TextNode(
        text="Michael Jordan is a retired professional basketball player, widely regarded as one of the greatest basketball players of all time.",
        metadata={"category": "Sports", "country": "United States"},
    ),
    TextNode(
        text="Angelina Jolie is an American actress, filmmaker, and humanitarian. She has received numerous awards for her acting and is known for her philanthropic work.",
        metadata={"category": "Entertainment", "country": "United States"},
    ),
    # Add more nodes as needed
]
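# `chroma_collection` must exist before this point; one way to create it
# (in-memory client and the collection name "quickstart" are assumptions)
import chromadb
chroma_collection = chromadb.EphemeralClient().create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)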
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, storage_context=storage_context)

These examples show how you can configure different fields for the docstore and vectorstore by customizing the ...
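To make the "different fields per store" idea concrete, here is a minimal, library-free sketch of one possible approach: build two views of each node, each keeping only a chosen subset of metadata keys, and hand one view to the docstore and the other to the vector index. `SimpleNode`, `project_metadata`, `split_for_stores`, and the `file_path` key are hypothetical names used only for illustration; they are not part of LlamaIndex.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable


@dataclass
class SimpleNode:
    """Hypothetical stand-in for a parsed node: an ID, text, and a metadata dict."""
    node_id: str
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


def project_metadata(node: SimpleNode, keep_keys: Iterable[str]) -> SimpleNode:
    """Return a copy of `node` whose metadata contains only `keep_keys`."""
    keep = set(keep_keys)
    filtered = {k: v for k, v in node.metadata.items() if k in keep}
    # The ID and text are preserved so both views refer to the same chunk.
    return SimpleNode(node_id=node.node_id, text=node.text, metadata=filtered)


def split_for_stores(nodes, docstore_keys, vector_keys):
    """Build one list of nodes per store, each with its own metadata subset."""
    docstore_nodes = [project_metadata(n, docstore_keys) for n in nodes]
    vector_nodes = [project_metadata(n, vector_keys) for n in nodes]
    return docstore_nodes, vector_nodes


nodes = [
    SimpleNode(
        node_id="node-1",
        text="Michael Jordan is a retired professional basketball player.",
        metadata={
            "category": "Sports",
            "country": "United States",
            "file_path": "/data/players.txt",  # illustrative extra field
        },
    ),
]

docstore_nodes, vector_nodes = split_for_stores(
    nodes,
    docstore_keys={"category", "country", "file_path"},  # keep everything
    vector_keys={"category"},                            # keep only filter fields
)

print(docstore_nodes[0].metadata)
print(vector_nodes[0].metadata)
```

With real LlamaIndex nodes, the same projection could presumably be applied to each node's `metadata` dict before calling `docstore.add_documents(...)` on one list and building `VectorStoreIndex(...)` from the other, as in the examples above. Whether retrieval then resolves back to the docstore copy depends on the node IDs matching and on the vector store in use, so that part should be checked against your setup.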
I have document nodes processed by sentence splitter but they include many fields in metadata. Is it possible to store different fields in docstore and vectorstore?