Hey everyone!
I am planning to build a RAG application with LangChain4j and would like to get some input. The system will have two kinds of data:
Data provided by the system. This can be a manual or other documents. This data will be cleaned up, split into chunks, and ingested into Qdrant. It might be a good idea to prepare all of it during development (or through some pipeline).
Data created by the users of the application. Users can create articles (e.g. FAQ texts), which we also want to add to our RAG data. If a user modifies or deletes such an article, we can update or delete the corresponding entries in Qdrant.
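To make the article lifecycle concrete, here is a minimal sketch of the bookkeeping it implies: every chunk carries the id of the article it came from, so editing or deleting an article can remove all of its chunks at once. `ArticleChunkStore` and its methods are hypothetical names for an in-memory stand-in, not the Qdrant or LangChain4j API, and embeddings are left out. Against a real Qdrant collection the same idea would presumably be a delete with a payload filter on the article id.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a vector collection (not a real client API).
class ArticleChunkStore {
    // One stored chunk: the owning article's id plus the chunk text (embedding omitted).
    record Chunk(String articleId, String text) {}

    private final List<Chunk> chunks = new ArrayList<>();

    // Create or edit an article: drop its old chunks first, then add the new ones.
    void upsertArticle(String articleId, List<String> texts) {
        deleteArticle(articleId);
        for (String text : texts) {
            chunks.add(new Chunk(articleId, text));
        }
    }

    // Remove every chunk that belongs to the article; returns how many were removed.
    int deleteArticle(String articleId) {
        int before = chunks.size();
        chunks.removeIf(c -> c.articleId().equals(articleId));
        return before - chunks.size();
    }

    // Total number of chunks currently stored.
    int size() {
        return chunks.size();
    }
}
```

Treating an edit as "delete, then re-insert" avoids having to diff old and new chunks, at the cost of re-embedding the whole article on every change.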
When the software is updated, there might also be a new version of the manual and related data, which we will then have to update as well (maybe by delivering a new Qdrant snapshot file to replace the old collection, or by running scripts that replace the data in the collection).
I thought about creating two collections: one for the user data and one for the system data. However, the FAQ says that having multiple collections might be considered an antipattern: https://qdrant.tech/documentation/faq/qdrant-fundamentals/#how-many-collections-can-i-create
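For comparison, the single-collection alternative could look roughly like the sketch below: each point is tagged with where it came from, and a software update replaces only the system-tagged points. `TaggedCollection`, `Source`, and the `version` field are names made up for this illustration, the store is a plain in-memory list, and nothing here is the Qdrant or LangChain4j API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of ONE collection holding both kinds of data.
class TaggedCollection {
    enum Source { SYSTEM, USER }

    // A point tagged with its origin; version is only meaningful for SYSTEM points.
    record Point(Source source, String version, String text) {}

    private final List<Point> points = new ArrayList<>();

    void add(Source source, String version, String text) {
        points.add(new Point(source, version, text));
    }

    // Software update: load the new manual chunks first, then drop system
    // chunks from any other version. USER points are never touched.
    void replaceSystemData(String newVersion, List<String> newTexts) {
        for (String text : newTexts) {
            add(Source.SYSTEM, newVersion, text);
        }
        points.removeIf(p -> p.source() == Source.SYSTEM
                && !p.version().equals(newVersion));
    }

    // Number of points with the given origin.
    long count(Source source) {
        return points.stream().filter(p -> p.source() == source).count();
    }
}
```

With this layout the update has to be a script-style replace like `replaceSystemData`. Swapping in a snapshot of the whole collection would, as far as I can tell, also replace the user data, so the snapshot route seems to fit better when system data lives in its own collection.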
So I would like to know whether this is a terrible idea or a valid plan for my use case. Are there any problems with delivering Qdrant snapshots for data updates?