Replies: 2 comments
-
Answer generated by a 🤖

Answer

I understand that you're having trouble with updating the cache in LangChain after updating the database. You've correctly identified that the cache needs to be refreshed to ensure data consistency. You've also noticed that the index object, which is obtained from the cache according to the collection ID, is not updated when you update the database.

To address this issue, you can use the update method provided by the cache classes in LangChain. This method is designed to update the cache with new data. Here's how you can use it in your case:

# update the content from the local file after loading and splitting
collection_filter = db_gpt.get(where={"source": file_path})
db_gpt.delete(collection_filter['ids'])
db_gpt.add_documents(split_documents)
# db persistence
db_gpt.persist()
# update the cache
cache.update(prompt, llm_string, return_val)
# search something
docs_gpt = db_gpt.similarity_search("****")

In this code, prompt is the query you want to search, llm_string is the language model version and settings, and return_val is the result you want to cache. You'll need to replace these placeholders with your actual values.

This solution should work regardless of the cache type you're using, as the update method is available in all cache classes (InMemoryCache, SQLAlchemyCache, SQLiteCache).

Please note that this is one potential solution based on the information you've provided. If you encounter any issues or if this doesn't solve your problem, there might be other factors at play that we'd need to consider.

If you find this solution helpful and believe it might be useful to others, I encourage you to make a pull request to update the LangChain documentation. This way, other users facing the same issue can benefit from your experience.

Sources

libs/langchain/langchain/cache.py

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
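To make the (prompt, llm_string) → return_val contract of the cache.update call concrete, here is a simplified, self-contained model of that interface. This is an illustration only, not LangChain's actual implementation (the real cache classes live in libs/langchain/langchain/cache.py); the class name SimpleLLMCache is hypothetical.

```python
class SimpleLLMCache:
    """Toy model of an LLM cache: entries are keyed by (prompt, llm_string)."""

    def __init__(self):
        self._store = {}

    def lookup(self, prompt, llm_string):
        # Return the cached result for this exact prompt/model combination,
        # or None if nothing has been cached yet.
        return self._store.get((prompt, llm_string))

    def update(self, prompt, llm_string, return_val):
        # Overwrite (or create) the cache entry for this key.
        self._store[(prompt, llm_string)] = return_val

    def clear(self):
        self._store.clear()


cache = SimpleLLMCache()
cache.update("What is X?", "model=gpt-3.5,temp=0", ["X is ..."])
print(cache.lookup("What is X?", "model=gpt-3.5,temp=0"))  # → ['X is ...']
print(cache.lookup("What is X?", "model=gpt-4,temp=0"))    # → None (different llm_string)
```

Because the key includes llm_string, the same prompt cached under different model settings produces distinct entries, which is why both placeholders must match your actual values when you call update.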
-
Thanks for replying.
But how can I get the index cache object in code? There is no instruction for that.
-
After loading the content from a local file, I split it and saved the chunks in Chroma. When the file content is updated, I update (delete and re-add) the data in the db according to the 'source' column in the metadata. While doing this, I found that Chroma's query call path eventually uses the db's index, and that index object is obtained from a cache according to the collection ID. So I need to update the cache when updating the db data to ensure data consistency, but I can't find a way to update the cache.
# update the content from the local file after loading and splitting
collection_filter = db_gpt.get(where={"source": file_path})
db_gpt.delete(collection_filter['ids'])
db_gpt.add_documents(split_documents)
# db persistence
db_gpt.persist()
# search something
docs_gpt = db_gpt.similarity_search("****")
Running the above, when the content has actually been updated, the search returns nothing. But when I run it a second time, skip the update operation, and search directly, the result matches expectations. After tracing the code, I found the cause is whether the cache is used when obtaining the index object, so I believe the cache is not refreshed when ChromaDB is updated.
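The stale-read behaviour described above can be modelled with a small self-contained sketch. This is a deliberately simplified stand-in, not Chroma's real internals, and all names (CachingStore, invalidate_cache) are hypothetical; it only shows why a delete/re-add cycle returns nothing until the cached index is dropped.

```python
class CachingStore:
    """Toy store whose search path reuses a cached index, like the symptom above."""

    def __init__(self):
        self.docs = {}            # id -> text (the "db")
        self._index_cache = None  # lazily built "index" over the docs

    def add(self, doc_id, text):
        self.docs[doc_id] = text

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)

    def _index(self):
        # Build the index once, then keep serving the cached copy,
        # even if self.docs has changed since.
        if self._index_cache is None:
            self._index_cache = list(self.docs.values())
        return self._index_cache

    def search(self, term):
        return [t for t in self._index() if term in t]

    def invalidate_cache(self):
        self._index_cache = None


store = CachingStore()
store.add(1, "old notes about apples")
store.search("apples")                 # first search builds and caches the index
store.delete(1)
store.add(2, "new notes about apples")
stale = store.search("apples")         # still served from the stale cached index
store.invalidate_cache()
fresh = store.search("apples")         # rebuilt index now reflects the update
```

After the delete/re-add, search keeps answering from the old index until invalidate_cache drops it, which matches the observation that a second run (with no update step) returns the expected result.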