How to load pre-generated embedding files? #685
Unanswered · rustam-ashurov-mcx asked this question in Q&A · Replies: 1 comment
hi @rustam-ashurov-mcx, what you're describing is very similar to a cache, which would require storing text–vector pairs so you can search by text. You could implement a cache dependency backed by a lookup table and inject it into the embedding generators. Alternatively, you could develop a custom embedding generator that never talks to OpenAI and always loads embeddings from storage, taking care of all the edge cases, e.g. distributed storage, adding new embeddings, etc.
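To illustrate the "cache dependency" idea, here is a minimal sketch in Python (not the actual SK/KM interfaces, which are .NET; every name here is hypothetical): a generator that keys a lookup table on a hash of the text, serves cached vectors from disk, and only falls back to a real provider on a miss.

```python
# Sketch of a caching embedding generator: a file-backed lookup table keyed
# by text, consulted before ever calling the real (e.g. OpenAI) generator.
# CachedEmbeddingGenerator and its methods are hypothetical, not SK/KM APIs.
import hashlib
import json
from pathlib import Path


class CachedEmbeddingGenerator:
    def __init__(self, cache_dir, fallback=None):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fallback = fallback  # real generator callable; None = offline-only

    def _path_for(self, text):
        # Key cache files by a hash of the text so any string maps to a
        # valid, collision-resistant filename.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def generate(self, text):
        path = self._path_for(text)
        if path.exists():
            # Cache hit: load the stored vector, no provider call.
            return json.loads(path.read_text())
        if self.fallback is None:
            raise KeyError(f"no cached embedding for: {text!r}")
        # Cache miss: call the real generator once, then persist the result.
        vector = self.fallback(text)
        path.write_text(json.dumps(vector))
        return vector
```

With `fallback=None` this behaves like the second option in the answer: a generator that never talks to OpenAI and only serves pre-generated vectors.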
Hey mates, I'm having a hard time with SK + KM (mostly due to a lack of experience with both so far) 😅
My aim is to generate embeddings in advance as files and, on service startup, load them into an in-memory store without calling the AI provider again on each restart.
So I generated the embeddings via the OpenAI client (I was unable to find out whether I can generate them via SK/KM and store them as files), and now they are stored as *.json files with content like this (example):
```json
{
  "data": [
    {
      "embedding": [
        0.006308248266577721,
        ....lot of numbers here...
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "text-embedding-ada-002",
  "object": "list",
  "usage": {
    "prompt_tokens": 305,
    "total_tokens": 305
  }
}
```
What I cannot understand is: how can this data be loaded into KM?
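Whatever the upload mechanism ends up being, the first step is extracting the raw float vector from files in this shape. A minimal Python sketch, assuming the file layout shown above (the actual KM upload step is omitted, since it depends on the memory store being used):

```python
# Read one pre-generated OpenAI embeddings-response file and pull out the
# raw float vector, which is what a memory store ultimately needs.
import json
from pathlib import Path


def load_embedding(path):
    payload = json.loads(Path(path).read_text())
    # OpenAI embedding responses wrap vectors in a "data" list;
    # for a single input there is exactly one entry.
    return payload["data"][0]["embedding"]
```

From there, each vector can be paired with its source text and pushed into the in-memory store on startup, as suggested in the answer above.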