diff --git a/pages/generative-apis/troubleshooting/fixing-common-issues.mdx b/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
index 2240ad6c8b..0b62e9dc69 100644
--- a/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
+++ b/pages/generative-apis/troubleshooting/fixing-common-issues.mdx
@@ -120,7 +120,26 @@ Below are common issues that you may encounter when using Generative APIs, their
 - When displaying the Cockpit of a specific Project, but waiting for average token consumption to display:
   - Counter for **Tokens Processed** or **API Requests** should display a correct value (different from 0)
   - Graph across time should be empty
-```
+
+## Embedding vectors cannot be stored in a database or used with a third-party library
+
+### Cause
+The embedding model you are using generates vector representations with a fixed number of dimensions, which is too high for your database or third-party library.
+  - For example, the embedding model `bge-multilingual-gemma2` generates vector representations with `3584` dimensions. However, when storing vectors using the PostgreSQL `pgvector` extension, indexes (in `hnsw` or `ivfflat` format) only support up to `2000` dimensions.
+
+### Solution
+- Use a vector store that supports a higher number of dimensions, such as [Qdrant](https://www.scaleway.com/en/docs/tutorials/deploying-qdrant-vectordb-kubernetes/).
+- Do not use indexes for vectors, or disable them in your third-party library. This may limit vector similarity search performance on large volumes of data.
+  - When using the [LangChain PGVector method](https://python.langchain.com/docs/integrations/vectorstores/pgvector/), no index is created by default, so no error should be raised.
+  - When using the [Mastra](https://mastra.ai/) library with `vectorStoreName: "pgvector"`, set the `indexConfig` type to `flat` to avoid creating any index on vector dimensions.
+    ```typescript
+    await vectorStore.createIndex({
+      indexName: 'papers',
+      dimension: 3584,
+      indexConfig: { type: 'flat' }, // 'flat' skips index creation
+    });
+    ```
+- Use a model with a lower number of dimensions. Using [Managed Inference](https://console.scaleway.com/inference/deployments), you can deploy, for instance, the `sentence-t5-xxl` model, which represents vectors with `768` dimensions.
 
 ## Best practices for optimizing model performance
 
@@ -135,4 +154,4 @@ Below are common issues that you may encounter when using Generative APIs, their
 ### Debugging silent errors
 - For cases where no explicit error is returned:
   - Verify all fields in the API request are correctly named and formatted.
-  - Test the request with smaller and simpler inputs to isolate potential issues.
\ No newline at end of file
+  - Test the request with smaller and simpler inputs to isolate potential issues.