23 changes: 21 additions & 2 deletions pages/generative-apis/troubleshooting/fixing-common-issues.mdx
@@ -120,7 +120,26 @@ Below are common issues that you may encounter when using Generative APIs, their
- When displaying the Cockpit of a specific Project, but waiting for average token consumption to display:
- Counter for **Tokens Processed** or **API Requests** should display a correct value (different from 0)
- Graph across time should be empty
```

## Embedding vectors cannot be stored in a database or used with a third-party library

### Cause
- The embedding model you are using generates vector representations with a fixed number of dimensions, which is too high for your database or third-party library.
- For example, the embedding model `bge-multilingual-gemma2` generates vector representations with `3584` dimensions. However, when storing vectors with the PostgreSQL `pgvector` extension, indexes (in `hnsw` or `ivfflat` formats) only support up to `2000` dimensions.
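The mismatch described above can be caught before any vector is written. The following is a minimal sketch: the helper name `fitsIndexLimit` and the constant name are hypothetical, and the `2000` value is simply the `pgvector` index limit quoted above.

```typescript
// Maximum dimensions for pgvector `hnsw` / `ivfflat` indexes, as quoted above.
const PGVECTOR_INDEX_MAX_DIMENSIONS = 2000;

// Hypothetical helper: returns true when a vector with `dimension` components
// stays within the given index limit.
function fitsIndexLimit(
  dimension: number,
  limit: number = PGVECTOR_INDEX_MAX_DIMENSIONS,
): boolean {
  return Number.isInteger(dimension) && dimension > 0 && dimension <= limit;
}

console.log(fitsIndexLimit(3584)); // false (bge-multilingual-gemma2)
console.log(fitsIndexLimit(768)); // true
```

Running a check like this when the application starts makes the failure explicit instead of surfacing later as a database or library error.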

### Solution
- Use a vector store that supports a higher number of dimensions, such as [Qdrant](https://www.scaleway.com/en/docs/tutorials/deploying-qdrant-vectordb-kubernetes/).
- Do not use indexes for vectors, or disable them in your third-party library. This may limit vector similarity search performance on large volumes of data.
- When using the [LangChain PGVector method](https://python.langchain.com/docs/integrations/vectorstores/pgvector/), no index is created by default, so no error should be raised.
- When using the [Mastra](https://mastra.ai/) library with `vectorStoreName: "pgvector"`, set the `indexConfig` type to `flat` to avoid creating any index on the vectors.
```typescript
// `flat` avoids building an index on the vectors, so 3584 dimensions are accepted
await vectorStore.createIndex({
  indexName: 'papers',
  dimension: 3584,
  indexConfig: { type: 'flat' },
});
```
- Use a model with fewer dimensions. With [Managed Inference](https://console.scaleway.com/inference/deployments), you can for instance deploy the `sentence-t5-xxl` model, which represents vectors with `768` dimensions.
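To illustrate why skipping indexes can limit performance, here is a minimal sketch of an exact, index-free similarity search. It is not how `pgvector` or any specific library implements it. The function names `cosineSimilarity` and `exactSearch` are hypothetical, and cosine similarity is assumed as the metric.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact search without an index: every stored vector is scored against the
// query, so the cost grows linearly with the number of stored vectors.
function exactSearch(
  query: number[],
  stored: number[][],
  topK: number,
): { index: number; score: number }[] {
  return stored
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 2-dimensional vectors for readability; real embeddings have far more.
console.log(exactSearch([1, 0], [[0, 1], [1, 0], [1, 1]], 2));
```

Because each query visits every row, this approach is usually acceptable for small collections and becomes slow as the number of stored vectors grows.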

## Best practices for optimizing model performance

@@ -135,4 +154,4 @@ Below are common issues that you may encounter when using Generative APIs, their
### Debugging silent errors
- For cases where no explicit error is returned:
- Verify all fields in the API request are correctly named and formatted.
- Test the request with smaller and simpler inputs to isolate potential issues.
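
The field check described above can be automated before a request is sent. The sketch below is illustrative only: the helper `checkRequestFields` is hypothetical, and the field names `model` and `input` are an assumption based on an OpenAI-style embeddings request body, so adjust them to the endpoint you call.

```typescript
// Hypothetical helper: reports required fields that are missing or empty,
// and fields that are not in the known list (often a sign of a typo).
function checkRequestFields(
  body: Record<string, unknown>,
  required: string[],
  optional: string[] = [],
): string[] {
  const problems: string[] = [];
  const known = new Set([...required, ...optional]);
  for (const name of required) {
    const value = body[name];
    if (value === undefined || value === null || value === "") {
      problems.push(`missing or empty field: ${name}`);
    }
  }
  for (const name of Object.keys(body)) {
    if (!known.has(name)) {
      problems.push(`unexpected field: ${name}`);
    }
  }
  return problems;
}

// Assumed field names; "inputs" is a deliberate typo for "input".
const problems = checkRequestFields(
  { model: "bge-multilingual-gemma2", inputs: "hello" },
  ["model", "input"],
);
console.log(problems);
// [ 'missing or empty field: input', 'unexpected field: inputs' ]
```

A misspelled field may be ignored without raising an error, so listing unexpected names alongside missing ones helps isolate this kind of issue quickly.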