Prefixes for embedding models (possibly also others)

### Issue Description

Some embedding models require a task prefix on their input, notably Nomic (one of the models on the embedding model list in aichat). For other models, prefixes are recommended but optional. However, when Nomic is set as the embedder, no such prefixes are sent, at least according to the `--loglevel debug` output. It appears that only the raw chunks are sent for embedding, and later only the raw query for retrieval, both without a prefix.

Nomic expects:

- `search_document:` as the prefix when creating embedding vectors for documents, and
- `search_query:` as the prefix during retrieval

It also supports "clustering" and "classification" as prefixes.
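
To make the request concrete, here is a minimal sketch of the behaviour I would expect, written in Python purely for illustration. The helper names (`with_task_prefix`, `prepare_chunks`, `prepare_query`) and the table layout are hypothetical and not taken from aichat. Only the prefix strings come from the Nomic documentation, and the single space after the colon is assumed from the example quoted from the model card.

```python
# Hypothetical sketch: none of these names exist in aichat.
# Only the prefix strings are taken from the Nomic model card; the
# trailing space after the colon is assumed from its quoted example.
NOMIC_PREFIXES = {
    "document": "search_document: ",
    "query": "search_query: ",
    "clustering": "clustering: ",
    "classification": "classification: ",
}


def with_task_prefix(text: str, task: str, prefixes: dict = NOMIC_PREFIXES) -> str:
    """Prepend the prefix for `task`, unless the text already starts with it."""
    prefix = prefixes[task]
    return text if text.startswith(prefix) else prefix + text


def prepare_chunks(chunks: list) -> list:
    """Prefix every chunk before it is sent to the embedding endpoint."""
    return [with_task_prefix(chunk, "document") for chunk in chunks]


def prepare_query(query: str) -> str:
    """Prefix the user query before it is embedded for retrieval."""
    return with_task_prefix(query, "query")
```

With something like this in place, the chunks would be embedded as `search_document: <chunk>` and the query as `search_query: <query>`. For models that do not need prefixes, the table could simply be empty or map every task to an empty string.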

I'm not sure how critical this is, but the model card on Hugging Face (https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) says:


> Important: the text prompt must include a task instruction prefix, instructing the model which task is being performed.
> 
> For example, if you are implementing a RAG application, you embed your documents as search_document: <text here> and embed your user queries as search_query: <text here>.
> 

I classified this as a bug based on that description, but it is certainly not a critically breaking one...



