Commit ab2f52a

docs: choosing an embedding model

1 parent 28c7984

2 files changed: +15 -1 lines changed

docs/guide/choosing-a-model.md

Lines changed: 13 additions & 1 deletion

````diff
@@ -83,7 +83,7 @@ npx --no node-llama-cpp inspect estimate <model-file-url>
 ```
 :::
 
-### What do you need this model for? (chat, code completion, analyzing data, classification, etc.) {#model-purpose}
+### What do you need this model for? (chat, code completion, analyzing data, classification, embedding, etc.) {#model-purpose}
 There are plenty of models with different areas of expertise and capabilities.
 
 When you choose a model that is more specialized in the task you need it for, it will usually perform better than a general model.
@@ -111,6 +111,18 @@ Here are a few concepts to be aware of when choosing a model:
 you can either recognize the foundational model name and then assume that the rest is a fine-tune name,
 or you can open the model's page and read the model description.
 
+* **Embedding models** - models that are trained to convert text into [embeddings](./embedding.md) that capture the semantic meaning of the text.
+
+  Generating embeddings for similarity search using such models is preferable
+  because they are highly optimized for this task.
+  Embedding models are often significantly smaller (sometimes as small as 100MB), faster,
+  and consume less memory than general-purpose models, making them more efficient and practical.
+
+  While general-purpose models can also be used for generating embeddings,
+  they may not be as optimized or as efficient as embedding models for this task.
+
+  Many embedding models include terms like `embed` in their name.
+
 ### How much data do you plan to feed the model at once with?
 If you plan to feed the model with a lot of data at once, you'll need a model that supports a large context size.
 The larger the context size is, the more data the model can process at once.
````
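The added bullet describes embedding models as producing vectors that capture semantic meaning. A minimal sketch of how two such vectors are typically compared, using cosine similarity; the function name and the tiny example vectors are illustrative, not part of node-llama-cpp's API:

```typescript
// Cosine similarity: a standard metric for comparing two embedding
// vectors. Values near 1 mean the texts are semantically similar;
// values near 0 mean they are unrelated.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1 (up to floating-point rounding);
// orthogonal vectors score 0.
console.log(cosineSimilarity([0.2, 0.8], [0.2, 0.8])); // ≈ 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Real embedding vectors have hundreds or thousands of dimensions, but the comparison works the same way.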

docs/guide/embedding.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -23,6 +23,8 @@ Instead, we can embed all the documents once and then search for the most simila
 To do that, we embed all the documents in advance and store the embeddings in a database.
 Then, when a query comes in, we embed the query and search for the most similar embeddings in the database, and return the corresponding documents.
 
+Read the [choosing a model tutorial](./choosing-a-model.md) to learn how to choose the right model for your use case.
+
 ## Finding Relevant Documents
 Let's see an example of how we can embed 10 texts and then search for the most relevant one to a given query:
 ::: warning NOTE
```
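The flow this guide describes (embed all documents in advance, then embed each query and rank the stored embeddings) can be sketched as follows. The store, the 3-dimensional vectors, and the helper names are hypothetical placeholders; in a real application the vectors would come from an embedding model and be far longer:

```typescript
// Sketch of the retrieval flow: documents are embedded once and stored;
// each query is embedded and compared against the stored vectors.
// The 3-dimensional embeddings below are made-up placeholders.
type StoredDocument = {text: string, embedding: number[]};

const documentStore: StoredDocument[] = [
    {text: "The weather is nice today", embedding: [0.9, 0.1, 0.0]},
    {text: "Soup is a tasty dish", embedding: [0.0, 0.8, 0.6]}
];

// Cosine similarity between two embedding vectors (near 1 = similar).
function cosineSimilarity(a: number[], b: number[]): number {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Given a query's embedding, return the most similar stored document.
// Only the query needs to be embedded at search time; the documents
// were embedded in advance.
function findMostSimilar(queryEmbedding: number[]): StoredDocument {
    return documentStore.reduce((best, candidate) =>
        cosineSimilarity(candidate.embedding, queryEmbedding) >
        cosineSimilarity(best.embedding, queryEmbedding)
            ? candidate
            : best
    );
}
```

With these placeholder vectors, a query embedding pointing mostly along the first axis, such as `[1, 0, 0]`, would rank the first document highest.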
