Commit f6a559f
Generated markdown tutorials from Jupyter Notebooks
Generated from: couchbase-examples/vector-search-cookbook
1 parent a9a4688 commit f6a559f

File tree

- tutorial/markdown/generated/vector-search-cookbook

1 file changed: +31 −8 lines changed

tutorial/markdown/generated/vector-search-cookbook/mistralai.md

Lines changed: 31 additions & 8 deletions
@@ -25,8 +25,32 @@ length: 30 Mins

[View Source](https://github.com/couchbase-examples/vector-search-cookbook/tree/main/mistralai/mistralai.ipynb)

# Couchbase and Mistral AI integration example notebook

Couchbase is a NoSQL distributed document database (JSON) with many of the best features of a relational DBMS: SQL, distributed ACID transactions, and much more. [Couchbase Capella™](https://cloud.couchbase.com/sign-up) is the easiest way to get started, but you can also download and run [Couchbase Server](http://couchbase.com/downloads) on-premises.

Mistral AI is a research lab building the best open source models in the world. La Plateforme enables developers and enterprises to build new products and applications, powered by Mistral’s open source and commercial LLMs.

The [Mistral AI APIs](https://console.mistral.ai/) empower LLM applications via:

- [Text generation](https://docs.mistral.ai/capabilities/completion/): supports streaming, with the ability to display partial model results in real time
- [Code generation](https://docs.mistral.ai/capabilities/code_generation/): powers code-generation tasks, including fill-in-the-middle and code completion
- [Embeddings](https://docs.mistral.ai/capabilities/embeddings/): useful for RAG, representing the meaning of text as a list of numbers
- [Function calling](https://docs.mistral.ai/capabilities/function_calling/): enables Mistral models to connect to external tools
- [Fine-tuning](https://docs.mistral.ai/capabilities/finetuning/): enables developers to create customized and specialized models
- [JSON mode](https://docs.mistral.ai/capabilities/json_mode/): enables developers to set the response format to json_object
- [Guardrailing](https://docs.mistral.ai/capabilities/guardrailing/): enables developers to enforce policies at the system level of Mistral models

# Prerequisites

## Python3 and PIP

Please consult the [pip installation documentation](https://pip.pypa.io/en/stable/installation/) to install pip.

## Dependency Libraries

This tutorial depends on the `couchbase` and `mistralai` libraries. Run this shell command to install them:

```shell
pip install -r requirements.txt
```
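
If the tutorial's `requirements.txt` is not available locally, installing the two libraries directly should be equivalent (assuming that file pins nothing else):

```shell
pip install couchbase mistralai
```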

## Couchbase Cluster

In order to run this tutorial, you will need access to a collection on a Couchbase Cluster, either through Couchbase Capella or by running one locally. Please provide your Couchbase cluster connection information by running the code block below:

```python
@@ -85,7 +109,7 @@ collection = scope.collection(couchbase_collection)
```
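
The diff elides the body of this code block. For context, here is a minimal sketch of a typical connection block with the couchbase 4.x Python SDK; apart from `scope` and `couchbase_collection` (visible in the hunk header above), all variable names and connection values are hypothetical placeholders:

```python
# A sketch of the elided connection block, assuming the couchbase 4.x SDK.
# All values below are placeholders; substitute your own cluster details.
from datetime import timedelta

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

couchbase_connection_string = "couchbases://cb.example.cloud.couchbase.com"  # placeholder
couchbase_username = "username"        # placeholder
couchbase_password = "password"        # placeholder
couchbase_bucket = "vector-search"     # placeholder
couchbase_scope = "_default"           # placeholder
couchbase_collection = "mistralai"     # placeholder

# Authenticate and connect, then wait until the cluster is reachable
auth = PasswordAuthenticator(couchbase_username, couchbase_password)
cluster = Cluster(couchbase_connection_string, ClusterOptions(auth))
cluster.wait_until_ready(timedelta(seconds=10))

# Open the bucket, scope, and collection used by the rest of the tutorial
bucket = cluster.bucket(couchbase_bucket)
scope = bucket.scope(couchbase_scope)
collection = scope.collection(couchbase_collection)
```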

## Creating Couchbase Vector Search Index

In order to store Mistral embeddings in a Couchbase Cluster, a vector search index needs to be created first. We included a sample index definition that will work with this tutorial in the `fts_index.json` file. The definition can be used to create a vector index using the Couchbase Server web console; for more information on vector indexes, please read [Create a Vector Search Index with the Server Web Console](https://docs.couchbase.com/server/current/vector-search/create-vector-search-index-ui.html).

```python
@@ -94,7 +118,7 @@ search_index = cluster.search_indexes().get_index(search_index_name)
```
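
The index can also be created programmatically rather than through the web console. A minimal sketch, assuming the couchbase 4.x search index management API and the standard key layout (`name`, `sourceName`, `params`) of a console-exported index definition:

```python
# A sketch of creating the vector index from fts_index.json programmatically
# (assumes a console-style definition with "name", "sourceName", and "params").
import json

from couchbase.management.search import SearchIndex

with open("fts_index.json") as f:
    index_def = json.load(f)

cluster.search_indexes().upsert_index(
    SearchIndex(
        name=index_def["name"],
        source_name=index_def["sourceName"],
        params=index_def["params"],
    )
)
```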

## Mistral Connection

A Mistral API key needs to be obtained and configured in the code before using the Mistral API. A trial key can be obtained for free from the Mistral AI console. For more detailed instructions on obtaining a key, please consult the [Mistral documentation site](https://docs.mistral.ai/).

```python
@@ -103,13 +127,14 @@ mistral_client = Mistral(api_key=MISTRAL_API_KEY)
```
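
Once the client is created, a quick chat call confirms the key is valid. A minimal sketch using the v1 `mistralai` SDK's chat API; the model name `mistral-small-latest` is an assumption, not part of this tutorial:

```python
# Sanity-check the API key with a one-off chat completion (a sketch; the
# model name is an assumption and not part of this tutorial).
response = mistral_client.chat.complete(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```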

## Embedding Documents

The Mistral client can be used to generate vector embeddings for given text fragments. These embeddings represent the semantic meaning of the corresponding fragments and can be stored in Couchbase for later retrieval. A custom embedding text can also be added to the embedding texts array by running this code block:

```python
texts = [
    "Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational databases such as SQL and ACID transactions with JSON’s versatility, with a foundation that is extremely fast and scalable.",
    "It’s used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vector search, high-speed caching, and much more.",
    input("Custom embedding text: ")
]
embeddings = mistral_client.embeddings.create(
    model="mistral-embed",
@@ -119,16 +144,14 @@ embeddings = mistral_client.embeddings.create(
print("Output embeddings: " + str(len(embeddings.data)))
```

The output `embeddings` is an EmbeddingResponse object with the embeddings and the token usage information:

```
EmbeddingResponse(
    id='eb4c2c739780415bb3af4e47580318cc', object='list', data=[
        Data(object='embedding', embedding=[-0.0165863037109375,...], index=0),
        Data(object='embedding', embedding=[-0.0234222412109375,...], index=1),
        Data(object='embedding', embedding=[-0.0466222735279375,...], index=2)],
    model='mistral-embed', usage=EmbeddingResponseUsage(prompt_tokens=15, total_tokens=15)
)
```
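
To make the fragments retrievable, each embedding can be upserted alongside its text and then queried through the vector index created earlier. A minimal sketch; the document key pattern and the `text`/`vector` field names are hypothetical and must match the fields defined in `fts_index.json`, and vector search requires couchbase SDK 4.2.1+:

```python
# Store each text with its embedding (key pattern and the "text"/"vector"
# field names are hypothetical; they must match the index definition).
for i, (text, emb) in enumerate(zip(texts, embeddings.data)):
    collection.upsert(f"mistral_doc_{i}", {"text": text, "vector": emb.embedding})

# Embed a query and run a vector search against the index created earlier.
from couchbase.options import SearchOptions
from couchbase.search import SearchRequest
from couchbase.vector_search import VectorQuery, VectorSearch

query = "What is Couchbase used for?"
query_emb = mistral_client.embeddings.create(
    model="mistral-embed", inputs=[query]
).data[0].embedding

request = SearchRequest.create(
    VectorSearch.from_vector_query(
        VectorQuery.create("vector", query_emb, num_candidates=2)
    )
)
result = cluster.search(search_index_name, request, SearchOptions(limit=2, fields=["text"]))
for row in result.rows():
    print(row.id, row.score)
```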
