Commit 178430f

update doc
1 parent 987bf97 commit 178430f

File tree

1 file changed: +19 -11 lines changed


LLM/embedding/deploy-embedding-model-tei.md

Lines changed: 19 additions & 11 deletions
@@ -12,6 +12,14 @@ supported by TEI. While TEI offers `/embed` endpoint as default method to get em
 will use the OpenAI compatible route, i.e. `/v1/embeddings`. For more details, check the list of endpoints available
 [here](https://huggingface.github.io/text-embeddings-inference/#/).
 
+
+## Overview
+This guide demonstrates how to deploy and run inference on embedding models with the Oracle Data Science Service
+through a Bring Your Own Container (BYOC) approach. In this example, we use the `BAAI/bge-base-en-v1.5` model
+downloaded from Hugging Face, served by a container powered by Text Embeddings Inference (TEI).
+
 ## Pre-Requisites
 To be able to run the example on this page, ensure you have access to an Oracle Data Science notebook in your tenancy.
@@ -28,7 +36,7 @@ This example requires a desktop tool to build, run, launch and push the containe
 * [Rancher Desktop](https://rancherdesktop.io/)
 
 
-# Prepare Inference Container
+## Prepare Inference Container
 TEI ships with multiple docker images that we can use to deploy an embedding model on the OCI Data Science platform.
 For more details on images, visit the official GitHub repository section
 [here](https://github.com/huggingface/text-embeddings-inference/tree/main?tab=readme-ov-file#docker-images).
@@ -63,7 +71,7 @@ docker tag ghcr.io/huggingface/text-embeddings-inference:1.5.0 -t <region>.ocir.
 docker push <region>.ocir.io/<tenancy>/text-embeddings-inference:1.5.0
 ```
 
-# Setup
+## Setup
 
 Install dependencies in the notebook session. These are needed to prepare the artifacts, create a model,
 and deploy it in OCI Data Science.
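Editorial note: the `docker tag` line visible in the hunk context above carries a stray `-t` flag, but `docker tag` takes only a source reference and a target reference. A corrected retag-and-push template would look like the following (the `<region>` and `<tenancy>` placeholders must be substituted before running, so this is a template rather than a runnable command sequence):

```shell
# docker tag takes no -t flag: docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
docker pull ghcr.io/huggingface/text-embeddings-inference:1.5.0
docker tag ghcr.io/huggingface/text-embeddings-inference:1.5.0 <region>.ocir.io/<tenancy>/text-embeddings-inference:1.5.0
docker push <region>.ocir.io/<tenancy>/text-embeddings-inference:1.5.0
```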
@@ -74,7 +82,7 @@ Run this in the terminal in a notebook session:
 pip install oracle-ads oci huggingface_hub -U
 ```
 
-# Prepare the model artifacts
+## Prepare the model artifacts
 
 To prepare model artifacts for deployment:
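The preparation steps themselves fall outside this diff's hunks. Under the assumption that they use `huggingface_hub` (installed in the Setup step above), a minimal sketch of fetching the model might look like the following; `fetch_model` and its `local_dir` layout are hypothetical illustrations, chosen to mirror the Object Storage prefix used in the upload command below:

```python
# Hypothetical helper (not the doc's elided steps): download the model
# snapshot from Hugging Face into a local directory whose name matches the
# bucket prefix BAAI/bge-base-en-v1.5/ used in the bulk-upload command.
def fetch_model(repo_id="BAAI/bge-base-en-v1.5", local_dir=None):
    # Lazy import so the sketch only needs huggingface_hub when actually run.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir or repo_id)
```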
@@ -104,7 +112,7 @@ Run this in the terminal in a notebook session:
104112
oci os object bulk-upload -bn <bucket> -ns <namespace> --auth resource_principal --prefix BAAI/bge-base-en-v1.5/ --src-dir BAAI/bge-base-en-v1.5/ --no-overwrite
105113
```
106114

107-
# Create Model by reference using ADS
115+
## Create Model by reference using ADS
108116

109117
Create a notebook with the default python kernel where the python library in the setup section.
110118

@@ -140,11 +148,11 @@ model = (DataScienceModel()
 ).create(model_by_reference=True)
 ```
 
-# Deploy embedding model
+## Deploy embedding model
 
 To deploy the model we just created, we first set up the infrastructure and container runtime.
 
-## Import Model Deployment Modules
+### Import Model Deployment Modules
 
 ```
 from ads.model.deployment import (
@@ -155,7 +163,7 @@ from ads.model.deployment import (
 )
 ```
 
-## Setup Model Deployment Infrastructure
+### Setup Model Deployment Infrastructure
 
 ```
 infrastructure = (
@@ -177,7 +185,7 @@ infrastructure = (
 )
 ```
 
-## Configure Model Deployment Runtime
+### Configure Model Deployment Runtime
 
 We set the `MODEL_DEPLOY_PREDICT_ENDPOINT` environment variable to `/v1/embeddings` so that we can
 access the corresponding endpoint from the TEI container. One additional configuration we need to add is `cmd_var`, which
@@ -204,7 +212,7 @@ container_runtime = (
 )
 ```
 
-## Deploy Model Using Container Runtime
+### Deploy Model Using Container Runtime
 
 Once the infrastructure and runtime are configured, we can deploy the model.
 ```
@@ -217,7 +225,7 @@ deployment = (
 ).deploy(wait_for_completion=False)
 ```
 
-# Inference
+## Inference
 
 Once the model deployment has reached the Active state, we can invoke the model deployment endpoint to interact with the model.
 More details on different ways of accessing MD endpoints are documented [here](https://github.com/oracle-samples/oci-data-science-ai-samples/blob/main/ai-quick-actions/model-deployment-tips.md#inferencing-model).
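As a sketch of what a call against the OpenAI-compatible `/v1/embeddings` route involves, the snippet below builds the request body and parses a response shaped like the sample output shown in the doc. The endpoint URL and OCI auth are deployment-specific and omitted here (see the linked model-deployment tips); `build_request_body` and the sample sentences are illustrative assumptions:

```python
import json

# Build an OpenAI-compatible embeddings request body: {"input": [...]}.
def build_request_body(sentences):
    return json.dumps({"input": sentences})

body = build_request_body([
    "sentence one",
    "sentence two",
    "sentence three",
])

# Parse a response of the shape shown in the doc's sample output:
# 'data' holds one embedding object per input sentence.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": i, "embedding": [0.1, 0.2, 0.3]}
        for i in range(3)
    ],
    "usage": {"prompt_tokens": 39, "total_tokens": 39},
}
vectors = [item["embedding"] for item in sample_response["data"]]
```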
@@ -262,7 +270,7 @@ The raw output (response) has an array of three lists with embedding for the abo
 'usage': {'prompt_tokens': 39, 'total_tokens': 39}}
 ```
 
-# Testing Embeddings generated by the model
+## Testing Embeddings generated by the model
 
 Here, we have 3 sentences: two have similar meaning, and the third is distinct. We'll run a simple test to
 see how similar or dissimilar these sentences are, using cosine similarity as the comparison metric.
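The cosine-similarity comparison described here can be sketched with the standard library alone. The toy 3-dimensional vectors below are stand-ins for the 768-dimensional embeddings that `BAAI/bge-base-en-v1.5` returns; they are illustrative, not real model output:

```python
import math

# Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1;
# values near 1 mean the vectors (and hence sentences) are similar.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings: two "similar" sentences and one distinct one.
similar_1 = [0.9, 0.1, 0.0]
similar_2 = [0.8, 0.2, 0.1]
distinct = [0.0, 0.1, 0.9]

print(cosine_similarity(similar_1, similar_2))  # high, close to 1
print(cosine_similarity(similar_1, distinct))   # low, close to 0
```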
