15 changes: 15 additions & 0 deletions distilbert/README.md
@@ -0,0 +1,15 @@
# DistilBERT
This truss runs the [DistilBERT](https://huggingface.co/docs/transformers/en/model_doc/distilbert) model as an endpoint on Baseten.

## Deploy
```
pip install --upgrade truss
truss push --publish # grab an api key from https://app.baseten.co/settings/api_keys
```

The deployment will take a few minutes the first time. Once it's ready in the UI, you can proceed to calling the API.

## Test
```
truss predict --published -d '{"text": "some text to embed"}'
```
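
The `predict` method in `model/model.py` returns `last_hidden_state.tolist()`, so the response holds one vector per token (shaped `[batch][tokens][hidden]`) rather than one vector per input text. If you want a single embedding per input, one common option is mean pooling over the token axis. The snippet below is a minimal sketch under that assumption: `mean_pool` is a hypothetical helper that is not part of this truss, and the toy response stands in for real model output.

```python
def mean_pool(last_hidden_state):
    """Average token vectors into one vector per input sequence.

    last_hidden_state: nested list shaped [batch][tokens][hidden],
    as assumed to be returned by the truss's predict method.
    """
    pooled = []
    for sequence in last_hidden_state:
        num_tokens = len(sequence)
        hidden_size = len(sequence[0])
        # Average each hidden dimension across all tokens in the sequence.
        pooled.append(
            [sum(token[i] for token in sequence) / num_tokens for i in range(hidden_size)]
        )
    return pooled


# Toy stand-in for a response: 1 sequence, 2 tokens, 3 hidden dimensions.
response = [[[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]]
embedding = mean_pool(response)
```

This simple average includes the special `[CLS]` and `[SEP]` token vectors; whether that is acceptable depends on your use case.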
6 changes: 6 additions & 0 deletions distilbert/config.yaml
@@ -0,0 +1,6 @@

model_name: DistilBert
python_version: py310
requirements_file: ./requirements.txt
resources:
  accelerator: T4
Empty file added distilbert/model/__init__.py
30 changes: 30 additions & 0 deletions distilbert/model/model.py
@@ -0,0 +1,30 @@
import torch
from transformers import AutoTokenizer, AutoModel


class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Load model here and assign to self._model.
        self.device = (
            "cuda" if torch.cuda.is_available() else "cpu"
        )  # the device to load the model onto

        self._tokenizer = AutoTokenizer.from_pretrained(
            "distilbert/distilbert-base-uncased", device=self.device
        )
        self._model = AutoModel.from_pretrained(
            "distilbert/distilbert-base-uncased",
            torch_dtype=torch.float16,
        ).to(self.device)

    def predict(self, model_input):
        # Run model inference here

        text = model_input.get("text")

        encoded_input = self._tokenizer(text, return_tensors='pt').to(self.device)

        return self._model(**encoded_input).last_hidden_state.tolist()
3 changes: 3 additions & 0 deletions distilbert/requirements.txt
@@ -0,0 +1,3 @@
hf-transfer==0.1.6
torch==2.2.2
transformers==4.40.0