40 changes: 40 additions & 0 deletions hosted-models.mdx
@@ -2,3 +2,43 @@
title: Hosted Models
description: ""
---

With Hypermode, you can pick a model from [Hugging Face](https://huggingface.co/), and Hypermode manages and runs it for you.

<Note>Here (TODO) is the list of Hugging Face models we support today.</Note>

## Setup

To use a Hypermode-hosted model, set `host` to `"hypermode"`, `provider` to `"hugging-face"`, and `sourceModel` to
the model name exactly as it appears on Hugging Face.

```json hypermode.json
{
  ...
  "models": {
    "text-generator": {
      "sourceModel": "meta-llama/Llama-3.1-8B-Instruct",
      "provider": "hugging-face",
      "host": "hypermode"
    }
  }
  ...
}
```

## Mode of deployment

Internally, Hypermode runs its most popular models as multi-tenant instances shared among different users.

By default, if the model you use is available as a shared model, your inferences run against the shared instance.
You can override this default by setting `dedicated: true` on your model in the manifest.

If the model you use isn't available as a shared model, Hypermode spins up a dedicated instance of the model for you.
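
As a rough sketch of the override described above, the manifest entry for a model that is also offered as shared might look like the following. It assumes `dedicated` sits alongside `sourceModel`, `provider`, and `host` inside the model entry, which is how "on your model in the manifest" reads here; other manifest fields are left out for brevity.

```json hypermode.json
{
  "models": {
    "text-generator": {
      "sourceModel": "meta-llama/Llama-3.1-8B-Instruct",
      "provider": "hugging-face",
      "host": "hypermode",
      "dedicated": true
    }
  }
}
```

Leaving out `dedicated` keeps the default behavior, so inferences for this model run against the shared instance.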

<Note>
These models are available today as shared models:
- [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- [AntoineMC/distilbart-mnli-github-issues](https://huggingface.co/AntoineMC/distilbart-mnli-github-issues)
- [distilbert/distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english)
</Note>
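
For illustration, a manifest entry that points at one of the shared models listed above could look like this. The model key `embedder` is a hypothetical name chosen for this sketch, and other manifest fields are left out. Because the entry doesn't set `dedicated`, inferences would run against the shared instance by default.

```json hypermode.json
{
  "models": {
    "embedder": {
      "sourceModel": "sentence-transformers/all-MiniLM-L6-v2",
      "provider": "hugging-face",
      "host": "hypermode"
    }
  }
}
```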