
## Cloud models

Run larger models by offloading to Ollama’s cloud while keeping your local workflow.

- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`

### Run via local Ollama

1) Sign in (one-time):

```shell
ollama signin
```

2) Pull a cloud model:

```shell
ollama pull gpt-oss:120b-cloud
```

3) Use it as you would any local model; requests to the cloud model are offloaded automatically:

```python
from ollama import Client

client = Client()

messages = [
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
]

for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
  print(part['message']['content'], end='', flush=True)
```
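
The same cloud model can also be used through the async client; a minimal sketch, assuming `AsyncClient` handles cloud models the same way as `Client`:

```python
import asyncio
from ollama import AsyncClient

async def main():
  messages = [
    {
      'role': 'user',
      'content': 'Why is the sky blue?',
    },
  ]
  # With stream=True, chat resolves to an async iterator of partial responses.
  async for part in await AsyncClient().chat('gpt-oss:120b-cloud', messages=messages, stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(main())
```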

### Cloud API (ollama.com)

Access cloud models directly by pointing the client at `https://ollama.com`.

1) Create an API key on ollama.com, then set it in your environment:

```shell
export OLLAMA_API_KEY=your_api_key
```

2) (Optional) List models available via the API:

```shell
curl https://ollama.com/api/tags
```
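
The same endpoint can also be queried from Python; a small sketch, assuming a recent version of the library where `Client.list()` wraps `GET /api/tags`, and that the API key from step 1 is set:

```python
import os
from ollama import Client

client = Client(
  host='https://ollama.com',
  headers={'Authorization': 'Bearer ' + os.environ['OLLAMA_API_KEY']},
)

# list() calls the /api/tags endpoint shown above.
for model in client.list().models:
  print(model.model)
```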

3) Generate a response via the cloud API:

```python
import os
from ollama import Client

client = Client(
  host='https://ollama.com',
  # os.environ[...] fails loudly if OLLAMA_API_KEY is not set,
  # rather than building an invalid header from None.
  headers={'Authorization': 'Bearer ' + os.environ['OLLAMA_API_KEY']},
)

messages = [
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
]

for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
  print(part['message']['content'], end='', flush=True)
```

## Custom client
A custom client can be created by instantiating `Client` or `AsyncClient` from `ollama`.

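For example, a minimal sketch of a custom client; the host and headers shown here are illustrative, not required values:

```python
from ollama import Client

# host and headers are both optional; these values are only illustrative.
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'},
)

response = client.chat(
  model='gpt-oss:120b-cloud',  # or any model available to this client
  messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
)
print(response['message']['content'])
```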