we recommend using [python-dotenv](https://pypi.org/project/python-dotenv/)
to add `REPLICATE_API_TOKEN="My Bearer Token"` to your `.env` file
so that your Bearer Token is not stored in source control.
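For example, a minimal sketch of loading the token from a `.env` file (this assumes the client falls back to the `REPLICATE_API_TOKEN` environment variable, as the text above implies):

```python
from dotenv import load_dotenv

from replicate import Replicate

load_dotenv()  # reads REPLICATE_API_TOKEN from .env into the environment

# Assumption: the client picks the token up from the environment.
client = Replicate()
```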
## Run a model

You can run a model synchronously using `replicate.run()`:

```python
import replicate

output = replicate.run(
    "black-forest-labs/flux-schnell", input={"prompt": "astronaut riding a rocket like a horse"}
)
print(output)
```

The `run()` method is a convenience function that creates a prediction, waits for it to complete, and returns the output. If you want more control over the prediction process, you can use the lower-level API methods.
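For instance, here is a minimal sketch of that lower-level flow (the `predictions.create()` and `predictions.get()` parameters shown are assumptions; check the API reference for exact signatures):

```python
import replicate

# Create a prediction without waiting for it to finish.
# (Parameter names here are assumed, not confirmed.)
prediction = replicate.predictions.create(
    model="black-forest-labs/flux-schnell",
    input={"prompt": "astronaut riding a rocket like a horse"},
)

print(prediction.id)      # ID you can use to look the prediction up later
print(prediction.status)  # e.g. "starting"

# Re-fetch the prediction to see its latest status and output.
prediction = replicate.predictions.get(prediction.id)
```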
### Handling errors

`replicate.run()` raises `ModelError` if the prediction fails. You can catch this exception to handle errors gracefully:

```python
import replicate
from replicate.exceptions import ModelError

try:
    output = replicate.run(
        "stability-ai/stable-diffusion-3", input={"prompt": "An astronaut riding a rainbow unicorn"}
    )
except ModelError as e:
    print(f"Prediction failed: {e}")
    # The prediction object is available as e.prediction
    print(f"Prediction ID: {e.prediction.id}")
    print(f"Status: {e.prediction.status}")
```
### File inputs

To run a model that takes file inputs, you can pass either a URL to a publicly accessible file or a file handle:
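For example (a sketch; the model name and input field are hypothetical placeholders):

```python
import replicate

# Option 1: pass a URL to a publicly accessible file.
output = replicate.run(
    "some-owner/some-vision-model",  # hypothetical model
    input={"image": "https://example.com/photo.jpg"},
)

# Option 2: pass an open file handle; the client uploads its contents.
with open("photo.jpg", "rb") as f:
    output = replicate.run(
        "some-owner/some-vision-model",  # hypothetical model
        input={"image": f},
    )
```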
`run()` also accepts a `wait` argument. When `wait=False`, the method returns immediately after creating the prediction, and you'll need to poll for the result manually:
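A minimal polling sketch (this assumes `run(..., wait=False)` returns the in-flight prediction object and that it can be re-fetched with `predictions.get()`):

```python
import time

import replicate

# Assumption: with wait=False, run() returns the unfinished prediction.
prediction = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "astronaut riding a rocket like a horse"},
    wait=False,
)

# Poll until the prediction reaches a terminal state.
while prediction.status not in ("succeeded", "failed", "canceled"):
    time.sleep(1)
    prediction = replicate.predictions.get(prediction.id)

print(prediction.output)
```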
## Run a model and stream its output

For models that support streaming (particularly language models), you can use `replicate.stream()`:

```python
import replicate

for event in replicate.stream(
    "meta/meta-llama-3-70b-instruct",
    input={
        "prompt": "Please write a haiku about llamas.",
    },
):
    print(str(event), end="")
```
## Async usage

Simply import `AsyncReplicate` instead of `Replicate` and use `await` with each API call:
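For example (a minimal sketch; the `predictions.create()` call and its parameters are illustrative):

```python
import asyncio

from replicate import AsyncReplicate

replicate = AsyncReplicate()


async def main():
    # Any API call works the same way, just awaited.
    prediction = await replicate.predictions.create(
        model="black-forest-labs/flux-schnell",
        input={"prompt": "astronaut riding a rocket like a horse"},
    )
    print(prediction.id)


asyncio.run(main())
```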
Functionality between the synchronous and asynchronous clients is otherwise identical.
### Async run() and stream()

The async client also supports `run()` and `stream()` methods:

```python
import asyncio

from replicate import AsyncReplicate

replicate = AsyncReplicate()


async def main():
    # Run a model
    output = await replicate.run(
        "black-forest-labs/flux-schnell", input={"prompt": "astronaut riding a rocket like a horse"}
    )
    print(output)

    # Stream a model's output
    async for event in replicate.stream(
        "meta/meta-llama-3-70b-instruct", input={"prompt": "Write a haiku about coding"}
    ):
        print(str(event), end="")


asyncio.run(main())
```
### With aiohttp
By default, the async client uses `httpx` for HTTP requests. However, for improved concurrency performance you may also use `aiohttp` as the HTTP backend.
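A minimal sketch, assuming the client exposes an `http_client` argument and a `DefaultAioHttpClient` helper (a common pattern in generated SDKs; these names and the `pip install replicate[aiohttp]` extra are assumptions):

```python
import asyncio

from replicate import AsyncReplicate, DefaultAioHttpClient  # assumed import


async def main() -> None:
    # Assumption: the async client can be used as an async context manager.
    async with AsyncReplicate(http_client=DefaultAioHttpClient()) as client:
        output = await client.run(
            "black-forest-labs/flux-schnell",
            input={"prompt": "astronaut riding a rocket like a horse"},
        )
        print(output)


asyncio.run(main())
```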