Commit eecbade

[Workers AI] Pricing for deepseek r1 distill
1 parent 251b2bc commit eecbade

2 files changed: +36 -20 lines changed


src/content/docs/workers-ai/platform/pricing.mdx

Lines changed: 20 additions & 20 deletions
@@ -3,17 +3,14 @@ pcx_content_type: concept
 title: Pricing
 sidebar:
   order: 1
-
 ---
 
 :::note
 
-
-Workers AI has deprecated the usage of neurons in favor of unit-based pricing. The Cloudflare dashboards will be migrated to this unit-based pricing soon so you can track your usage. Individual model pages will soon document the price for each model. We also made pricing cheaper!
+Workers AI has deprecated the usage of neurons in favor of unit-based pricing. The Cloudflare dashboards will be migrated to this unit-based pricing soon so you can track your usage. Individual model pages will soon document the price for each model. We also made pricing cheaper!
 
 We will begin billing for all models under this new pricing structure beginning November 1, 2024.
 
-
 :::
 
 Workers AI is included in both the [Free and Paid Workers plans](/workers/platform/pricing/) and is priced based on model task, model size, and units.
@@ -24,20 +21,24 @@ These docs will be updated as we add new pricing for new task types in our model
 
 ## Pricing Structure
 
+Some models may have specific pricing. For specific details, check the page of the [specific model](/workers-ai/models/).
+
 ### Text Generation LLMs (incl Vision models)
+
 Model size is measured in parameters.
 Pricing is based on blended tokens (input + output).
 Vision models will convert the image input into tokens for billing. Depending on size and aspect ratio, images will be charged for between 1,601 and 6,404 tokens. Most images that are more than 224 pixels wide or tall will be charged as 6,404 tokens each.
 
-| Model Size       | Pricing                  |
-| ---------------- | ------------------------ |
-| \<= 3B           | $0.10 per Million Tokens |
-| 3.1B - 8B        | $0.15 per Million Tokens |
-| 8.1B - 20B       | $0.20 per Million Tokens |
-| 20.1B - 40B      | $0.50 per Million Tokens |
-| 40.1B+           | $0.75 per Million Tokens |
+| Model Size  | Pricing                  |
+| ----------- | ------------------------ |
+| \<= 3B      | $0.10 per Million Tokens |
+| 3.1B - 8B   | $0.15 per Million Tokens |
+| 8.1B - 20B  | $0.20 per Million Tokens |
+| 20.1B - 40B | $0.50 per Million Tokens |
+| 40.1B+      | $0.75 per Million Tokens |
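Reviewer note: the size tiers in the table above can be read as a simple lookup. The TypeScript sketch below estimates a cost from model size and blended token count. The names `TEXT_GEN_TIERS` and `estimateTextGenCost` are hypothetical and not part of any Workers AI API, and the handling of sizes that fall between tiers (for example 3.05B) is an assumption, since the table does not specify it. Models with model-specific pricing are not covered.

```typescript
// Illustrative sketch only: rates copied from the pricing table above.
// All names here are hypothetical, not part of any Workers AI API.
interface TextGenTier {
  maxParamsB: number; // upper bound of the tier, in billions of parameters
  usdPerMillionTokens: number;
}

const TEXT_GEN_TIERS: TextGenTier[] = [
  { maxParamsB: 3, usdPerMillionTokens: 0.1 },
  { maxParamsB: 8, usdPerMillionTokens: 0.15 },
  { maxParamsB: 20, usdPerMillionTokens: 0.2 },
  { maxParamsB: 40, usdPerMillionTokens: 0.5 },
  { maxParamsB: Infinity, usdPerMillionTokens: 0.75 },
];

function estimateTextGenCost(
  paramsB: number,
  inputTokens: number,
  outputTokens: number,
): number {
  // Assumption: a size between two listed ranges is billed at the next tier up.
  for (const tier of TEXT_GEN_TIERS) {
    if (paramsB <= tier.maxParamsB) {
      // Pricing is based on blended tokens (input + output).
      const blendedTokens = inputTokens + outputTokens;
      return (blendedTokens / 1_000_000) * tier.usdPerMillionTokens;
    }
  }
  throw new RangeError(`Invalid model size: ${paramsB}`);
}
```

Under these assumptions, an 8B model processing 600,000 input tokens and 400,000 output tokens is billed for 1M blended tokens at the $0.15 tier.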
 
 ### Embeddings
+
 Model size is measured in parameters.
 Pricing is based on input tokens.
 
@@ -47,30 +48,30 @@ Pricing is based on input tokens.
 | 151M+ parameters | $0.015 per Million Tokens |
 
 ## Image Generation
+
 Standard models are large image models such as `@cf/stabilityai/stable-diffusion-xl-base-1.0`
 Fast models are usually smaller image models that require fewer steps to generate an image, such as `@cf/black-forest-labs/flux-1-schnell` and `@cf/bytedance/stable-diffusion-xl-lightning`
 We take the maximum of the image height and width to calculate pricing. For example, an image of 1024x768 would fall under 1024x1024 pricing.
 
-| Image Size   | Price                 |
-| ------------ | --------------------- |
-| \<=256x256   | $0.00025 per 5 steps  |
-| \<=512x512   | $0.0005 per 5 steps   |
-| \<=1024x1024 | $0.001 per 5 steps    |
-| \<=2048x2048 | $0.002 per 5 steps    |
+| Image Size   | Price                |
+| ------------ | -------------------- |
+| \<=256x256   | $0.00025 per 5 steps |
+| \<=512x512   | $0.0005 per 5 steps  |
+| \<=1024x1024 | $0.001 per 5 steps   |
+| \<=2048x2048 | $0.002 per 5 steps   |
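Reviewer note: the "maximum of height and width" rule combined with the table above can be sketched as follows. The names `IMAGE_TIERS` and `estimateImageCost` are hypothetical. The sketch also assumes that steps are billed proportionally rather than rounded up to whole blocks of 5, which the docs text does not state either way.

```typescript
// Illustrative sketch only: rates copied from the image pricing table above.
// All names here are hypothetical, not part of any Workers AI API.
const IMAGE_TIERS = [
  { maxSide: 256, usdPer5Steps: 0.00025 },
  { maxSide: 512, usdPer5Steps: 0.0005 },
  { maxSide: 1024, usdPer5Steps: 0.001 },
  { maxSide: 2048, usdPer5Steps: 0.002 },
];

function estimateImageCost(width: number, height: number, steps: number): number {
  // The larger of the two dimensions selects the pricing tier.
  const longestSide = Math.max(width, height);
  const tier = IMAGE_TIERS.find((t) => longestSide <= t.maxSide);
  if (tier === undefined) {
    throw new RangeError(`No listed price for images larger than 2048x2048: ${width}x${height}`);
  }
  // Assumption: steps are billed proportionally, not in whole blocks of 5.
  return (steps / 5) * tier.usdPer5Steps;
}
```

For instance, a 1024x768 image falls under the 1024x1024 tier, so 20 steps would come to four blocks of 5 steps at $0.001 each under these assumptions.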
 
 ## Speech-to-text
+
 Speech-to-text models like `@cf/openai/whisper` are billed on minutes of audio input.
 
 | Price |
 | --------------------------------- |
 | $0.0039 per minute of audio input |
 
-
 ## Free Allocation
 
 Our free allocation allows anyone to use Workers AI up to a certain limit per day. To use more than the free allocation, upgrade to the Workers Paid plan, where you will be charged on any usage above the free tier based on the pricing structure above.
 
-
 | Model | Free tier size |
 | --------------------- | -------------------------------------------- |
 | Text Generation - LLM | 10,000 tokens a day across any model size |
@@ -80,7 +81,6 @@ Our free allocation allows anyone to use Workers AI up to a certain limit per da
 
 All limits reset daily at 00:00 UTC. If you exceed any one of the above limits, further operations will fail with an error.
 
-
 ## Archived Pricing
 
 Workers AI was previously metered by Neurons. We deprecated this in favor of unit-based pricing on September 26, 2024. We wanted to make it simple for people to compare and contrast Workers AI with other providers, and we also generally updated pricing to be cheaper with these new units.

src/pages/workers-ai/models/[name].astro

Lines changed: 16 additions & 0 deletions
@@ -152,6 +152,22 @@ const starlightPageProps = {
 {terms && <a href={terms.value}>Terms and License</a>}
 <ModelBadges model={model} />
 
+{
+  model.name === "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b" && (
+    <Aside>
+      <p>
+        The pricing for this model differs from the{" "}
+        <a href="/workers-ai/platform/pricing/">normal pricing</a> for Workers
+        AI. The pricing for
+        <code>@cf/deepseek-ai/deepseek-r1-distill-qwen-32b</code> is:
+        <ul>
+          <li>$0.49686894 per 1M input tokens</li>
+          <li>$4.88132077 per 1M output tokens</li>
+        </ul>
+      </p>
+    </Aside>
+  )
+}
 {
   model.name === "@cf/meta/llama-3.2-11b-vision-instruct" && (
     <Aside>
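Reviewer note: unlike the blended-token tiers on the pricing page, the rates added here for `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b` bill input and output tokens separately. A minimal TypeScript sketch of that calculation follows. The constant and function names are hypothetical, and only the two per-million rates come from this commit.

```typescript
// Illustrative sketch only: the two rates are taken from the Aside added in this commit.
// All names here are hypothetical, not part of any Workers AI API.
const DEEPSEEK_R1_DISTILL_QWEN_32B_RATES = {
  usdPerMillionInputTokens: 0.49686894,
  usdPerMillionOutputTokens: 4.88132077,
};

function estimateDeepseekR1DistillCost(inputTokens: number, outputTokens: number): number {
  const rates = DEEPSEEK_R1_DISTILL_QWEN_32B_RATES;
  // Input and output tokens are priced separately for this model.
  const inputCost = (inputTokens / 1_000_000) * rates.usdPerMillionInputTokens;
  const outputCost = (outputTokens / 1_000_000) * rates.usdPerMillionOutputTokens;
  return inputCost + outputCost;
}
```

Because the output rate is roughly ten times the input rate, the estimate for this model depends heavily on how many tokens a response contains, not only on the total token count.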
