update

mchenco · mchenco · commit e0274a890d01 · 2025-02-19T19:49:17.000-05:00
diff --git a/src/content/docs/workers-ai/platform/pricing.mdx b/src/content/docs/workers-ai/platform/pricing.mdx
@@ -7,80 +7,55 @@ sidebar:
 
 :::note
 
-Workers AI has deprecated the usage of neurons in favor of unit-based pricing. The Cloudflare dashboards will be migrated this unit-based pricing soon so you can track your usage. Individual model pages will soon document the price for each model. We also made pricing cheaper!
-
-We will begin billing for all models under this new pricing structure beginning November 1, 2024.
+Workers AI has updated its pricing to be more granular, with unit-based pricing for each model, while still billing in Neurons on the back end. A blog post is coming.
 
 :::
 
-Workers AI is included in both the [Free and Paid Workers plans](/workers/platform/pricing/) and is priced based on model task, model size, and units.
-
-Individual model pages will have the pricing listed on them, but the general pricing structure across our models is laid out below.
-
-These docs will be updated as we add new pricing for new task types in our model catalog.
-
-## Pricing Structure
-
-Some models may have specific pricing. For specific details, check the page of the [specific model](/workers-ai/models/).
-
-### Text Generation LLMs (incl Vision models)
-
-Model size is measured in parameters.
-Pricing is based on blended tokens (input + output).
-Vision models will convert the image input into tokens for billing. Depending on size and aspect ratio, images will be charged for between 1,601 and 6,404 tokens. Most images that are more that 224 pixels wide or tall will be charged as 6,404 tokens each.
-
-| Model Size  | Pricing                  |
-| ----------- | ------------------------ |
-| \<= 3B      | $0.10 per Million Tokens |
-| 3.1B - 8B   | $0.15 per Million Tokens |
-| 8.1B - 20B  | $0.20 per Million Tokens |
-| 20.1B - 40B | $0.50 per Million Tokens |
-| 40.1B+      | $0.75 per Million Tokens |
-
-### Embeddings
+Workers AI is included in both the [Free and Paid Workers plans](/workers/platform/pricing/) and is priced at **$0.011 / 1,000 Regular Twitch Neurons** (also known as Neurons).
 
-Model size is measured in parameters.
-Pricing is based on input tokens.
+Our free allocation allows anyone to use a total of **10,000 Neurons per day** at no charge.
 
-| Model Size          | Pricing                   |
-| ------------------- | ------------------------- |
-| \<= 150M parameters | $0.008 per Million Tokens |
-| 151M+ parameters    | $0.015 per Million Tokens |
+To use more than 10,000 Neurons per day, you need to sign up for the [Workers Paid plan](/workers/platform/pricing/#workers). On Workers Paid, you will be charged at $0.011 / 1,000 Neurons for any usage above the free allocation of 10,000 Neurons per day.
 
-## Image Generation
-
-Standard models are large image models such as `@cf/stabilityai/stable-diffusion-xl-base-1.0`
-Fast models are usually smaller image models that require fewer steps to generate an image, such as `@cf/black-forest-labs/flux-1-schnell` and `@cf/bytedance/stable-diffusion-xl-lightning`
-We take the maximum of the image height and width to calculate pricing. For example, an image of 1024x768 would fall under 1024x1024 pricing.
-
-| Image Size   | Price                |
-| ------------ | -------------------- |
-| \<=256x256   | $0.00025 per 5 steps |
-| \<=512x512   | $0.0005 per 5 steps  |
-| \<=1024x1024 | $0.001 per 5 steps   |
-| \<=2048x2048 | $0.002 per 5 steps   |
-
-## Speech-to-text
-
-Speech-to-text models like `@cf/openai/whisper` are billed on minutes of audio input.
-
-| Price                             |
-| --------------------------------- |
-| $0.0039 per minute of audio input |
-
-## Free Allocation
-
-Our free allocation allows anyone to use Workers AI up to a certain limit per day. To use more than the free allocation, upgrade to the Workers Paid plan, where you will be charged on any usage above the free tier based on the pricing structure above.
-
-| Model                 | Free tier size                               |
-| --------------------- | -------------------------------------------- |
-| Text Generation - LLM | 10,000 tokens a day across any model size    |
-| Embeddings            | 10,000 tokens a day across any model size    |
-| Images                | Sum of 250 steps, up to 1024x1024 resolution |
-| Speech-to-text        | 10 minutes of audio a day                    |
+You can monitor your Neuron usage in the [Cloudflare Workers AI dashboard](https://dash.cloudflare.com/?to=/:account/ai/workers-ai). To estimate Neurons and costs, use the [pricing calculator](https://ai.cloudflare.com/#pricing-calculator).
 
 All limits reset daily at 00:00 UTC. If you exceed any one of the above limits, further operations will fail with an error.
 
-## Archived Pricing
-
-Workers AI was previously metered by Neurons. We deprecated this in favor of unit-based pricing on September 26, 2024. We wanted to make it simple for people to compare and contrast Workers AI with other providers, and we also generally updated pricing to be cheaper with these new units.
+|              | Free <br/> allocation  | Overage<br/>pricing           |
+| ------------ | ---------------------- | ----------------------------- |
+| Workers Free | 10,000 Neurons per day | N/A - Upgrade to Workers Paid |
+| Workers Paid | 10,000 Neurons per day | $0.011 / 1,000 Neurons        |
+
+## What are Neurons?
+
+Neurons are our way of measuring AI outputs across different models. To give you a sense of what you can accomplish with 10,000 Neurons, you can: generate 100-200 LLM responses, 500 translations, 500 seconds of speech-to-text audio, 10,000 text classifications, or 1,500 - 15,000 embeddings depending on which models you use. Our serverless model allows you to pay only for what you use without having to worry about renting, managing, or scaling GPUs.
+
+## LLM model pricing
+
+| Model                                        | Price in tokens                                           | Price in Neurons                                                       |
+| -------------------------------------------- | --------------------------------------------------------- | ---------------------------------------------------------------------- |
+| @cf/meta/llama-3.2-1b-instruct               | $0.027 per M input tokens <br> $0.201 per M output tokens | 2457 per M input tokens <br> 18252 per M output tokens                 |
+| @cf/meta/llama-3.2-3b-instruct               | $0.051 per M input tokens <br> $0.335 per M output tokens | 4625 per M input tokens <br> 30475 per M output tokens                 |
+| @cf/meta/llama-3.1-8b-instruct-fp8-fast      | $0.045 per M input tokens <br> $0.384 per M output tokens | 4119 per M input tokens <br> 34868 per M output tokens                 |
+| @cf/meta/llama-3.2-11b-vision-instruct       | $0.049 per M input tokens <br> $0.676 per M output tokens | 4410 per M input tokens <br> 61493 per M output tokens                 |
+| @cf/meta/llama-3.1-70b-instruct-fp8-fast     | $0.293 per M input tokens <br> $2.253 per M output tokens | 26668 per M input tokens <br> 204805 per M output tokens               |
+| @cf/meta/llama-3.3-70b-instruct-fp8-fast     | $0.293 per M input tokens <br> $2.253 per M output tokens | 26668 per M input tokens <br> 204805 per M output tokens               |
+| @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | $0.497 per M input tokens <br> $4.881 per M output tokens | 45170 per M input tokens <br> 443756 per M output tokens               |
+| @cf/mistral/mistral-7b-instruct-v0.1         | $0.110 per M input tokens <br> $0.190 per M output tokens | 10000 per M input tokens <br> 17300 per M output tokens                |
+| @cf/meta/llama-3.1-8b-instruct               | $0.282 per M input tokens <br> $0.827 per M output tokens | 25608 per M input tokens <br> 75147 per M output tokens                |
+| @cf/meta/llama-3.1-8b-instruct-fp8           | $0.152 per M input tokens <br> $0.287 per M output tokens | 13778 per M input tokens <br> 26128 per M output tokens                |
+| @cf/meta/llama-3.1-8b-instruct-awq           | $0.123 per M input tokens <br> $0.266 per M output tokens | 11161 per M input tokens <br> 24215 per M output tokens                |
+| @cf/meta/llama-3-8b-instruct                 | $0.282 per M input tokens <br> $0.827 per M output tokens | 25608 per M input tokens <br> 75147 per M output tokens                |
+| @cf/meta/llama-3-8b-instruct-awq             | $0.123 per M input tokens <br> $0.266 per M output tokens | 11161 per M input tokens <br> 24215 per M output tokens                |
+| @cf/meta/llama-2-7b-chat-fp16                | $0.556 per M input tokens <br> $6.667 per M output tokens | 50505 per M input tokens <br> 606061 per M output tokens               |
+
+## Other model pricing
+
+| Model                                 | Price                                                     | Price in Neurons                                        |
+| ------------------------------------- | --------------------------------------------------------- | ------------------------------------------------------- |
+| @cf/black-forest-labs/flux-1-schnell  | $0.0000528 per 512x512 tile <br> $0.0001056 per step      | 4.80 per 512x512 tile <br> 9.60 per step                |
+| @cf/huggingface/distilbert-sst-2-int8 | $0.026 per M input tokens                                 | 2394 per M input tokens                                 |
+| @cf/baai/bge-small-en-v1.5            | $0.020 per M input tokens                                 | 1841 per M input tokens                                 |
+| @cf/baai/bge-base-en-v1.5             | $0.067 per M input tokens                                 | 6058 per M input tokens                                 |
+| @cf/baai/bge-large-en-v1.5            | $0.204 per M input tokens                                 | 18582 per M input tokens                                |
+| @cf/meta/m2m100-1.2b                  | $0.342 per M input tokens <br> $0.342 per M output tokens | 31050 per M input tokens <br> 31050 per M output tokens |
+| @cf/microsoft/resnet-50               | $2.509 per M images                                       | 0.23 per image                                          |
+| @cf/openai/whisper                    | $0.0005 per audio minute                                  | 41.14 per audio minute                                  |
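
Reviewer note: the tables in this diff tie together through one rate, $0.011 per 1,000 Neurons, with the first 10,000 Neurons per day free. Below is a minimal sketch of that arithmetic for checking a row or estimating a day's bill. The function names are hypothetical, only two rows of the LLM table are copied in, and it assumes overage is billed linearly on Neurons above the daily free allocation on Workers Paid. Actual invoicing may round or aggregate differently.

```python
# Rates copied from the pricing page in this diff.
PRICE_PER_1000_NEURONS_USD = 0.011
FREE_NEURONS_PER_DAY = 10_000

# Neurons per million tokens, taken from two rows of the LLM pricing table.
NEURONS_PER_M_TOKENS = {
    "@cf/meta/llama-3.2-1b-instruct": {"input": 2457, "output": 18252},
    "@cf/meta/llama-3.1-8b-instruct-fp8-fast": {"input": 4119, "output": 34868},
}


def neurons_used(model: str, input_tokens: int, output_tokens: int) -> float:
    """Convert token counts to Neurons using the per-million-token rates."""
    rates = NEURONS_PER_M_TOKENS[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


def daily_overage_usd(neurons: float) -> float:
    """Cost of one day's usage above the free allocation.

    Assumption: linear billing on the excess, Workers Paid plan only.
    On Workers Free, requests past the limit fail instead of being billed.
    """
    billable = max(0.0, neurons - FREE_NEURONS_PER_DAY)
    return billable * PRICE_PER_1000_NEURONS_USD / 1000


if __name__ == "__main__":
    used = neurons_used("@cf/meta/llama-3.2-1b-instruct", 1_000_000, 1_000_000)
    print(f"{used:.0f} Neurons, ${daily_overage_usd(used):.4f} overage")
```

With one million input and one million output tokens on `@cf/meta/llama-3.2-1b-instruct`, this comes to 20,709 Neurons, of which 10,709 are billable, or roughly $0.12 for the day. The same conversion (Neurons times 0.000011) reproduces the dollar column of each table row to within rounding.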