
APIs run by the companies that train or fine-tune the models themselves.

### [Cohere](https://dashboard.cohere.com/api-keys) 🇨🇦

Free "Trial" API key, no credit card. 1,000 API calls/month. Non-commercial use only.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Command A (111B) | 256K | 4K | Text | 20 RPM |
| Command R+ | 128K | 4K | Text | 20 RPM |
| Command R | 128K | 4K | Text | 20 RPM |
| Command R7B | 128K | 4K | Text | 20 RPM |
| Embed 4 | — | — | Embeddings (Text + Image) | 2,000 inputs/min |
| Rerank 3.5 | — | — | Reranking | 10 RPM |

### [Google Gemini](https://aistudio.google.com/app/apikey) 🇺🇸

Free tier unavailable in the EU, UK, and Switzerland. Free-tier prompts may be used by Google to improve its products. [^1]

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Gemini 2.5 Flash | 1M | 65K | Text + Image + Audio + Video | 10 RPM, 250 RPD |
| Gemini 2.5 Flash-Lite | 1M | 65K | Text + Image + Audio + Video | 15 RPM, 1,000 RPD |

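Gemini also exposes an OpenAI-compatible Chat Completions surface, so the same request shape used elsewhere in this list works here too. A minimal stdlib-only sketch that builds (but does not send) such a request — the base URL follows Google's published OpenAI-compatibility path, but verify it against their current docs before relying on it:

```python
import json
import os
import urllib.request

# OpenAI-compatible base URL per Google's compatibility docs (verify before use).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a Chat Completions request without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request("gemini-2.5-flash", "Hello!",
                         os.environ.get("GEMINI_API_KEY", "sk-test"))
# urllib.request.urlopen(req)  # uncomment to actually send (needs a valid key)
```

The same helper works against any OpenAI-compatible provider below by swapping `BASE_URL`.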
### [Mistral AI](https://console.mistral.ai/api-keys) 🇫🇷

Free "Experiment" plan, no credit card. ~1B tokens/month.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Mistral Small 4 | 256K | 256K | Text + Image + Code | ~1 RPS, 500K TPM |
| Mistral Medium 3 | 128K | 128K | Text | ~1 RPS, 500K TPM |
| Mistral Large 3 | 256K | 256K | Text | ~1 RPS, 500K TPM |
| Mistral Nemo (12B) | 128K | 128K | Text | ~1 RPS, 500K TPM |
| Codestral | 256K | 256K | Code | ~1 RPS, 500K TPM |
| Pixtral Large | 128K | 128K | Text + Image | ~1 RPS, 500K TPM |

### [Z AI (Zhipu AI)](https://open.bigmodel.cn/usercenter/apikeys) 🇨🇳

Permanently free models, no credit card required.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| GLM-4.7-Flash | 200K | 128K | Text | 1 concurrent request |
| GLM-4.5-Flash | 128K | ~8K | Text | 1 concurrent request |
| GLM-4.6V-Flash | 128K | ~4K | Text + Image | 1 concurrent request |

## Inference providers

Third-party platforms that host open-weight models from various sources.

### [Cerebras](https://cloud.cerebras.ai/) 🇺🇸

Free tier, no credit card. Ultra-fast inference (~2,600 tok/s). 1M tokens/day cap.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| llama3.1-8b | 128K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| gpt-oss-120b | 128K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| qwen-3-235b-a22b-instruct-2507 | 131K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| zai-glm-4.7 | 128K (8K on free) | 8K | Text | 10 RPM, 100 RPD, 1M TPD |

### [Cloudflare Workers AI](https://dash.cloudflare.com/profile/api-tokens) 🇺🇸

10,000 Neurons/day free. 50+ models available on the free tier.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| @cf/meta/llama-3.3-70b-instruct-fp8-fast | 131K | Shared w/ context | Text | 10K neurons/day (shared) |
| @cf/meta/llama-3.1-8b-instruct-fp8-fast | 131K | Shared w/ context | Text | 10K neurons/day (shared) |
| @cf/meta/llama-3.2-11b-vision-instruct | 131K | Shared w/ context | Text + Vision | 10K neurons/day (shared) |
| @cf/meta/llama-4-scout-17b-16e-instruct | Up to 10M | Shared w/ context | Multimodal | 10K neurons/day (shared) |
| @cf/mistralai/mistral-small-3.1-24b-instruct | 128K | Shared w/ context | Text | 10K neurons/day (shared) |
| @cf/google/gemma-4-26b-a4b-it | 256K | Shared w/ context | Text | 10K neurons/day (shared) |
| @cf/qwen/qwq-32b | 32K | Shared w/ context | Text | 10K neurons/day (shared) |
| @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | Shared w/ context | Text | 10K neurons/day (shared) |
| + 42 more models | Varies | Varies | Text, Image, Audio, Embeddings | 10K neurons/day (shared) |

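Workers AI differs from the OpenAI-style providers in that the model ID is part of the URL path rather than the request body. A stdlib-only sketch of the documented REST shape (`/accounts/{account_id}/ai/run/{model}`), building the request without sending it — the account ID and token are placeholders:

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"

def workers_ai_request(account_id: str, model: str, prompt: str,
                       api_token: str) -> urllib.request.Request:
    """Build (but do not send) a Workers AI run request; model goes in the URL."""
    url = f"{API_BASE}/accounts/{account_id}/ai/run/{model}"
    body = {"messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = workers_ai_request("YOUR_ACCOUNT_ID",
                         "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
                         "Hi", "YOUR_TOKEN")
# urllib.request.urlopen(req)  # uncomment with real credentials
```
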
### [GitHub Models](https://github.com/marketplace/models) 🇺🇸

Free prototyping for all GitHub users. 45+ models. Per-request limits (8K in / 4K out).

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| gpt-4.1 | 1M | 32K | Text | 10 RPM, 50 RPD |
| gpt-4.1-mini | 1M | 32K | Text | 15 RPM, 150 RPD |
| gpt-4o | 128K | 16K | Text + Vision | 10 RPM, 50 RPD |
| o3-mini | 200K | 100K | Text (reasoning) | 10 RPM, 50 RPD |
| o4-mini | 200K | 100K | Text (reasoning) | 10 RPM, 50 RPD |
| Llama-4-Scout-17B-16E | 512K | ~4K | Text + Vision | 15 RPM, 150 RPD |
| Llama-4-Maverick-17B-128E | 256K | ~4K | Text + Vision | 10 RPM, 50 RPD |
| Meta-Llama-3.3-70B | 131K | ~4K | Text | 15 RPM, 150 RPD |
| DeepSeek-R1 | 64K | 8K | Text (reasoning) | 15 RPM, 150 RPD |
| Mistral-Small-3.1 | 128K | ~4K | Text + Vision | 15 RPM, 150 RPD |
| + 35 more models | Varies | Varies | Text / Image | Varies by tier |

### [Groq](https://console.groq.com/keys) 🇺🇸

Free tier, no credit card. Ultra-fast LPU inference. [^2]

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| llama-3.3-70b-versatile | 131K | 32K | Text | 30 RPM, 14,400 RPD |
| llama-3.1-8b-instant | 131K | 131K | Text | 30 RPM, 14,400 RPD |
| llama-4-scout-17b-16e-instruct | 131K | 8K | Text + Vision | 30 RPM, 14,400 RPD |
| llama-4-maverick-17b-128e-instruct | 131K | 8K | Text + Vision | 15 RPM, 500 RPD |
| qwen3-32b | 131K | 131K | Text | 30 RPM, 14,400 RPD |
| gpt-oss-120b | 131K | 32K | Text | 30 RPM, 14,400 RPD |
| kimi-k2-instruct | 262K | 262K | Text | 30 RPM, 14,400 RPD |
| deepseek-r1-distill-70b | 131K | 8K | Text | 30 RPM, 14,400 RPD |
| whisper-large-v3 | — | — | Audio → Text | 20 RPM, 2,000 RPD |
| whisper-large-v3-turbo | — | — | Audio → Text | 20 RPM, 2,000 RPD |

### [Hugging Face](https://huggingface.co/settings/tokens) 🇺🇸

Free Serverless Inference API plus ~$0.10/month in free credits. Thousands of models.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Meta-Llama-3.1-8B-Instruct | 128K | ~4K | Text | ~1,000 RPD |
| Mistral-7B-Instruct-v0.3 | 32K | ~4K | Text | ~1,000 RPD |
| Mixtral-8x7B-Instruct-v0.1 | 32K | ~4K | Text | ~1,000 RPD |
| Phi-3.5-mini-instruct | 128K | ~4K | Text | ~1,000 RPD |
| Qwen2.5-7B-Instruct | 131K | ~4K | Text | ~1,000 RPD |
| + thousands of community models | Varies | Varies | Text, Image, Audio, Embeddings | ~$0.10/month free credits |

### [Kilo Code](https://kilo.ai) 🇺🇸

Free models, no credit card required. ~29 rotating free models. Base URL: `https://api.kilo.ai/api/gateway`. [^5]

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| qwen/qwen3.6-plus-preview:free | 1M | 32K | Text + Image + Video | ~200 req/hr |
| nvidia/nemotron-3-super-120b-a12b:free | 262K | 32K | Text | ~200 req/hr |
| stepfun/step-3.5-flash:free | 256K | 65K | Text | ~200 req/hr |
| corethink:free | ~78K | ~8K | Text | ~200 req/hr |
| minimax/minimax-m2.5:free | 196K | 196K | Text | ~200 req/hr |
| arcee-ai/trinity-large-preview:free | 128K | ~32K | Text | ~200 req/hr |
| kwaipilot/kat-coder-pro:free | 262K | 128K | Text (code) | ~200 req/hr |
| qwen/qwen3-coder:free | 262K | 262K | Text (code) | ~200 req/hr |
| google/gemma-4-31b-it:free | 262K | — | Multimodal | ~200 req/hr |
| deepseek/deepseek-r1-0528:free | — | — | Text (reasoning) | ~200 req/hr |
| meta-llama/llama-3.3-70b-instruct:free | — | — | Text | ~200 req/hr |
| + ~18 more rotating free models | Varies | Varies | Text / Image | ~200 req/hr |

### [LLM7.io](https://token.llm7.io) 🇬🇧

Zero-friction API gateway. No registration needed for basic access. 30+ models.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| deepseek-r1-0528 | — | — | Text (reasoning) | 30 RPM (120 with token) |
| deepseek-v3-0324 | — | — | Text | 30 RPM (120 with token) |
| gemini-2.5-flash-lite | — | — | Text + Vision | 30 RPM (120 with token) |
| gpt-4o-mini | — | — | Text + Vision | 30 RPM (120 with token) |
| mistral-small-3.1-24b | 32K | — | Text | 30 RPM (120 with token) |
| qwen2.5-coder-32b | — | — | Text (code) | 30 RPM (120 with token) |
| + ~24 more models | Varies | Varies | Text | 30 RPM (120 with token) |

### [NVIDIA NIM](https://build.nvidia.com/explore/discover) 🇺🇸

Free with NVIDIA Developer Program membership. 100+ models. No daily token cap.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| deepseek-ai/deepseek-r1 | 128K | ~163K | Text (reasoning) | ~40 RPM |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | 128K | 4K | Text | ~40 RPM |
| nvidia/nemotron-3-super-120b-a12b | 262K | 262K | Text | ~40 RPM |
| nvidia/nemotron-3-nano-30b-a3b | 128K | 32K | Text | ~40 RPM |
| meta/llama-3.1-405b-instruct | 128K | 4K | Text | ~40 RPM |
| qwen/qwen2.5-72b-instruct | 128K | 8K | Text | ~40 RPM |
| google/gemma-4-31b | 128K | 8K | Text | ~40 RPM |
| mistralai/mistral-large-2-instruct | 128K | 4K | Text | ~40 RPM |
| nvidia/nemotron-nano-2-vl | 128K | 8K | Vision + Text + Video | ~40 RPM |
| minimax/minimax-m2.7 | 128K | 8K | Text | ~40 RPM |
| + 90 more models | Varies | Varies | Text, Image, Video, Speech, Embeddings | ~40 RPM |

### [Ollama Cloud](https://ollama.com/settings/keys) 🇺🇸

Free tier with qualitative usage limits. 400+ models from the Ollama library. Not OpenAI SDK-compatible; uses the [Ollama API](https://docs.ollama.com/cloud). [^3]

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| llama3.1:cloud | 128K | Model-dependent | Text | Session/weekly limits (unpublished) |
| deepseek-r1:cloud | 128K | Model-dependent | Text (reasoning) | Session/weekly limits (unpublished) |
| qwen2.5:cloud | 128K | Model-dependent | Text | Session/weekly limits (unpublished) |
| gemma2:cloud | 8K | Model-dependent | Text | Session/weekly limits (unpublished) |
| mistral:cloud | 32K | Model-dependent | Text | Session/weekly limits (unpublished) |
| + 400 more models | Varies | Varies | Text | Session/weekly limits (unpublished) |

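Since Ollama Cloud is the one provider here that is not OpenAI SDK-compatible, its request body looks different: the native `/api/chat` endpoint streams by default and takes `"stream": false` to disable that. A stdlib-only sketch, with the cloud host assumed from the docs link above (verify before use); the request is built but not sent:

```python
import json
import urllib.request

# Ollama's native chat endpoint (not OpenAI-compatible). The cloud host here
# is an assumption based on the docs linked above -- confirm before relying on it.
OLLAMA_URL = "https://ollama.com/api/chat"

def ollama_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # Ollama streams responses by default
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = ollama_chat_request("deepseek-r1:cloud", "Hi", "YOUR_KEY")
# urllib.request.urlopen(req)  # uncomment with a real key
```
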
### [OpenRouter](https://openrouter.ai/keys) 🇺🇸

35+ free models (marked with the `:free` suffix). OpenAI SDK-compatible. [^4]

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| deepseek/deepseek-r1-0528:free | 163K | ~163K | Text (reasoning) | 20 RPM, 200 RPD |
| deepseek/deepseek-chat-v3-0324:free | 163K | 163K | Text | 20 RPM, 200 RPD |
| qwen/qwen3.6-plus:free | 1M | 65K | Text | 20 RPM, 200 RPD |
| qwen/qwen3-coder-480b-a35b:free | 262K | ~32K | Text | 20 RPM, 200 RPD |
| meta-llama/llama-4-scout:free | 10M | 16K | Multimodal | 20 RPM, 200 RPD |
| meta-llama/llama-4-maverick:free | 1M | 16K | Multimodal | 20 RPM, 200 RPD |
| meta-llama/llama-3.3-70b-instruct:free | 65K | ~16K | Text | 20 RPM, 200 RPD |
| google/gemma-4-31b-it:free | 256K | ~8K | Multimodal | 20 RPM, 200 RPD |
| nvidia/nemotron-3-super-120b-a12b:free | 1M | ~32K | Text | 20 RPM, 200 RPD |
| openai/gpt-oss-120b:free | 131K | 131K | Text | 20 RPM, 200 RPD |
| minimax/minimax-m2.5:free | 196K | 8K | Text | 20 RPM, 200 RPD |
| mistralai/devstral-2512:free | 256K | ~32K | Text | 20 RPM, 200 RPD |
| + ~23 more free models | Varies | Varies | Text / Image | 20 RPM, 200 RPD |

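OpenRouter's model-fallback routing (see [^4]) lets one request name a primary model plus backups tried in order when the primary is unavailable — useful with free models, which are often at capacity. A stdlib-only sketch of the request body; the fallback model choices are arbitrary examples, and the request is built but not sent:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def openrouter_request(prompt: str, api_key: str) -> urllib.request.Request:
    payload = {
        # Primary model, plus fallbacks tried in order if it is unavailable.
        "model": "deepseek/deepseek-r1-0528:free",
        "models": [
            "meta-llama/llama-3.3-70b-instruct:free",
            "openai/gpt-oss-120b:free",
        ],
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = openrouter_request("Hi", "YOUR_KEY")
# urllib.request.urlopen(req)  # uncomment with a real key
```
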
### [SiliconFlow](https://cloud.siliconflow.cn/account/ak) 🇨🇳

Free tier with 14 CNY signup credits. Permanently free models available.

| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Qwen/Qwen3-8B | 131K | 131K | Text | 1,000 RPM, 50K TPM |
| deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | ~33K | 16K | Text (reasoning) | 1,000 RPM, 50K TPM |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | Configurable | Text (reasoning) | 1,000 RPM, 50K TPM |
| THUDM/glm-4-9b-chat | 32K | 32K | Text | 1,000 RPM, 50K TPM |
| THUDM/GLM-4.1V-9B-Thinking | 66K | 66K | Vision + Text | 1,000 RPM, 50K TPM |
| deepseek-ai/DeepSeek-OCR | — | 8K | Vision (OCR) | 1,000 RPM, 50K TPM |
| + embedding/speech models | Varies | Varies | Embeddings, Speech | 1,000 RPM, 50K TPM |

## Contributing

Know a free tier that's missing? [Open a PR](contributing.md). Include the provider, endpoint, rate limits (link to their docs), and a few notable models. Trial credits and time-limited promos don't count.

## Glossary

| Abbreviation | Meaning |
|---|---|
| **RPM** | Requests per minute |
| **RPD** | Requests per day |
| **TPM** | Tokens per minute |
| **TPD** | Tokens per day |
| **RPS** | Requests per second |

## Notes

- All endpoints are OpenAI SDK-compatible unless noted.
- Each link points to the provider's API key page.
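
Because the OpenAI-compatible providers all share one request shape, switching between them is usually just a base-URL change. A stdlib-only sketch — the base URLs below are the commonly documented ones, but confirm each against the provider's own docs; requests are built, not sent:

```python
import json
import urllib.request

# Commonly documented OpenAI-compatible base URLs (verify against each
# provider's docs before use; paths can change).
BASE_URLS = {
    "groq": "https://api.groq.com/openai/v1",
    "cerebras": "https://api.cerebras.ai/v1",
    "mistral": "https://api.mistral.ai/v1",
}

def chat_request(provider: str, model: str, prompt: str,
                 api_key: str) -> urllib.request.Request:
    """Same request shape for every OpenAI-compatible provider."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URLS[provider]}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```
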

[^1]: Free tier not available in the EU, UK, or Switzerland ([available regions](https://ai.google.dev/gemini-api/docs/available-regions)).
[^2]: Groq rate limits vary by model. Llama 4 Maverick is limited to 500 RPD; most other models get 14,400 RPD ([rate limits](https://console.groq.com/docs/rate-limits)).
[^3]: Ollama Cloud measures usage by GPU time, not tokens or requests. The free tier is described as "light usage," with session limits resetting every 5 hours and weekly limits every 7 days. Pro (50x more) and Max (250x more) plans available. Not OpenAI SDK-compatible; uses the [Ollama API](https://docs.ollama.com/cloud).
[^4]: Free models default to 200 RPD. A one-time purchase of $10+ in credits unlocks 1,000 RPD for free models. OpenRouter also offers a [Free Models Router](https://openrouter.ai/docs/guides/routing/routers/free-models-router) (`openrouter/free`) and [model fallbacks](https://openrouter.ai/docs/guides/routing/model-fallbacks) for chaining models in priority order.
[^5]: The Kilo Code free-model list is dynamic and rotates based on partnerships. Free-model prompts are logged for product improvement. Auto-router available: `kilo-auto/free`.