diff --git a/src/content/docs/workers-ai/features/batch-api/index.mdx b/src/content/docs/workers-ai/features/batch-api/index.mdx
index 45ac9970b5d76d3..77044c1f827caae 100644
--- a/src/content/docs/workers-ai/features/batch-api/index.mdx
+++ b/src/content/docs/workers-ai/features/batch-api/index.mdx
@@ -33,9 +33,4 @@ This will create a repository in your GitHub account and deploy a ready-to-use W
 
 ## Supported Models
 
-- [@cf/meta/llama-3.3-70b-instruct-fp8-fast](/workers-ai/models/llama-3.3-70b-instruct-fp8-fast/)
-- [@cf/baai/bge-small-en-v1.5](/workers-ai/models/bge-small-en-v1.5/)
-- [@cf/baai/bge-base-en-v1.5](/workers-ai/models/bge-base-en-v1.5/)
-- [@cf/baai/bge-large-en-v1.5](/workers-ai/models/bge-large-en-v1.5/)
-- [@cf/baai/bge-m3](/workers-ai/models/bge-m3/)
-- [@cf/meta/m2m100-1.2b](/workers-ai/models/m2m100-1.2b/)
+Refer to our [model catalog](/workers-ai/models/?capabilities=Batch) for supported models.
diff --git a/src/content/docs/workers-ai/features/fine-tunes/loras.mdx b/src/content/docs/workers-ai/features/fine-tunes/loras.mdx
index 518480bb2a9a5c3..546a2874f19211a 100644
--- a/src/content/docs/workers-ai/features/fine-tunes/loras.mdx
+++ b/src/content/docs/workers-ai/features/fine-tunes/loras.mdx
@@ -17,18 +17,7 @@ Workers AI supports fine-tuned inference with adapters trained with [Low-Rank Ad
 
 ## Limitations
 
-- We only support LoRAs for the following models (must not be quantized):
-
-  - `@cf/meta/llama-3.2-11b-vision-instruct`
-  - `@cf/meta/llama-3.3-70b-instruct-fp8-fast`
-  - `@cf/meta/llama-guard-3-8b`
-  - `@cf/meta/llama-3.1-8b-instruct-fast (soon)`
-  - `@cf/deepseek-ai/deepseek-r1-distill-qwen-32b`
-  - `@cf/qwen/qwen2.5-coder-32b-instruct`
-  - `@cf/qwen/qwq-32b`
-  - `@cf/mistralai/mistral-small-3.1-24b-instruct`
-  - `@cf/google/gemma-3-12b-it`
-
+- We only support LoRAs for a [variety of models](/workers-ai/models/?capabilities=LoRA) (must not be quantized)
 - Adapter must be trained with rank `r <=8` as well as larger ranks if up to 32. You can check the rank of a pre-trained LoRA adapter through the adapter's `config.json` file
 - LoRA adapter file must be < 300MB
 - LoRA adapter files must be named `adapter_config.json` and `adapter_model.safetensors` exactly
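
The rank, size, and naming limits listed in the loras.mdx hunk above can be checked locally before uploading an adapter. A minimal sketch, assuming a PEFT-style adapter directory: the `./my-lora-adapter` path is hypothetical, and the rank being stored under the `r` key of `adapter_config.json` is an assumption about how the adapter was exported, not something the docs specify.

```python
import json
import os

# Hypothetical adapter directory -- replace with the path to your trained LoRA.
ADAPTER_DIR = "./my-lora-adapter"
CONFIG_PATH = os.path.join(ADAPTER_DIR, "adapter_config.json")
MODEL_PATH = os.path.join(ADAPTER_DIR, "adapter_model.safetensors")

MAX_RANK = 32                        # ranks r <= 8 always; larger ranks up to 32
MAX_SIZE_BYTES = 300 * 1024 * 1024   # adapter file must be < 300MB

# Assumption: PEFT-style adapter configs store the LoRA rank under the "r" key.
with open(CONFIG_PATH) as f:
    rank = json.load(f).get("r")

size = os.path.getsize(MODEL_PATH)
print(f"rank={rank}, size={size / (1024 * 1024):.1f} MB")

if rank is None or rank > MAX_RANK:
    raise SystemExit("LoRA rank missing or above the supported maximum (32)")
if size >= MAX_SIZE_BYTES:
    raise SystemExit("adapter_model.safetensors must be < 300MB")

print("Adapter is within the documented Workers AI LoRA limits.")
```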