import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Bedrock Batches

Use the Amazon Bedrock Batch Inference API through LiteLLM.

| Property | Details |
|----------|---------|
| Description | Amazon Bedrock Batch Inference allows you to run inference on large datasets asynchronously |
| Provider Doc | [AWS Bedrock Batch Inference ↗](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html) |
## Overview

Use this to:

- Run batch inference on large datasets with Bedrock models
- Control batch model access by key/user/team (same as chat completion models)
- Manage S3 storage for batch input/output files

## (Proxy Admin) Usage

Here's how to give developers access to your Bedrock Batch models.

### 1. Setup config.yaml

- Specify `mode: batch` for each model, so developers know it is a batch model
- Configure the S3 bucket and AWS credentials used for batch operations
```yaml showLineNumbers title="litellm_config.yaml"
model_list:
  - model_name: "bedrock-batch-claude"
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      #########################################################
      ########## batch specific params ########################
      s3_bucket_name: litellm-proxy
      s3_region_name: us-west-2
      s3_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      s3_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      aws_batch_role_arn: arn:aws:iam::888602223428:role/service-role/AmazonBedrockExecutionRoleForAgents_BB9HNW6V4CV
    model_info:
      mode: batch # 👈 SPECIFY MODE AS BATCH, to tell developers this is a batch model
```

**Required Parameters:**

| Parameter | Description |
|-----------|-------------|
| `s3_bucket_name` | S3 bucket for batch input/output files |
| `s3_region_name` | AWS region of the S3 bucket |
| `s3_access_key_id` | AWS access key for the S3 bucket |
| `s3_secret_access_key` | AWS secret key for the S3 bucket |
| `aws_batch_role_arn` | IAM role ARN for Bedrock batch operations. The Bedrock Batch APIs require an IAM role ARN to be set. |
| `mode: batch` | Indicates to LiteLLM this is a batch model |

### 2. Create Virtual Key

```bash showLineNumbers title="create_virtual_key.sh"
curl -L -X POST 'https://{PROXY_BASE_URL}/key/generate' \
-H 'Authorization: Bearer ${PROXY_API_KEY}' \
-H 'Content-Type: application/json' \
-d '{"models": ["bedrock-batch-claude"]}'
```

You can now use the virtual key to access the batch models (see the Developer flow below).

## (Developer) Usage

Here's how to create a LiteLLM managed file and run Bedrock Batch CRUD operations with it.

### 1. Create request.jsonl

- Check which models are available via `/model_group/info` (see the sketch below)
- Look for models with `mode: batch`
- Set `model` in the .jsonl to a model name from `/model_group/info`

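For example, here is a minimal sketch of that check, using `requests` directly against the proxy (the base URL and key are placeholders, and the exact `/model_group/info` response shape may vary across LiteLLM versions):

```python showLineNumbers title="list_batch_models.py"
import requests

# Ask the proxy which model groups this virtual key can access
resp = requests.get(
    "http://localhost:4000/model_group/info",     # your proxy base URL
    headers={"Authorization": "Bearer sk-1234"},  # your virtual key
)
resp.raise_for_status()

# Keep only the model groups flagged as batch models (mode: batch in model_info)
batch_models = [
    group["model_group"]
    for group in resp.json().get("data", [])
    if group.get("mode") == "batch"
]
print(batch_models)  # e.g. ['bedrock-batch-claude']
```
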
```json showLineNumbers title="bedrock_batch_completions.jsonl"
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "bedrock-batch-claude", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello world!"}], "max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "bedrock-batch-claude", "messages": [{"role": "system", "content": "You are an unhelpful assistant."}, {"role": "user", "content": "Hello world!"}], "max_tokens": 1000}}
```

Expectation:

- LiteLLM translates this to the Bedrock deployment-specific value (e.g. `bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0`)

### 2. Upload File

Specify `target_model_names: "<model-name>"` to enable LiteLLM managed files and request validation.

The model name should match the `model` used in request.jsonl.

<Tabs>
<TabItem value="python" label="Python">

```python showLineNumbers title="bedrock_batch.py"
from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:4000",
    api_key="sk-1234",
)

# Upload file
batch_input_file = client.files.create(
    file=open("./bedrock_batch_completions.jsonl", "rb"), # {"model": "bedrock-batch-claude"} <-> {"model": "bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0"}
    purpose="batch",
    extra_body={"target_model_names": "bedrock-batch-claude"}
)
print(batch_input_file)
```

</TabItem>
<TabItem value="curl" label="Curl">

```bash showLineNumbers title="Upload File"
curl http://localhost:4000/v1/files \
  -H "Authorization: Bearer sk-1234" \
  -F purpose="batch" \
  -F file="@bedrock_batch_completions.jsonl" \
  -F extra_body='{"target_model_names": "bedrock-batch-claude"}'
```

</TabItem>
</Tabs>

**Where is the file written?**

The file is written to the S3 bucket specified in your config and prepared for Bedrock batch inference.
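
If you want to confirm the upload before creating a batch, you can read the file's metadata back through the proxy. A minimal sketch continuing the Python example above (this assumes the proxy exposes the standard OpenAI `files.retrieve` endpoint for managed files):

```python showLineNumbers title="bedrock_batch.py"
...
# Confirm the upload by reading the file metadata back through the proxy
retrieved = client.files.retrieve(batch_input_file.id)
print(retrieved.id, retrieved.bytes, retrieved.purpose)
```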

### 3. Create the batch

<Tabs>
<TabItem value="python" label="Python">

```python showLineNumbers title="bedrock_batch.py"
...
# Create batch
batch = client.batches.create(
    input_file_id=batch_input_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={"description": "Test batch job"},
)
print(batch)
```

</TabItem>
<TabItem value="curl" label="Curl">

```bash showLineNumbers title="Create Batch Request"
curl http://localhost:4000/v1/batches \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-abc123",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "Test batch job"}
  }'
```

</TabItem>
</Tabs>

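### 4. Retrieve the batch and results (optional)

Once the batch is created, you can poll its status and download the output through the same OpenAI-compatible endpoints. A minimal sketch continuing the Python example above (status values follow the OpenAI Batch API; Bedrock batch jobs are asynchronous and can take hours, so poll sparingly):

```python showLineNumbers title="bedrock_batch.py"
...
import time

# Poll until Bedrock finishes processing the batch
while True:
    batch = client.batches.retrieve(batch.id)
    print(f"batch status: {batch.status}")
    if batch.status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(60)

# Download the results once the batch completes successfully
if batch.status == "completed" and batch.output_file_id:
    output = client.files.content(batch.output_file_id)
    print(output.text)  # one JSON line per request in the input file
```
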
## FAQ

### Where are my files written?

When `target_model_names` is specified, the file is written to the S3 bucket configured in your Bedrock batch model configuration.

### What models are supported?

LiteLLM currently only supports Bedrock Anthropic models for the Batch API. If you want other Bedrock models supported, file an issue [here](https://github.com/BerriAI/litellm/issues/new/choose).

## Further Reading

- [AWS Bedrock Batch Inference Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html)
- [LiteLLM Managed Batches](../proxy/managed_batches)
- [LiteLLM Authentication to Bedrock](https://docs.litellm.ai/docs/providers/bedrock#boto3---authentication)