Commit c77a4cb

bene2k1 and fpagny authored
Apply suggestions from code review
Co-authored-by: fpagny <franckpagny@hotmail.fr>
1 parent bc707ac commit c77a4cb

1 file changed: +32 -46 lines changed
pages/generative-apis/how-to/use-batch-processing.mdx

Lines changed: 32 additions & 46 deletions
@@ -21,26 +21,26 @@ Batch processing is designed for:
2121
<Requirements />
2222

2323
- A Scaleway account logged into the [console](https://console.scaleway.com)
24-
- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization
24+
- [Owner](/iam/concepts/#owner) status or [IAM permissions](/iam/concepts/#permission) allowing you to perform actions in the intended Organization. The required IAM permissions are `GenerativeApisFullAccess`, `ObjectStorageObjectsRead`, and `ObjectStorageObjectsWrite`.
2525
- A valid [API key](/iam/how-to/create-api-keys/) for API authentication
2626
- Python 3.7+ installed on your system
2727
- If bucket policies exist on the bucket storing your `.jsonl` file, they must allow the `s3:GetObject` and `s3:PutObject` actions for the `scw-managed-genapi-batch` IAM application. Scaleway generates this application automatically after your first batch creation.
2828

2929
## How batch processing works
3030

31-
1. Upload a JSONL input file to an Object Storage bucket.
31+
1. Upload a JSONL input file to an Object Storage bucket. This file contains all API queries to perform.
3232
2. Create a batch job referencing this file.
3333
3. The service processes each line as an individual request.
34-
4. The results are written to an output file in the same bucket, and named `{filename}-output.jsonl` and `{filename}-error.jsonl`
34+
4. The results are written to the same bucket, in files named `{filename}_output.jsonl` and `{filename}_error.jsonl`.
3535
5. Retrieve the output file once the job status is `completed`.
3636

3737
Each line of the input file must be a valid JSON object representing a request to the Generative API.
3838

3939
Example `input.jsonl`:
4040

4141
```json
42-
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "voxtral-small-24b-2507", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Translate this into French: Hello world!"}],"max_completion_tokens": 500}}
43-
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "voxtral-small-24b-2507", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Write a poem about the ocean."}],"max_completion_tokens": 500}}
42+
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "mistral-small-3.2-24b-instruct-2506", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Translate this into French: Hello world!"}],"max_completion_tokens": 500}}
43+
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "mistral-small-3.2-24b-instruct-2506", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Write a poem about the ocean."}],"max_completion_tokens": 500}}
4444
```
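An input file in this format can also be generated from a list of prompts. The following is a minimal sketch, not part of the original page: the helper names `build_batch_lines` and `write_jsonl` are hypothetical, and the request fields simply mirror the example above.

```python
import json


def build_batch_lines(prompts, model,
                      system_prompt="You are a helpful assistant.",
                      max_completion_tokens=500):
    """Return one JSON string per request, mirroring the example input format."""
    lines = []
    for index, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"request-{index}",  # used to match outputs to inputs
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": prompt},
                ],
                "max_completion_tokens": max_completion_tokens,
            },
        }
        # json.dumps escapes newlines, so each request stays on a single line
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines


def write_jsonl(path, lines):
    """Write one JSON document per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


batch_lines = build_batch_lines(
    ["Translate this into French: Hello world!", "Write a poem about the ocean."],
    model="mistral-small-3.2-24b-instruct-2506",
)
# write_jsonl("input.jsonl", batch_lines)  # then upload the file to your bucket
```

Generating `custom_id` values from the request index keeps them unique within the file, which makes it straightforward to match each output line to its request.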
4545

4646
<Message type="note">
@@ -51,7 +51,7 @@ Example `input.jsonl`:
5151

5252
Batch processing relies on an Object Storage bucket. You can use any of your existing Object Storage buckets for batch processing or create a new one.
5353

54-
To use your bucket with batch procession, you must configure a bucket policy allowing the Generative APIs application principal to:
54+
If your bucket has at least one [bucket policy](/object-storage/api-cli/bucket-policy/) configured, you must edit it or create a new one that allows the Scaleway-managed application principal (named `scw-managed-genapi-batch`) to perform the following actions:
5555

5656
* `s3:GetObject`
5757
* `s3:PutObject`
@@ -67,7 +67,7 @@ Below is an example bucket policy:
6767
"Sid": "scw-managed",
6868
"Effect": "Allow",
6969
"Principal": {
70-
"SCW": "application_id:c31bd445-9017-4096-adea-b6e03b44a99d"
70+
"SCW": "application_id:example-4096-adea-b6e03b44a99d"
7171
},
7272
"Action": [
7373
"s3:GetObject",
@@ -79,7 +79,7 @@ Below is an example bucket policy:
7979
"Sid": "Scaleway secure statement",
8080
"Effect": "Allow",
8181
"Principal": {
82-
"SCW": "user_id:8c0ef0eb-9f9c-46b1-b2e0-ed883a6c54ce"
82+
"SCW": "user_id:example-9f9c-46b1-b2e0-ed883a6c54ce"
8383
},
8484
"Action": "*",
8585
"Resource": [
@@ -90,7 +90,9 @@ Below is an example bucket policy:
9090
]
9191
}
9292
```
93-
93+
To configure this bucket policy, replace:
94+
- the `user_id:` value with your IAM user ID
95+
- the `application_id:` value with the ID of the IAM application named `scw-managed-genapi-batch`. This application (and its ID) is created automatically in your Organization after you initiate a first batch query. If you have bucket policies configured, you therefore need to create a first batch to obtain the `scw-managed-genapi-batch` application ID, even if this first batch fails.
9496
<Message type="tip">
9597
The `Scaleway secure statement` ensures that you retain full access to your bucket and do not accidentally lock yourself out.
9698
</Message>
@@ -141,8 +143,8 @@ Each line of the output file corresponds to one input request.
141143
Example:
142144

143145
```json
144-
{"id":"93807e44-09fa-4a9f-baa3-574164651535","custom_id":"request-1","error":null,"response":{"request_id":"7a38ba8c-0fad-4923-9214-9b218a416b50","status_code":200,"body":{"id":"chatcmpl-7a38ba8c-0fad-4923-9214-9b218a416b50","object":"chat.completion","created":1771335694,"model":"voxtral-small-24b-2507","choices":[{"index":0,"message":{"role":"assistant","content":"Bonjour le monde!","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":19,"total_tokens":25,"completion_tokens":6,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}}}
145-
{"id":"93807e44-09fa-4a9f-baa3-574164651535","custom_id":"request-2","error":null,"response":{"request_id":"01c2cc2c-b072-45fb-9031-2b10b75f1b2f","status_code":200,"body":{"id":"chatcmpl-01c2cc2c-b072-45fb-9031-2b10b75f1b2f","object":"chat.completion","created":1771335694,"model":"voxtral-small-24b-2507","choices":[{"index":0,"message":{"role":"assistant","content":"Oh, the ocean, so vast and so wide,\nA shimmering expanse of blue tide.\nIt whispers and roars, in a rhythm so grand,\nA symphony of waves, in the ocean's band.\n\nIt's a mystery, deep and profound,\nWith secrets untold, in its watery ground.\nCreatures unseen, in its depths they reside,\nIn the ocean's embrace, they confide.\n\nIt's a force to be reckoned with, a power so great,\nA tempestuous beast, that can't be tamed, can't be sate.\nYet, it's also a cradle, a gentle, soothing song,\nA lullaby sung, to the weary and strong.\n\nSo here's to the ocean, in all its might and grace,\nA wonder of nature, in its own special place.\nMay it continue to inspire, to captivate and to amaze,\nThe ocean, the ocean, in all its ways.","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":20,"total_tokens":218,"completion_tokens":198,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}}}
146+
{"id":"93807e44-09fa-4a9f-baa3-574164651535","custom_id":"request-1","error":null,"response":{"request_id":"7a38ba8c-0fad-4923-9214-9b218a416b50","status_code":200,"body":{"id":"chatcmpl-7a38ba8c-0fad-4923-9214-9b218a416b50","object":"chat.completion","created":1771335694,"model":"mistral-small-3.2-24b-instruct-2506","choices":[{"index":0,"message":{"role":"assistant","content":"Bonjour le monde!","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":19,"total_tokens":25,"completion_tokens":6,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}}}
147+
{"id":"93807e44-09fa-4a9f-baa3-574164651535","custom_id":"request-2","error":null,"response":{"request_id":"01c2cc2c-b072-45fb-9031-2b10b75f1b2f","status_code":200,"body":{"id":"chatcmpl-01c2cc2c-b072-45fb-9031-2b10b75f1b2f","object":"chat.completion","created":1771335694,"model":"mistral-small-3.2-24b-instruct-2506","choices":[{"index":0,"message":{"role":"assistant","content":"Oh, the ocean, so vast and so wide,\nA shimmering expanse of blue tide.\nIt whispers and roars, in a rhythm so grand,\nA symphony of waves, in the ocean's band.\n\nIt's a mystery, deep and profound,\nWith secrets untold, in its watery ground.\nCreatures unseen, in its depths they reside,\nIn the ocean's embrace, they confide.\n\nIt's a force to be reckoned with, a power so great,\nA tempestuous beast, that can't be tamed, can't be sate.\nYet, it's also a cradle, a gentle, soothing song,\nA lullaby sung, to the weary and strong.\n\nSo here's to the ocean, in all its might and grace,\nA wonder of nature, in its own special place.\nMay it continue to inspire, to captivate and to amaze,\nThe ocean, the ocean, in all its ways.","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":20,"total_tokens":218,"completion_tokens":198,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}}}
146148
```
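Once downloaded, an output file like this can be read back line by line. The sketch below is illustrative only: `parse_batch_output` is a hypothetical helper, the field names follow the example above, and the shape of a failed record (a non-null `error` with a null `response`) is an assumption rather than documented behavior.

```python
import json


def parse_batch_output(lines):
    """Map each custom_id to its assistant text, or to None if the request failed."""
    results = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        response = record.get("response") or {}
        # Assumption: a failed request has a non-null "error" or a non-200 status
        if record.get("error") is not None or response.get("status_code") != 200:
            results[record["custom_id"]] = None
            continue
        choices = response["body"]["choices"]
        results[record["custom_id"]] = choices[0]["message"]["content"]
    return results


# Shortened sample records; the second one uses the assumed error shape
sample = [
    '{"id":"batch-1","custom_id":"request-1","error":null,"response":{"status_code":200,"body":{"choices":[{"index":0,"message":{"role":"assistant","content":"Bonjour le monde!"}}]}}}',
    '{"id":"batch-1","custom_id":"request-2","error":{"message":"example failure"},"response":null}',
]
parsed = parse_batch_output(sample)
```

Keying the results by `custom_id` avoids relying on the output lines being in the same order as the input lines.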
147149

148150
## Creating a batch using the OpenAI Python SDK
@@ -155,21 +157,14 @@ import time
155157
import logging
156158
from openai import OpenAI
157159

158-
# Enable debug logging to see HTTP requests
159-
logging.basicConfig(
160-
    format="%(levelname)s [%(asctime)s] %(name)s - %(message)s",
161-
    datefmt="%Y-%m-%d %H:%M:%S",
162-
    level=logging.INFO # Set to DEBUG if you want more detail
163-
)
164-
165160
# Initialize client with staging base URL
166161
client = OpenAI(
167162
    api_key=os.getenv("SCW_SECRET_KEY"),
168163
    base_url="https://api.scaleway.ai/v1" # No "/batches" needed
169164
)
170165

171-
# Your pre-signed or public S3 input file URL
172-
input_file_url = "https://<my-bucket>/input.jsonl"
166+
# Your S3 input file URL (the plain path to the file, without query parameters, so not a pre-signed URL)
167+
input_file_url = "https://<my-bucket-name>.s3.<region>.scw.cloud/input.jsonl"
173168

174169
# Submit the batch job
175170
print("Submitting batch job...")
@@ -183,40 +178,31 @@ print(f"Batch created: {batch.id}")
183178
print(f"Initial status: {batch.status}")
184179

185180
# Polling function
186-
def wait_for_completion(batch_id, check_interval=30):
181+
def wait_for_completion(batch_id, check_interval=10):
187182
    print(f"Polling every {check_interval}s for completion...\n")
188183

189184
    while True:
190-
        status = client.batches.retrieve(batch_id)
191-
        print(f"Status: {status.status.upper()} (Updated: {status.completed_at or 'N/A'})")
192-
193-
        if status.status == "completed":
194-
            print(f"\n Batch completed successfully!")
195-
            if hasattr(status, 'output_file_id') and status.output_file_id:
196-
                print(f"Output File ID: {status.output_file_id}")
197-
198-
            # Most important: get the download URL
199-
            if hasattr(status, 'output_file_url') and status.output_file_url:
200-
                print(f"Download results: {status.output_file_url}")
201-
                return status.output_file_url
185+
        batch_job = client.batches.retrieve(batch_id)
186+
        last_update_timestamp = batch_job.completed_at or batch_job.in_progress_at or batch_job.created_at
187+
        print(f"Status: {batch_job.status.upper()} (Updated: {time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(last_update_timestamp))})")
188+
189+
        if batch_job.status == "completed":
190+
            print(f"\nBatch completed successfully!")
191+
            print(f"Download results: {batch_job.output_file_id}")
192+
            return batch_job.output_file_id
193+
194+
        elif batch_job.status == "failed":
195+
            if hasattr(batch_job, 'errors') and batch_job.errors:
196+
                print(f"Batch failed: {batch_job.errors}")
202197
            else:
203-
                print("Error! No download link provided.")
204-
                return None
205-
206-
        elif status.status == "failed":
207-
            print(f"Batch failed: {getattr(status, 'error', 'Unknown error')}")
198+
                print(f"Batch failed: Unknown error")
208199
            return None
209200

210-
        elif status.status in ["in_progress", "validating"]:
201+
        elif batch_job.status in ["in_progress", "validating"]:
211202
            time.sleep(check_interval)
212203
        else:
213-
            print(f"Unexpected status: {status.status}")
204+
            print(f"Unexpected status: {batch_job.status}")
214205
            return None
215206

216207
# Start polling
217-
download_url = wait_for_completion(batch.id)
218-
219-
if download_url:
220-
print(f"\nDone! You can download the results using:")
221-
print(download_url)
222-
```
208+
wait_for_completion(batch.id)
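
The polling loop in this diff runs until the batch reaches a final status, with no upper bound on waiting time. As a possible variation (not part of the commit), the sketch below adds a timeout. The helper name `poll_until_terminal`, its parameters, and the assumption that `completed` and `failed` are the only terminal statuses are all illustrative.

```python
import time

# Assumption: only these two statuses end the polling loop
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_terminal(fetch_status, check_interval=10, timeout=3600,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Batch still '{status}' after {timeout}s")
        sleep(check_interval)


# Demonstration with a fake status source instead of a real API call
fake_statuses = iter(["validating", "in_progress", "completed"])
final_status = poll_until_terminal(
    lambda: next(fake_statuses),
    check_interval=0,
    sleep=lambda seconds: None,  # skip real waiting in the demonstration
)
```

With the client from the script above, the status source could be `lambda: client.batches.retrieve(batch.id).status`. Injecting `sleep` and `clock` keeps the helper testable without real delays.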
