Client Setup
All providers use the OpenAI SDK with provider-specific headers. Choose your provider to get started.

Input File Format

Create a JSONL file with one JSON object per line. Each line represents a single request and must carry a unique `custom_id` value. See provider notes below for any provider-specific limits (e.g. minimum records, URL/body format).
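As a minimal sketch, the snippet below writes a two-record JSONL input file for the chat completions endpoint; the model name, prompts, and file path are placeholder values.

```python
import json

# Each record needs a unique custom_id, the HTTP method, the target URL,
# and the request body. "gpt-4o-mini" and "batch_input.jsonl" are placeholders.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["Hello!", "Summarize JSONL in one line."], start=1)
]

with open("batch_input.jsonl", "w") as f:
    for record in requests:
        f.write(json.dumps(record) + "\n")
```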
Before using AWS Bedrock batch processing, ensure you have:
- S3 Bucket: For storing input and output files
- IAM Execution Role: With permissions for S3 access and Bedrock model invocation
- User Permissions: Including `iam:PassRole` to pass the execution role to Bedrock
Before using Vertex AI batch processing, ensure you have:
- Cloud Storage Bucket: The bucket must be in the same region as the model, and the service account must have read/write access to the bucket
- Batch Prediction Permissions: The service account must have permission to create batch prediction jobs
Before using Azure OpenAI batch processing, ensure you have:
- Deployment type: A model deployed as Global Batch or Data Zone Batch (standard/online deployments cannot be used for batch).
- Headers: Set `x-tfy-azure-api-version` (e.g. `2024-12-01-preview`) to match the API version used by your batch deployment.
- Endpoint: When creating the job, use `endpoint="/chat/completions"` for chat (the Azure API uses this form). For the Responses API, use `url: "/v1/responses"` and `body.input` in each JSONL line.
Workflow Steps
The batch process follows these steps for all providers:

- Upload: Upload JSONL file → Get file ID
- Create: Create batch job → Get batch ID
- Monitor: Check status until complete
- Fetch: Download results
Step-by-Step Examples
1. Upload Input File
2. Create Batch Job
3. Check Batch Status
4. Fetch Results
Batch Status Reference
- `validating`: Initial validation of the batch
- `in_progress`: Processing the requests
- `completed`: All requests processed successfully
- `failed`: Batch processing failed (use `error_file_id` from the batch response to fetch error details)
- `finalizing`: Results being prepared (OpenAI / Azure OpenAI)
- `expired`: Did not complete within the time window (OpenAI / Azure OpenAI)
- `cancelling` / `cancelled`: Batch cancelled; partial results may be available via `output_file_id`
Best Practices
- File Format: Use meaningful `custom_id` values and valid JSONL format
- Error Handling: Implement proper error handling and status monitoring
- Security: Store API keys securely, use minimal IAM permissions
- Provider-specific: Check the prerequisites above (e.g. Azure: batch deployment type and endpoint; Bedrock: minimum 100 records, IAM and S3)
Vertex AI Permissions
The following permissions are required for Vertex AI batch prediction:

Storage and service account requirements
Cloud Storage bucket
- The Cloud Storage bucket used for batch input and output must be in the same region as the Vertex AI model you are using.
- The service account that runs the batch job must have read and write access to this bucket (e.g. `roles/storage.objectAdmin` or equivalent object-level read/write permissions on the bucket).
Batch prediction job permissions
Create batch prediction jobs
- The service account must have permission to create and manage batch prediction jobs in Vertex AI (e.g. `roles/aiplatform.user` or the `aiplatform.batchPredictionJobs.create` permission).
AWS Bedrock Permissions
User Permissions (for API calls)
These are the minimum permissions required to use the Bedrock Batch APIs. For complete official guidance, see AWS Bedrock Batch Inference Permissions.
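The exact policy depends on your account setup, but a minimal user policy sketch looks like the following (the account ID and role name are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:CreateModelInvocationJob",
        "bedrock:GetModelInvocationJob",
        "bedrock:ListModelInvocationJobs",
        "bedrock:StopModelInvocationJob"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/BedrockBatchExecutionRole"
    }
  ]
}
```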
Service Role Permissions (for batch execution)
The service role (`role_arn`) used for creating and executing the batch job requires both a trust relationship that lets Bedrock assume the role and a permission policy granting access to the S3 input and output locations.
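A minimal sketch of both documents follows; the account ID and bucket name are placeholders, and the `aws:SourceAccount` condition follows AWS's recommended confused-deputy protection.

Trust Relationship:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "bedrock.amazonaws.com" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "aws:SourceAccount": "123456789012" }
      }
    }
  ]
}
```

Permission Policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-batch-bucket",
        "arn:aws:s3:::my-batch-bucket/*"
      ]
    }
  ]
}
```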