This guide explains how to perform batch predictions using TrueFoundry’s AI Gateway with OpenAI, Vertex AI, or AWS Bedrock providers.

Client Setup

All providers use the OpenAI SDK with provider-specific headers. The example below configures the client for an OpenAI provider integration; for Vertex AI or AWS Bedrock, set x-tfy-provider-name to the name of the corresponding provider integration:
from openai import OpenAI

client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://{controlPlaneUrl}/api/llm",
    default_headers={
        "x-tfy-provider-name": "openai-main"  # truefoundry provider integration name
    }
)

Input File Format

Create a JSONL file with one JSON object per line. Each line represents a single request:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo-0125", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo-0125", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
{"custom_id": "request-3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4-vision-preview", "messages": [{"role": "user", "content": [{"type": "text", "text": "What's in this image?"}, {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}]}], "max_tokens": 1000}}
Requirements: each line must be valid JSON, custom_id values should be meaningful and unique within the file, and AWS Bedrock requires a minimum of 100 records.
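
If you are generating requests programmatically, a minimal sketch like the following can build the input file (the prompts and file name are illustrative):

import json

# Illustrative prompts; replace with your own data
prompts = ["Hello world!", "Summarize this document."]

with open("request.jsonl", "w") as f:
    for i, prompt in enumerate(prompts, start=1):
        request = {
            "custom_id": f"request-{i}",  # must be unique within the file
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-3.5-turbo-0125",
                "messages": [
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": prompt},
                ],
                "max_tokens": 1000,
            },
        }
        f.write(json.dumps(request) + "\n")
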
Before using AWS Bedrock batch processing, ensure you have:
  • S3 Bucket: For storing input and output files
  • IAM Execution Role: With permissions for S3 access and Bedrock model invocation
  • User Permissions: Including iam:PassRole to pass the execution role to Bedrock

Workflow Steps

The batch process follows these steps for all providers:
  1. Upload: Upload JSONL file → Get file ID
  2. Create: Create batch job → Get batch ID
  3. Monitor: Check status until complete
  4. Fetch: Download results

Step-by-Step Examples

Step 1: Upload the input file

from openai import OpenAI

client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://{controlPlaneUrl}/api/llm",
    default_headers={
        "x-tfy-provider-name": "openai-main"  # truefoundry provider integration name
    }
)

# Upload the input file
file = client.files.create(
    file=open("request.jsonl", "rb"),
    purpose="batch"
)

print(file.id)  # Example: file-PnFGrFLN5LjjcWr4eFsStK

Step 2: Create the batch job

from openai import OpenAI

client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://{controlPlaneUrl}/api/llm",
    default_headers={
        "x-tfy-provider-name": "openai-main"  # truefoundry provider integration name
    }
)

batch_job = client.batches.create(
    input_file_id=file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

print(batch_job.id)  # Example: batch_67f7bfc50b288190893f242d9fa47c52

Step 3: Monitor the batch status

from openai import OpenAI

client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://{controlPlaneUrl}/api/llm",
    default_headers={
        "x-tfy-provider-name": "openai-main"  # truefoundry provider integration name
    }
)

batch_status = client.batches.retrieve(batch_job.id)
print(batch_status.status)  # Example: completed, validating, in_progress, etc.
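
Batch jobs can take a while, so in practice you would poll until the job reaches a terminal state; a minimal sketch (the polling interval is arbitrary):

import time

# Poll until the batch reaches a terminal state
while True:
    batch_status = client.batches.retrieve(batch_job.id)
    print(f"Current status: {batch_status.status}")
    if batch_status.status in ("completed", "failed"):
        break
    time.sleep(60)  # wait a minute before checking again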

Step 4: Fetch the results

from openai import OpenAI

client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://{controlPlaneUrl}/api/llm",
    default_headers={
        "x-tfy-provider-name": "openai-main"  # truefoundry provider integration name
    }
)

if batch_status.status == "completed":
    output_content = client.files.content(batch_status.output_file_id)
    print(output_content.content.decode('utf-8'))
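
The downloaded output is itself JSONL with one result per line; a sketch for mapping results back to their custom_id (the field layout assumed here follows the OpenAI batch output format and may differ slightly by provider):

import json

results = {}
for line in output_content.content.decode("utf-8").splitlines():
    if not line.strip():
        continue
    record = json.loads(line)
    # Each output record echoes the custom_id from the input file
    results[record["custom_id"]] = record.get("response")

print(results.get("request-1"))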

Batch Status Reference

  • validating: Initial validation of the batch
  • in_progress: Processing the requests
  • completed: All requests processed successfully
  • failed: Batch processing failed

Best Practices

  1. File Format: Use meaningful custom_id values and valid JSONL format
  2. Error Handling: Implement proper error handling and status monitoring (see the sketch after this list)
  3. Security: Store API keys securely, use minimal IAM permissions
  4. AWS Bedrock Specific:
    • Minimum 100 records required in JSONL file
    • Verify IAM roles and S3 bucket permissions
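
For item 2 above, a minimal error-handling sketch, assuming the OpenAI-style batch object fields errors and error_file_id (availability may vary by provider):

batch_status = client.batches.retrieve(batch_job.id)

if batch_status.status == "failed":
    # Batch-level failures (e.g. input validation) are reported on the batch object
    print(batch_status.errors)
elif batch_status.status == "completed" and batch_status.error_file_id:
    # Individual requests that failed are written to a separate error file
    error_content = client.files.content(batch_status.error_file_id)
    print(error_content.content.decode("utf-8"))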

AWS Bedrock Permissions

These are the minimum permissions required to use the Bedrock Batch APIs. For complete official guidance, see AWS Bedrock Batch Inference Permissions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModel",
        "bedrock:ListInferenceProfiles",
        "bedrock:GetInferenceProfile",
        "bedrock:ListCustomModels",
        "bedrock:GetCustomModel",
        "bedrock:TagResource",
        "bedrock:UntagResource",
        "bedrock:ListTagsForResource",
        "bedrock:CreateModelInvocationJob",
        "bedrock:GetModelInvocationJob",
        "bedrock:ListModelInvocationJobs",
        "bedrock:StopModelInvocationJob"
      ],
      "Resource": [
        "arn:aws:bedrock:<region>:<account_id>:model-customization-job/*",
        "arn:aws:bedrock:<region>:<account_id>:custom-model/*",
        "arn:aws:bedrock:<region>::foundation-model/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:PutObject", "s3:GetObject", "s3:GetObjectAttributes"],
      "Resource": ["arn:aws:s3:::<bucket>", "arn:aws:s3:::<bucket>/*"]
    },
    {
      "Action": ["iam:PassRole"],
      "Effect": "Allow",
      "Resource": "arn:aws:iam::<account_id>:role/<service_role_name>",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": ["bedrock.amazonaws.com"]
        }
      }
    }
  ]
}
The service role (role_arn) used for creating and executing the batch job requires the following trust relationship and permission policy.

Trust Relationship:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "<account_id>"
        },
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:bedrock:<region>:<account_id>:model-invocation-job/*"
        }
      }
    }
  ]
}
Permission Policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::<bucket>", "arn:aws:s3:::<bucket>/*"]
    }
  ]
}