This guide explains how to perform batch predictions using TrueFoundry’s AI Gateway with different providers.
To follow along, you’ll need the `openai` library installed.

All API requests require authentication using your TrueFoundry API key and provider integration name. This is handled through the OpenAI client configuration:
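A minimal configuration sketch follows; the base URL and API key shown are placeholders, so substitute the gateway endpoint and key from your own TrueFoundry setup.

```python
from openai import OpenAI

# Placeholder values: use your actual TrueFoundry API key and gateway URL.
client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://your-control-plane.truefoundry.com/api/llm",
)
```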
When making requests, you’ll need to specify provider-specific headers based on which LLM provider you’re using:
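One way to attach such headers with the OpenAI Python client is the `extra_headers` argument that its request methods accept. The header name below is a hypothetical placeholder; use the headers your provider integration actually requires.

```python
# Hypothetical header name, shown for illustration only; substitute the
# headers your TrueFoundry provider integration expects.
provider_headers = {
    "x-tfy-provider-name": "my-openai-integration",  # placeholder
}

# Pass the headers on individual calls, e.g.:
# client.batches.create(..., extra_headers=provider_headers)
```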
The batch prediction system requires input files in JSONL (JSON Lines) format. Each line in the file must be a valid JSON object representing a single request. The file should not contain any empty lines or comments.
Example of a valid JSONL file (`request.jsonl`):
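The snippet below is an illustrative input file in the standard OpenAI batch format; the model name `openai-main/gpt-4o` is a placeholder and should match a model available through your provider integration.

```jsonl
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "openai-main/gpt-4o", "messages": [{"role": "user", "content": "Hello, world!"}]}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "openai-main/gpt-4o", "messages": [{"role": "user", "content": "Summarize JSONL in one sentence."}]}}
```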
When using Vertex AI, you can skip the `method`, `url`, and `body.model` fields since they are not used.
The batch prediction process involves four main steps:
Upload your JSONL file using the OpenAI client:
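A sketch of the upload call, assuming the `client` configured in the authentication step above:

```python
# Upload request.jsonl; purpose="batch" marks the file for batch processing.
batch_file = client.files.create(
    file=open("request.jsonl", "rb"),
    purpose="batch",
)
print(batch_file.id)  # keep this file ID for the next step
```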
Create a batch job using the file ID from the upload step:
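A sketch of batch creation, reusing `client` and the `batch_file` from the upload step:

```python
# Create a batch job that runs every request in the uploaded file.
batch_job = client.batches.create(
    input_file_id=batch_file.id,       # file ID from the upload step
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch_job.id)
```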
Monitor the batch job status:
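A simple polling loop, again assuming the `client` and `batch_job` from the previous steps:

```python
import time

# Poll until the batch reaches a terminal state.
while batch_job.status not in ("completed", "failed"):
    time.sleep(30)
    batch_job = client.batches.retrieve(batch_job.id)
    print(f"Batch status: {batch_job.status}")
```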
The status can be one of:

- `validating`: Initial validation of the batch
- `in_progress`: Processing the requests
- `completed`: All requests processed successfully
- `failed`: Batch processing failed

Once the batch is completed, fetch the results:
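One way to read the output file, assuming the completed `batch_job` from above:

```python
import json

# Each line of the output file is a JSON object holding the response
# for one request from the input file.
results = client.files.content(batch_job.output_file_id)
for line in results.text.splitlines():
    response = json.loads(line)
    print(response["custom_id"], response)
```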
Here’s a complete example that puts it all together:
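The end-to-end sketch below strings the steps together. Every value marked as a placeholder (API key, base URL, and the model names inside `request.jsonl`) is an assumption to be replaced with your own configuration.

```python
import json
import time

from openai import OpenAI

# Placeholder values: substitute your TrueFoundry API key and gateway URL.
client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="https://your-control-plane.truefoundry.com/api/llm",
)

# 1. Upload the JSONL input file.
batch_file = client.files.create(
    file=open("request.jsonl", "rb"),
    purpose="batch",
)

# 2. Create the batch job from the uploaded file.
batch_job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll until the job reaches a terminal state.
while batch_job.status not in ("completed", "failed"):
    time.sleep(30)
    batch_job = client.batches.retrieve(batch_job.id)
    print(f"Batch status: {batch_job.status}")

# 4. Fetch and print the results once the batch has completed.
if batch_job.status == "completed":
    results = client.files.content(batch_job.output_file_id)
    for line in results.text.splitlines():
        response = json.loads(line)
        print(response["custom_id"], response)
```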
Use `custom_id` values in your JSONL requests to track individual requests.