What is Codex?
Codex is OpenAI's official command-line interface (CLI) tool, providing a streamlined way to interact with OpenAI's language models directly from your terminal. With the TrueFoundry LLM Gateway integration, you can route your Codex requests through the Gateway.
Key Features of OpenAI Codex CLI
- Terminal-Native AI Interactions: Work with OpenAI's language models from the command line, letting developers bring AI capabilities into their existing terminal workflows without switching contexts.
- Intelligent Code Generation: Generate code snippets, functions, and entire programs across multiple programming languages based on natural language prompts. The CLI tool understands context and can produce syntactically correct, functional code that follows best practices.
- Streaming and Interactive Sessions: Support for real-time streaming responses and interactive sessions allows developers to iteratively refine prompts and receive updated outputs, enabling a dynamic conversation-like experience for code development and problem-solving directly in the terminal.
Prerequisites
Before integrating Codex with TrueFoundry, ensure you have:
- TrueFoundry Account: Create a TrueFoundry account with at least one model provider and generate a Personal Access Token by following the instructions in Generating Tokens. For a quick setup guide, see our Gateway Quick Start
- Codex Installation: Install the Codex CLI on your system (see the install command after this list)
- Load Balance Configuration: Set up a load balancing configuration for your desired models (see the Setup Process section below)
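A minimal install sketch via npm, assuming Node.js is already available on your system:

```bash
# Install the Codex CLI globally via npm (requires Node.js).
npm install -g @openai/codex

# Confirm the CLI is on your PATH.
codex --version
```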
Why Load Balancing Configuration is Necessary
Codex has internal logic that sends “thinking tokens” to certain models during processing. This works well with standard OpenAI model names (like `gpt-4`), but causes compatibility issues with TrueFoundry's fully qualified model names (like `openai-main/gpt-4` or `azure-openai/gpt-4`).
When Codex encounters these fully qualified model names directly, it incorrectly sends thinking tokens, which can cause unexpected behavior.
The Solution: Load balancing configuration allows you to:
- Call a standard model name in your Codex commands (e.g., `gpt-4`)
- Have TrueFoundry Gateway automatically route the request to the fully qualified target model (e.g., `openai-main/gpt-4`)
Setup Process
1. Configure Environment Variables
To connect Codex with the TrueFoundry LLM Gateway, set the environment variables below, replacing `TFY_API_KEY` with your actual TrueFoundry API key and `{controlPlaneUrl}` with your TrueFoundry control plane URL.
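A minimal sketch, assuming Codex honors the standard `OPENAI_API_KEY` and `OPENAI_BASE_URL` variables and that the gateway's OpenAI-compatible endpoint lives at `{controlPlaneUrl}/api/llm` (confirm the exact base URL from the Unified Code Snippet):

```bash
# Point Codex at the TrueFoundry LLM Gateway instead of api.openai.com.
# TFY_API_KEY is your Personal Access Token; the /api/llm path is an
# assumption -- verify it against the Unified Code Snippet.
export OPENAI_API_KEY="TFY_API_KEY"
export OPENAI_BASE_URL="{controlPlaneUrl}/api/llm"
```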

Tip: You can get the exact Base URL and model name from the Unified Code Snippet.
Add these exports to your shell configuration file (.bashrc, .zshrc, etc.) for persistent configuration:
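For example, for zsh (bash users would use ~/.bashrc instead):

```bash
# Persist the gateway settings across shell sessions (zsh shown).
echo 'export OPENAI_API_KEY="TFY_API_KEY"' >> ~/.zshrc
echo 'export OPENAI_BASE_URL="{controlPlaneUrl}/api/llm"' >> ~/.zshrc
source ~/.zshrc
```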
2. Set Up Load Balance Configuration
Create a load balancing configuration to route your requests to specific model providers. With a configuration like the sketch below, when you call `gpt-4` through Codex, your request will be routed to the `openai-main/gpt-4` model with 100% of the traffic weight.
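An illustrative configuration; the exact schema (field names such as `rules` and `load_balance_targets`) is an assumption here, so verify it against TrueFoundry's load balancing documentation:

```yaml
# Illustrative gateway load-balancing config (field names assumed).
name: codex-loadbalancing-config
type: gateway-load-balancing-config
rules:
  - id: codex-gpt-4
    when:
      models:
        - gpt-4                     # model name you pass to Codex
    load_balance_targets:
      - target: openai-main/gpt-4   # fully qualified gateway model
        weight: 100                 # 100% of traffic
```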
Usage Examples
Basic Usage with Load Balanced Models
Always specify the model defined in your load balancing configuration to ensure your requests go through the TrueFoundry Gateway:
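A basic invocation sketch; the `--model` flag pins Codex to the load-balanced name, and the prompt itself is illustrative:

```bash
# Ask Codex for help, pinning the model to the load-balanced name.
codex --model gpt-4 "Write a Python function that validates email addresses"
```

Advanced Options with Gateway Routing
All routing still happens through the gateway regardless of how Codex is invoked. The `exec` subcommand below is an assumption, so verify it with `codex --help` for your installed version:

```bash
# Interactive session pinned to the load-balanced model.
codex --model gpt-4

# Non-interactive (scripted) run; subcommand/flags may vary by version.
codex exec --model gpt-4 "Add unit tests for utils.py"
```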
Understanding Load Balancing
When you use Codex with `--model gpt-4`, your request gets load-balanced according to your configuration. In the example configuration above, any request to `gpt-4` will be routed to `openai-main/gpt-4` with 100% of the traffic.
You can create more sophisticated routing rules with multiple targets and different weights:
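For instance, a split across two providers might look like this (same assumed schema as above):

```yaml
# Illustrative multi-target rule: 70% of gpt-4 traffic to OpenAI,
# 30% to Azure OpenAI (field names assumed, as above).
rules:
  - id: codex-gpt-4-split
    when:
      models:
        - gpt-4
    load_balance_targets:
      - target: openai-main/gpt-4
        weight: 70
      - target: azure-openai/gpt-4
        weight: 30
```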