This guide provides instructions for integrating OpenAI Swarm with the TrueFoundry AI Gateway.

What is OpenAI Swarm?

OpenAI Swarm is an experimental framework for building, orchestrating, and deploying multi-agent systems. It provides a lightweight, scalable, and highly customizable approach to coordinating multiple AI agents that can work together to solve complex tasks through handoffs and collaboration.

Key Features of OpenAI Swarm

  • Multi-Agent Coordination: Build teams of specialized agents that can transfer tasks between each other and collaborate on complex workflows
  • Lightweight Framework: Minimal overhead with a simple API that makes it easy to define agents, their capabilities, and coordination patterns
  • Function Calling: Agents can be equipped with custom functions and tools to interact with external systems and APIs (see the sketch after this list)
  • Contextual Handoffs: Seamless transfer of context and conversation state between agents based on user needs and agent capabilities
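
For instance, here is a minimal sketch of the function-calling feature. The get_weather function is a hypothetical stub standing in for a real API call; Swarm derives the tool schema from the function's signature and docstring:

from swarm import Swarm, Agent

def get_weather(city: str) -> str:
    """Return the current weather for the given city."""
    # Stub: a real implementation would call an external weather API here.
    return f"It is sunny in {city}."

weather_agent = Agent(
    name="Weather Agent",
    instructions="Answer weather questions using the get_weather tool.",
    functions=[get_weather],
)

client = Swarm()  # assumes OPENAI_API_KEY (and optionally OPENAI_BASE_URL) are set
response = client.run(
    agent=weather_agent,
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
)
print(response.messages[-1]["content"])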

Prerequisites

Before integrating OpenAI Swarm with TrueFoundry, ensure you have:
  1. TrueFoundry Account: Create a TrueFoundry account and follow our Gateway Quick Start
  2. OpenAI Swarm Installation: Install OpenAI Swarm using pip: pip install git+https://github.com/openai/swarm.git
  3. Load Balancing Configuration: Set up a load balancing configuration for your desired models (see the Setup Process section below)

Important: Model Routing Configuration

Routing Configuration Required: OpenAI Swarm uses standard OpenAI model names (such as gpt-4o and gpt-4o-mini) in its internal logic. To route these requests through the TrueFoundry Gateway to your specific model providers, you need to set up a load balancing configuration on TrueFoundry that maps the standard model names to your TrueFoundry model format (e.g., openai-main/gpt-4o). For detailed information about load balancing configurations, see our Load Balancing Documentation.
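
As an illustrative sketch, Swarm's Agent defaults to the standard model name gpt-4o; with the configuration from step 2 below in place, the gateway (not the client) maps that name to your provider-qualified model:

from swarm import Agent

# The agent requests the standard name "gpt-4o"; the gateway's load balancing
# config rewrites it to a provider target such as openai-main/gpt-4o.
agent = Agent(
    name="Routing Demo",
    model="gpt-4o",
    instructions="You are a helpful agent.",
)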

Setup Process

1. Basic Setup with OpenAI Swarm

You can get your TrueFoundry API key, gateway base URL, and model name directly from the unified code snippet.
Configure the OpenAI client with the TrueFoundry gateway settings and pass it to Swarm:
from swarm import Swarm, Agent
import os
from openai import OpenAI

# Read the TrueFoundry API key from the environment, falling back to a placeholder
api_key = os.getenv("OPENAI_API_KEY", "your-truefoundry-api-key")

# Initialize the OpenAI client against the TrueFoundry gateway
openai_client = OpenAI(
    api_key=api_key,
    base_url="your-truefoundry-base-url",
)

# Initialize Swarm with the TrueFoundry-configured OpenAI client
client = Swarm(client=openai_client)

def transfer_to_agent_b():
    """Handoff: returning another Agent transfers the conversation to it."""
    return agent_b

agent_a = Agent(
    name="Agent A",
    instructions="You are a helpful agent.",
    functions=[transfer_to_agent_b],
)

agent_b = Agent(
    name="Agent B",
    instructions="Only speak in Haikus.",
)

# Run the conversation; Agent A hands off to Agent B via the transfer function
response = client.run(
    agent=agent_a,
    messages=[{"role": "user", "content": "I want to talk to agent B."}],
)

print(response.messages[-1]["content"])
Replace:
  • your-truefoundry-api-key with your actual TrueFoundry API key
  • your-truefoundry-base-url with your TrueFoundry Gateway URL

2. Configure Model Routing

Create a load balancing configuration to route standard OpenAI model names to your TrueFoundry providers:
name: swarm-load-balancing-config
type: gateway-load-balancing-config
rules:
  - id: swarm-gpt4o-routing
    type: weight-based-routing
    when:
      models:
        - gpt-4o
    load_balance_targets:
      - target: openai-main/gpt-4o
        weight: 100
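
The same rule schema extends to multiple models and providers. As an illustrative sketch (azure-main is a hypothetical provider account name, not part of this guide's setup), an additional rule under the same rules: list can split gpt-4o-mini traffic across two targets by weight:

  - id: swarm-gpt4o-mini-routing
    type: weight-based-routing
    when:
      models:
        - gpt-4o-mini
    load_balance_targets:
      - target: openai-main/gpt-4o-mini
        weight: 80
      - target: azure-main/gpt-4o-mini   # hypothetical second provider account
        weight: 20

The weights across a rule's targets determine the share of traffic each provider receives.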

3. Environment Variables Configuration

For persistent configuration across your Swarm applications, set these environment variables:
export OPENAI_API_KEY="your-truefoundry-api-key"
export OPENAI_BASE_URL="your-truefoundry-base-url"
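
With these variables exported, the default client that Swarm constructs picks up the gateway settings automatically, so no explicit OpenAI client is needed. A minimal sketch:

from swarm import Swarm, Agent

# Swarm() builds a default OpenAI client, which reads OPENAI_API_KEY and
# OPENAI_BASE_URL from the environment, so requests go through the gateway.
client = Swarm()

agent = Agent(
    name="Assistant",
    instructions="You are a helpful agent.",
)

response = client.run(
    agent=agent,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.messages[-1]["content"])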

Observability and Governance

Monitor your OpenAI Swarm applications through TrueFoundry’s metrics tab:

[Image: TrueFoundry metrics dashboard]

With TrueFoundry’s AI gateway, you can monitor and analyze:
  • Performance Metrics: Track key latency metrics like Request Latency, Time to First Token (TTFT), and Inter-Token Latency (ITL) with P99, P90, and P50 percentiles
  • Cost and Token Usage: Gain visibility into your application’s costs with detailed breakdowns of input/output tokens and the associated expenses for each model
  • Usage Patterns: Understand how your application is being used with detailed analytics on user activity, model distribution, and team-based usage
  • Agent Performance: Monitor individual agent performance and handoff patterns
  • Rate Limiting and Load Balancing: Set up rate limiting, load balancing, and fallbacks for your models