TrueFoundry AI Gateway provides a wide range of features to help you manage access to LLMs and use them in your applications. Here are some of the key features:

Unified API Interface

Single API interface to access multiple LLM providers with unified endpoint
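As a sketch of what a unified interface means in practice, the snippet below builds the same OpenAI-style chat-completions request body regardless of which provider serves the model. The gateway URL and model IDs are placeholders, not actual TrueFoundry endpoints; substitute the values from your own gateway configuration.

```python
# Sketch: one request shape for every provider behind the gateway.
# GATEWAY_URL and the model names below are hypothetical placeholders.
import json
from urllib import request

GATEWAY_URL = "https://your-gateway.example.com/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build the same JSON body regardless of the underlying provider."""
    return json.dumps({
        "model": model,  # e.g. "openai/gpt-4o" or "anthropic/claude-3-5-sonnet"
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

def chat(model: str, prompt: str, api_key: str) -> dict:
    """POST the request to the gateway (requires a reachable gateway)."""
    req = request.Request(
        GATEWAY_URL,
        data=build_chat_request(model, prompt),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# The identical call shape works for any provider the gateway exposes:
payload = json.loads(build_chat_request("openai/gpt-4o", "Hello"))
```

Switching providers is then a one-line change to the `model` string; the request and response shapes stay the same.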

Rate Limiting

Control model usage with flexible rate limiting policies per user, model, or application

Budget Limiting

Control spending and enforce cost limits for users, teams, and models

Multimodal Inputs

Support for text, image, and audio inputs across compatible models

Fallback

Automatic failover to backup models when primary models are unavailable

Load Balancing

Distribute requests across multiple model instances for optimal performance

API Key Management

Generate and manage API keys for users/applications

Guardrails

Content filtering and safety checks to keep model inputs and outputs safe and compliant

Observability & Metrics

Comprehensive monitoring, logging, and analytics for all API requests

Access Control

Fine-grained access control and permissions management

Custom Metadata Routing

Route requests based on custom metadata and business logic

Latency-Based Load Balancing

Intelligent routing based on real-time latency metrics
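The idea behind latency-based routing can be illustrated with a small sketch: track an exponentially weighted moving average (EWMA) of each target's observed latency and route to the fastest one. The target names here are placeholders, and this is a concept illustration, not the gateway's actual routing implementation.

```python
# Concept sketch of latency-aware routing via an EWMA per target.
# Target names are illustrative, not gateway configuration.
class LatencyRouter:
    def __init__(self, targets, alpha=0.3):
        self.alpha = alpha  # weight given to the newest observation
        self.ewma = {t: 0.0 for t in targets}

    def record(self, target, latency_ms):
        """Fold a new latency sample into the target's moving average."""
        prev = self.ewma[target]
        self.ewma[target] = (
            latency_ms if prev == 0
            else self.alpha * latency_ms + (1 - self.alpha) * prev
        )

    def pick(self):
        """Route the next request to the currently fastest target."""
        return min(self.ewma, key=self.ewma.get)

router = LatencyRouter(["gpt-4o-east", "gpt-4o-west"])
router.record("gpt-4o-east", 120)
router.record("gpt-4o-west", 80)
best = router.pick()
```

The EWMA keeps the router responsive to recent latency shifts while smoothing out single-request noise.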

Prompt Management

Centralized prompt versioning and management system

Tracing

Distributed tracing for debugging and performance optimization

Responses API

Advanced response handling and transformation capabilities

Batch Predictions

Process multiple requests efficiently with batch processing

PII Detection

Automatically detect and filter personally identifiable information
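To illustrate the concept only (the gateway's own detection is configured server-side and is more sophisticated), a minimal regex-based redactor might look like this. The patterns and labels are illustrative.

```python
# Concept-only sketch of PII redaction with regexes; this is not the
# gateway's implementation, just an illustration of detect-and-filter.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a bracketed type label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

masked = redact("Reach me at jane@example.com")
```

Running requests through such a filter before they reach a provider keeps sensitive identifiers out of model prompts and logs.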

Real-time API

Low-latency streaming responses for real-time applications
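A streamed chat response typically arrives as OpenAI-style server-sent events, one `data: {...}` line per token delta. The sketch below shows how a client might assemble the full text from such a stream; the event payloads here are fabricated for illustration, assuming the gateway follows the OpenAI streaming format.

```python
# Sketch: assembling text from OpenAI-style SSE stream events.
# The event lines below are fabricated examples, not real gateway output.
import json

def extract_deltas(sse_lines):
    """Yield incremental text content from 'data: {...}' event lines."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data: ") or line == "data: [DONE]":
            continue  # skip non-data lines and the end-of-stream marker
        chunk = json.loads(line[len("data: "):])
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

events = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(extract_deltas(events))
```

Because each delta can be rendered as soon as it arrives, users see output immediately instead of waiting for the full completion.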

Export Logs & Traces

Export request logs and traces for external analysis and compliance

Self-Hosted Models

Deploy and manage your own custom models alongside public providers

MCP Servers