Intro to LLM Gateway
LLM Gateway provides a unified interface to manage your organization's LLM usage. Below are the key features that enhance functionality and security:
- Unified API: Access multiple LLM providers through a single OpenAI-compatible interface, requiring no code changes (see the sketch after this list).
- API Key Security: Secure and centralized management of credentials.
- Governance & Control: Set limits, enforce access controls, and apply content filtering to manage how LLMs are used within your organization.
- Rate Limiting: Prevent abuse and ensure fair usage across users.
- Observability: Track and analyse usage, costs, latency, and overall performance.
- Cost Management: Monitor spending and configure budget alerts to keep expenses under control.
- Audit Trails: Maintain logs of all interactions with LLMs to support compliance and auditing requirements.
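The Unified API bullet above is the core of the Gateway: your application keeps using the standard OpenAI client, and only the base URL, API key, and model identifier change. Below is a minimal sketch of what that looks like; the base URL and model name are placeholders rather than the Gateway's actual values, so substitute the ones from your own TrueFoundry setup.

```python
from openai import OpenAI

# Point the standard OpenAI client at the LLM Gateway instead of api.openai.com.
# The base URL, API key, and model identifier below are placeholders -- replace
# them with the values configured in your TrueFoundry LLM Gateway.
client = OpenAI(
    base_url="https://<your-gateway-host>/v1",
    api_key="<your-truefoundry-api-key>",
)

response = client.chat.completions.create(
    model="openai-main/gpt-4o",  # example provider/model identifier
    messages=[{"role": "user", "content": "Summarise what an LLM gateway does."}],
)
print(response.choices[0].message.content)
```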
LLM Playground
The LLM Playground is a UI for the LLM Gateway where you can try out different models you've added from providers such as OpenAI, Mistral, and Cohere. Below is an overview of its features:
- Support for multiple model types:
  - Chat Models
  - Completion Models
  - Embedding Models
- Image Upload: Upload images for image captioning or visual question answering. This is only available for models that support image inputs, such as GPT-4o.
- Model Comparison: Compare responses from different completion models to evaluate their performance.
- System Prompts: Use predefined system prompts to guide model behaviour. A system prompt informs how the model should respond, for example: "Be clear, concise, and polite in your responses. Avoid sharing any sensitive or personal information." (see the sketch after this list).
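In the Playground this is a UI field, but the same system prompt maps directly onto the first message of an OpenAI-style chat request sent through the Gateway. A minimal sketch, again assuming placeholder base URL, API key, and model identifier:

```python
from openai import OpenAI

# Placeholder endpoint, key, and model identifier -- substitute your own values.
client = OpenAI(
    base_url="https://<your-gateway-host>/v1",
    api_key="<your-truefoundry-api-key>",
)

response = client.chat.completions.create(
    model="openai-main/gpt-4o",
    messages=[
        # The system prompt shapes how the model answers every user message.
        {
            "role": "system",
            "content": "Be clear, concise, and polite in your responses. "
                       "Avoid sharing any sensitive or personal information.",
        },
        {"role": "user", "content": "How do I reset my account password?"},
    ],
)
print(response.choices[0].message.content)
```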
Benchmarking Results (LLM Gateway is Blazing Fast!)
- Near-Zero Overhead: TrueFoundry LLM Gateway adds only 3 ms of extra latency at up to 250 RPS and 4 ms at more than 300 RPS.
- Scalability: On a 1 vCPU, 1 GB machine, the LLM Gateway scales to about 350 RPS with no degradation in performance before CPU utilisation reaches 100% and latencies start to be affected. With more CPU or more replicas, the LLM Gateway can scale to tens of thousands of requests per second.
- High Capacity: A t2.2xlarge AWS instance (~$43 per month on spot) can scale up to ~3000 RPS with no issues.
- Edge Ready: The Gateway can be deployed on the edge, close to your applications.
Learn more in the LLM Gateway benchmarks documentation.
Unified API
The Unified API in the TrueFoundry LLM Gateway provides a standardised interface for accessing and interacting with different language models from various providers. This means you can seamlessly switch between models and providers without changing your application's code structure. By abstracting the underlying complexities, the Unified API simplifies the process of integrating multiple models and ensures consistency in how they are accessed and utilised.
Key Features of the Unified API
- Standardisation: The Unified API standardises requests and responses across different models, making it easier to manage and integrate multiple models.
- Flexibility: Easily switch between different models and providers without altering the core application code.
- Efficiency: Streamline the development process by using a single API to interact with various models, reducing the need for multiple integrations and bespoke handling.
- Compatibility: By adhering to the OpenAI request-response format, the Unified API ensures seamless integration with Python libraries like OpenAI and LangChain (see the sketch below).
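As an illustration of that compatibility, LangChain's OpenAI integration can be pointed at the Gateway just like the OpenAI client, and switching providers becomes a matter of changing the model identifier. The base URL and model names below are placeholders, not the Gateway's actual values.

```python
from langchain_openai import ChatOpenAI

# Placeholder endpoint, key, and model identifiers -- substitute your own values.
GATEWAY_BASE_URL = "https://<your-gateway-host>/v1"
GATEWAY_API_KEY = "<your-truefoundry-api-key>"

def make_llm(model: str) -> ChatOpenAI:
    """Build a LangChain chat model that routes requests through the LLM Gateway."""
    return ChatOpenAI(model=model, base_url=GATEWAY_BASE_URL, api_key=GATEWAY_API_KEY)

# Switching providers only changes the model identifier; the calling code and
# the request/response format stay the same.
for model_id in ["openai-main/gpt-4o", "anthropic-main/claude-3-5-sonnet"]:
    llm = make_llm(model_id)
    print(model_id, "->", llm.invoke("Name one benefit of a unified LLM API.").content)
```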
Supported Providers
Below is a comprehensive list of popular LLM providers supported by the TrueFoundry LLM Gateway.
Provider | Streaming Supported | How to add models from this provider?
---|---|---
GCP | ✅ | https://docs.truefoundry.com/docs/integration-provider-gcp#google-vertex-model-integration
AWS | ✅ | https://docs.truefoundry.com/docs/integration-provider-aws#policies-required-for-bedrock-integration
Azure OpenAI | ✅ | By using provider API Key in TrueFoundry Integrations
Self Hosted Models on TrueFoundry | ✅ | https://docs.truefoundry.com/docs/add-self-hosted-model-to-gateway
OpenAI | ✅ | By using provider API Key in TrueFoundry Integrations
Cohere | ✅ | By using provider API Key in TrueFoundry Integrations
AI21 | ✅ | By using provider API Key in TrueFoundry Integrations
Anthropic | ✅ | By using provider API Key in TrueFoundry Integrations
Anyscale | ✅ | By using provider API Key in TrueFoundry Integrations
Together AI | ✅ | By using provider API Key in TrueFoundry Integrations
DeepInfra | ✅ | By using provider API Key in TrueFoundry Integrations
Ollama | ✅ | By using provider API Key in TrueFoundry Integrations
Palm | ✅ | By using provider API Key in TrueFoundry Integrations
Perplexity AI | ✅ | By using provider API Key in TrueFoundry Integrations
Mistral AI | ✅ | By using provider API Key in TrueFoundry Integrations
Groq | ✅ | By using provider API Key in TrueFoundry Integrations
Nomic | ✅ | By using provider API Key in TrueFoundry Integrations