Intro to LLM Gateway
LLM Gateway provides a unified interface to manage your organization's LLM usage. Below are the key features that enhance functionality and security:
-
Unified API: Access multiple LLM providers through a single OpenAI compatible interface, requiring no code changes.
-
API Key Security: Secure and centralized management of credentials.
-
Governance & Control: Set limits, enforce access controls, and apply content filtering to manage how LLMs are used within your organization.
-
Rate Limiting: Implement measures to prevent abuse and ensure fair usage across users.
-
Observability: Track and analyse usage, costs, latency, and overall performance.
-
Cost Management: Monitor spending and configure budget alerts to keep expenses under control.
-
Audit Trails: Maintain logs of all interactions with LLMs to support compliance and auditing requirements.
LLM Playground
The LLM Playground is a UI for the LLM Gateway where you can tryout different models you've added from across providers like OpenAI, Mistral, Cohere etc. Below is an overview of the features:
- Support for multiple model types
- Chat Models
- Completion Models
- Embedding Models
- Image Upload: Upload images for image captioning or visual question answering. This is only available for models that support images such as GPT-4o.
- Model Comparison: Compare responses from different completion models to evaluate their performance.
- System Prompts: Use predefined system prompts to guide model behaviour. System prompt inform how the model should respond. Sample system prompt - Be clear, concise, and polite in your responses. Avoid sharing any sensitive or personal information.
LLM Gateway is Blazing Fast!
- Near-Zero Overhead: TrueFoundry LLM Gateway adds only extra 3 ms in latency upto 250 RPS and 4 ms at RPS > 300.
- Scalability: LLM Gateway can scale without any degradation in performance until about 350 RPS with 1 vCPU & 1 GB machine before the CPU utilisation reaches 100% and latencies start to get affected. With more CPU or more replicas, the LLM Gateway can scale to tens of thousands of requests per second.
- High Capacity: A t2.2xlarge AWS instance (43$ per month on spot) machine can scale upto ~3000 RPS with no issues.
- Edge Ready: Deploy close to your apps
Learn more on LLM Gateway Benchmarks here: Read more
Supported Providers
Below is a comprehensive list of popular LLM providers that is supported by TrueFoundry LLM Gateway.
Provider | Streaming Supported | How to add models from this provider? | |
---|---|---|---|
GCP | ✅ | https://docs.truefoundry.com/docs/integration-provider-gcp#google-vertex-model-integration | |
AWS | ✅ | https://docs.truefoundry.com/docs/integration-provider-aws#policies-required-for-bedrock-integration | |
Azure OpenAI | ✅ | By using provider API Key in TrueFoundry Integrations | |
Self Hosted Models on TrueFoundry | ✅ | https://docs.truefoundry.com/docs/add-self-hosted-model-to-gateway | |
OpenAI | ✅ | By using provider API Key in TrueFoundry Integrations | |
Cohere | ✅ | By using provider API Key in TrueFoundry Integrations | |
AI21 | ✅ | By using provider API Key in TrueFoundry Integrations | |
Anthropic | ✅ | By using provider API Key in TrueFoundry Integrations | |
Anyscale | ✅ | By using provider API Key in TrueFoundry Integrations | |
Together AI | ✅ | By using provider API Key in TrueFoundry Integrations | |
DeepInfra | ✅ | By using provider API Key in TrueFoundry Integrations | |
Ollama | ✅ | By using provider API Key in TrueFoundry Integrations | |
Palm | ✅ | By using provider API Key in TrueFoundry Integrations | |
Perplexity AI | ✅ | By using provider API Key in TrueFoundry Integrations | |
Mistral AI | ✅ | By using provider API Key in TrueFoundry Integrations | |
Groq | ✅ | By using provider API Key in TrueFoundry Integrations | |
Nomic | ✅ | By using provider API Key in TrueFoundry Integrations |
Updated 3 days ago