Intro to LLM Gateway

LLM Gateway provides a unified interface to manage your organization's LLM usage. Below are the key features that enhance functionality and security:

  • Unified API: Access multiple LLM providers through a single OpenAI-compatible interface, requiring no code changes.

  • API Key Security: Secure and centralized management of credentials.

  • Governance & Control: Set limits, enforce access controls, and apply content filtering to manage how LLMs are used within your organization.

  • Rate Limiting: Implement measures to prevent abuse and ensure fair usage across users.

  • Observability: Track and analyse usage, costs, latency, and overall performance.

  • Cost Management: Monitor spending and configure budget alerts to keep expenses under control.

  • Audit Trails: Maintain logs of all interactions with LLMs to support compliance and auditing requirements.
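As a sketch of the Unified API idea: an existing OpenAI-style chat-completions request can be pointed at the gateway simply by swapping the base URL and key. The endpoint path, API key, and model name below are placeholders for your own deployment's values, not confirmed gateway routes.

```python
import json

# Hypothetical gateway endpoint and key -- substitute your deployment's values.
GATEWAY_BASE_URL = "https://your-gateway.example.com/api/inference/openai"
GATEWAY_API_KEY = "your-gateway-api-key"

def build_chat_request(model: str, messages: list) -> dict:
    """Assemble an OpenAI-compatible chat-completions request for the gateway."""
    return {
        "url": f"{GATEWAY_BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {GATEWAY_API_KEY}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"model": model, "messages": messages}),
    }

request = build_chat_request(
    "openai-main/gpt-4o",  # illustrative model id
    [{"role": "user", "content": "Say hello"}],
)
# Send with any HTTP client, e.g.:
#   requests.post(request["url"], headers=request["headers"], data=request["body"])
```

Because the request shape is unchanged from the OpenAI API, existing SDKs and HTTP clients work as-is; only the base URL and credentials differ.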


LLM Playground

The LLM Playground is a UI for the LLM Gateway where you can try out different models you've added from across providers such as OpenAI, Mistral, and Cohere. Below is an overview of the features:

  1. Support for multiple model types
    1. Chat Models
    2. Completion Models
    3. Embedding Models
  2. Image Upload: Upload images for image captioning or visual question answering. This is only available for models that support images such as GPT-4o.
  3. Model Comparison: Compare responses from different completion models to evaluate their performance.
  4. System Prompts: Use predefined system prompts to guide model behaviour. System prompts inform how the model should respond. Sample system prompt: "Be clear, concise, and polite in your responses. Avoid sharing any sensitive or personal information."
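In OpenAI-compatible chat APIs, a system prompt is simply the first message in the conversation with the `system` role. As a minimal sketch (the helper name is illustrative, not part of the Gateway API):

```python
# The sample system prompt from the docs above.
SYSTEM_PROMPT = (
    "Be clear, concise, and polite in your responses. "
    "Avoid sharing any sensitive or personal information."
)

def with_system_prompt(user_message: str) -> list:
    # Prepend the system message; OpenAI-compatible chat models treat it
    # as standing instructions that apply to the whole conversation.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]

messages = with_system_prompt("Summarise our refund policy in two sentences.")
```

The resulting `messages` list is what you would pass as the `messages` field of a chat-completions request.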

LLM Gateway is Blazing Fast!

  • Near-Zero Overhead: TrueFoundry LLM Gateway adds only ~3 ms of latency up to 250 RPS, and ~4 ms above 300 RPS.
  • Scalability: On a 1 vCPU, 1 GB machine, the LLM Gateway scales without any degradation in performance until about 350 RPS, at which point CPU utilisation reaches 100% and latencies begin to suffer. With more CPU or more replicas, the LLM Gateway can scale to tens of thousands of requests per second.
  • High Capacity: An AWS t2.2xlarge instance ($43 per month on spot) can scale up to ~3000 RPS with no issues.
  • Edge Ready: The Gateway can be deployed on the edge, close to your applications.

Learn more in the LLM Gateway Benchmarks documentation.

Supported Providers

Below is a comprehensive list of popular LLM providers that are supported by TrueFoundry LLM Gateway.

| Provider | Streaming Supported | How to add models from this provider? |
| --- | --- | --- |
| GCP | ✅ | https://docs.truefoundry.com/docs/integration-provider-gcp#google-vertex-model-integration |
| AWS | ✅ | https://docs.truefoundry.com/docs/integration-provider-aws#policies-required-for-bedrock-integration |
| Azure OpenAI | ✅ | By using provider API Key in TrueFoundry Integrations |
| Self Hosted Models on TrueFoundry | ✅ | https://docs.truefoundry.com/docs/add-self-hosted-model-to-gateway |
| OpenAI | ✅ | By using provider API Key in TrueFoundry Integrations |
| Cohere | ✅ | By using provider API Key in TrueFoundry Integrations |
| AI21 | ✅ | By using provider API Key in TrueFoundry Integrations |
| Anthropic | ✅ | By using provider API Key in TrueFoundry Integrations |
| Anyscale | ✅ | By using provider API Key in TrueFoundry Integrations |
| Together AI | ✅ | By using provider API Key in TrueFoundry Integrations |
| DeepInfra | ✅ | By using provider API Key in TrueFoundry Integrations |
| Ollama | ✅ | By using provider API Key in TrueFoundry Integrations |
| Palm | ✅ | By using provider API Key in TrueFoundry Integrations |
| Perplexity AI | ✅ | By using provider API Key in TrueFoundry Integrations |
| Mistral AI | ✅ | By using provider API Key in TrueFoundry Integrations |
| Groq | ✅ | By using provider API Key in TrueFoundry Integrations |
| Nomic | ✅ | By using provider API Key in TrueFoundry Integrations |