You can deploy Truefoundry AI Gateway in the following modes:
  1. Truefoundry Managed SAAS
  2. Truefoundry Managed Gateway with data storage on your own infrastructure.
  3. Gateway Plane and Data Storage on your own infrastructure.
  4. Control Plane, Gateway Plane and Data Storage on your own infrastructure.

Option 1: Truefoundry Managed SAAS

This is a fully managed solution on Truefoundry’s secure cloud infrastructure with enterprise-grade features.
Truefoundry Managed SAAS

This is ideal for small, mid-size, or enterprise organizations that want to use the Truefoundry AI Gateway without the operational overhead of self-hosting.
The key features and advantages of this deployment option are:
  1. Globally distributed gateway to minimize latency: The Truefoundry gateway is deployed across multiple regions, zones, and cloud providers to provide low latency and high availability.
  2. Zero maintenance overhead: There is no infrastructure for you to maintain, and you get access to the latest features and improvements.
  3. Data is encrypted at rest and in transit.
  4. Truefoundry infrastructure is SOC2, ISO27001, GDPR, and HIPAA compliant.
Multi-region and multi-cloud deployment of the Gateway

Option 2: Truefoundry Managed Gateway with data storage on your own infrastructure.

In this case, the gateway and control-plane are managed by Truefoundry on its own infrastructure, but the LLM data is stored on your own infrastructure.
Truefoundry Managed Gateway with data storage on your own infrastructure

The advantages of this deployment option are as follows:
  1. No infrastructure management: The gateway is globally distributed, highly available, and fully managed by Truefoundry. You only need to provide the blob storage (an AWS S3 bucket, a GCS bucket, Azure Blob Storage, or any other S3-compatible storage), and the control-plane will use it to store the LLM data.
  2. You retain data ownership: You retain full ownership of the request-response data since it is stored in a bucket on your end. The data is stored in Parquet format, so you can use it for analytics, debugging, and evaluation via Spark, DuckDB, Athena, or any tool of your choice.
In this mode, the request-response data flows through the Truefoundry control-plane and is then stored in blob storage on your infrastructure. The control-plane might cache some of the data for faster queries, but it doesn’t retain the data long term. When you browse the request logs in the Truefoundry dashboard, the data is fetched from your blob storage, so you might incur egress charges from your cloud provider.
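Because the logs land in your own bucket as Parquet files, you can query them directly. The following is a minimal sketch using DuckDB from Python. The bucket prefix `s3://my-llm-logs/gateway` and the helper names are hypothetical, the directory layout under the prefix is assumed to be arbitrary (hence the recursive glob), and no column names are assumed because the log schema is not described here. It only counts rows.

```python
def build_count_query(bucket_prefix: str) -> str:
    """Build a DuckDB SQL query that counts rows across all Parquet files
    found under the given prefix (hypothetical layout, recursive glob)."""
    if "'" in bucket_prefix:
        # Keep the sketch safe: refuse values that would break the SQL literal.
        raise ValueError("bucket_prefix must not contain single quotes")
    prefix = bucket_prefix.rstrip("/")
    return f"SELECT COUNT(*) AS n FROM read_parquet('{prefix}/**/*.parquet')"


def count_log_rows(bucket_prefix: str) -> int:
    """Run the count query with DuckDB. Requires `pip install duckdb` and
    cloud credentials configured for DuckDB separately."""
    import duckdb  # third-party; imported lazily so the helper above works without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension for reading from S3-compatible storage
    con.execute("LOAD httpfs")
    return con.execute(build_count_query(bucket_prefix)).fetchone()[0]


if __name__ == "__main__":
    # Hypothetical bucket prefix; replace with your own.
    print(build_count_query("s3://my-llm-logs/gateway"))
```

Running queries like this from outside your cloud provider's network reads the files over the internet, so the same egress consideration applies as when browsing logs in the dashboard.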

Option 3: Gateway Plane and Data Storage on your own infrastructure.

In this case, the gateway and the data storage are hosted on your own infrastructure. The gateway exports the request-response data to the ingestion server which then stores the data in your own blob storage. The control-plane stores the metrics and has access to the bucket containing the request-response data.
Gateway Plane and Data Storage on your own infrastructure

The key features of this mode of deployment are:
  1. LLM Traffic stays within your own premises: All LLM traffic stays within your own infrastructure and Truefoundry doesn’t come into the live path of a request to LLM.
  2. You retain full control over your data: You retain full ownership of the request-response data since it is stored in a bucket on your end. The data is stored in Parquet format, so you can use it for analytics, debugging, and evaluation via Spark, DuckDB, Athena, or any tool of your choice.
  3. Management of Gateway and Ingestion Service: The gateway and ingestion service availability needs to be managed on your end. In case you are operating in multiple regions, you will need to deploy the gateway in multiple regions. (The Ingestion service doesn’t need to be deployed in all regions.)
  4. Truefoundry control plane has access to the bucket containing the data: This access lets you browse the request logs on the Truefoundry dashboard. You will not be able to use this feature if you don’t give the Truefoundry control plane access to your bucket.
In this mode, the request-response data does not flow through the control-plane; it is stored directly in your blob storage. When you browse the request logs in the Truefoundry dashboard, the data is fetched from your blob storage, so you might incur egress charges from your cloud provider. The data might be cached temporarily in the control-plane for faster queries.

Option 4: Control Plane, Gateway Plane and Data Storage on your own infrastructure.

In this case, everything except the authentication/licensing server is hosted on your own infrastructure.
Control Plane, Gateway Plane and Data Storage on your own infrastructure

The only data sent to the authentication/licensing server are the email addresses of the employees using the platform and the count of requests flowing through the gateway. This helps us keep track of licenses and billing. To understand how SSO works with our central authentication server, refer to this page. This mode of deployment works well for air-gapped environments.
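To compare the four options at a glance, the hosting arrangement described above can be captured in a small data structure. This is an illustrative sketch only: the class, constants, and function names are hypothetical and not part of any Truefoundry API, and the authentication/licensing server (which Truefoundry hosts in Option 4) is left out for brevity.

```python
from dataclasses import dataclass
from typing import List

TRUEFOUNDRY = "truefoundry"  # hosted on Truefoundry's infrastructure
CUSTOMER = "customer"        # hosted on your own infrastructure


@dataclass(frozen=True)
class DeploymentMode:
    option: int
    control_plane: str
    gateway: str
    data_storage: str


# Where each component lives in each option, as described in this page.
MODES = [
    DeploymentMode(1, TRUEFOUNDRY, TRUEFOUNDRY, TRUEFOUNDRY),
    DeploymentMode(2, TRUEFOUNDRY, TRUEFOUNDRY, CUSTOMER),
    DeploymentMode(3, TRUEFOUNDRY, CUSTOMER, CUSTOMER),
    DeploymentMode(4, CUSTOMER, CUSTOMER, CUSTOMER),
]


def options_hosting(component: str, host: str) -> List[int]:
    """Return the option numbers in which `component` runs on `host`."""
    return [m.option for m in MODES if getattr(m, component) == host]


if __name__ == "__main__":
    # Options in which LLM traffic is served by a gateway you host yourself.
    print(options_hosting("gateway", CUSTOMER))
```

For example, if a requirement is that the gateway must run on your own infrastructure, `options_hosting("gateway", CUSTOMER)` narrows the choice to Options 3 and 4.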