Deploy AI Gateway Only
Compute Requirements
To install the control plane, you need a Kubernetes cluster and a managed Postgres database. TrueFoundry ships as a Helm chart (https://github.com/truefoundry/infra-charts/tree/main/charts/truefoundry) with configurable options to deploy both the Deployment and AI Gateway features, or only one of them, according to your needs. The compute requirements depend on the set of features enabled and on the number of users and requests.
Here are a few scenarios that you can choose from based on your needs.
The small tier is recommended for development purposes. All components are deployed on Kubernetes in non-HA mode, with a single replica of each service. This is suitable for testing the different features of TrueFoundry, but the setup is not highly available and we do not recommend it for production.
Component | CPU | Memory | Storage | Min Nodes | Remarks |
---|---|---|---|---|---|
Helm-Chart (AI Gateway Control Plane components) | 2 vCPU | 8GB | 60GB Persistent Volumes (Block Storage) on Kubernetes | 2 (pods should be spread over at least 2 nodes) | Cost: ~$120 per month |
Helm-Chart (AI Gateway component only) | 1 vCPU | 512Mi | - | 1 (pods should be spread over at least 1 node) | Cost: ~$10 per month |
Postgres (Deployed on Kubernetes) | 0.5 vCPU | 0.5GB | 5GB Persistent Volumes (Block Storage) on Kubernetes | - | PostgreSQL version >= 13 |
Blob Storage (S3 Compatible) | - | - | 20GB | - | - |
The medium tier is configured for production and is sufficient for teams of 10-500 members.
The AI Gateway is configured with a minimum of 3 replicas (1 vCPU and 1GB each), which together can handle around 500 requests/second to LLMs.
It is horizontally scalable and autoscales when the load increases. The Block Storage and S3 are used to store LLM request logs. The storage required depends on the size and number of requests and should be set according to the expected usage.
Component | CPU | Memory | Storage | Min Nodes | Remarks |
---|---|---|---|---|---|
Helm-Chart (AI Gateway Control Plane components) | 16 vCPU | 48GB | 250GB Persistent Volumes (Block Storage) on Kubernetes | 3 (pods should be spread over at least 3 nodes) | Cost: ~$800 per month |
Helm-Chart (AI Gateway component only) | 3 vCPU | 3GB | - | 3 (pods should be spread over at least 3 nodes) | Cost: ~$30 per month |
Postgres (Managed Database) | 2 vCPU | 4GB | 30GB Persistent Volumes (Block Storage) on Kubernetes | - | PostgreSQL version >= 13 |
Blob Storage (S3 Compatible) | - | - | 500GB | - | - |
The large tier is configured for production and is sufficient for organizations of 500-50000 members.
The AI Gateway is configured with a minimum of 10 replicas (1 vCPU and 1GB each), which together can handle around 2000 requests/second to LLMs.
It is horizontally scalable and autoscales when the load increases. The Block Storage and S3 are used to store LLM request logs. The storage required depends on the size and number of requests and should be set according to the expected usage.
Component | CPU | Memory | Storage | Min Nodes | Remarks |
---|---|---|---|---|---|
Helm-Chart (AI Gateway Control Plane components) | 32 vCPU | 64GB | 400GB Persistent Volumes (Block Storage) on Kubernetes | 10 (pods should be spread over at least 10 nodes) | Cost: ~$1400 per month |
Helm-Chart (AI Gateway component only) | 10 vCPU | 10GB | - | 10 (pods should be spread over at least 10 nodes) | Cost: ~$100 per month |
Postgres (Managed Database) | 2 vCPU | 4GB | 30GB Persistent Volumes (Block Storage) on Kubernetes | - | PostgreSQL version >= 13 |
Blob Storage (S3 Compatible) | - | - | 1000GB | - | - |
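The throughput figures quoted for the medium and large tiers (a minimum of 3 replicas for around 500 requests/second, and 10 replicas for around 2000 requests/second) can be turned into a rough sizing estimate. The Python sketch below is illustrative only: it assumes throughput scales linearly with the number of 1 vCPU / 1GB replicas, which the figures above suggest but do not guarantee. The names `TIER_BASELINES` and `estimate_replicas` and the `headroom` parameter are hypothetical and not part of TrueFoundry.

```python
import math

# Reference points taken from the tier descriptions:
# tier -> (minimum replicas, approximate requests/second those replicas handle).
TIER_BASELINES = {
    "medium": (3, 500),
    "large": (10, 2000),
}


def estimate_replicas(target_rps: float, tier: str = "medium", headroom: float = 0.0) -> int:
    """Estimate the AI Gateway replica count for a target request rate.

    Assumes linear scaling from the tier's quoted baseline (an assumption,
    not a documented guarantee). `headroom` is an optional safety margin,
    e.g. 0.2 for 20% spare capacity. The result never drops below the
    tier's minimum replica count.
    """
    min_replicas, baseline_rps = TIER_BASELINES[tier]
    # Multiply before dividing so exact baselines do not suffer rounding error.
    needed = math.ceil(target_rps * (1 + headroom) * min_replicas / baseline_rps)
    return max(min_replicas, needed)
```

For example, `estimate_replicas(1000, "medium")` doubles the medium baseline and returns 6 replicas. Treat the result as a starting point for autoscaling limits, and validate it with a load test on your own traffic.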
Prerequisites for Installation
- Egress access to auth.truefoundry.com and analytics.truefoundry.com
- A domain to map to the ingress of the frontend and gateway
- Tenant, Licence key, and image pull secret from the TrueFoundry team
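The egress prerequisite can be checked before installation. The Python sketch below attempts a TCP connection to each of the two hosts listed above. It assumes the endpoints are reached over HTTPS on port 443, which the prerequisites do not state explicitly. The function name `check_egress` and its parameters are hypothetical.

```python
import socket
from typing import Callable, Dict, Iterable

# Hosts named in the prerequisites list.
REQUIRED_HOSTS = ("auth.truefoundry.com", "analytics.truefoundry.com")


def check_egress(
    hosts: Iterable[str] = REQUIRED_HOSTS,
    port: int = 443,  # assumption: HTTPS; the source does not name a port
    timeout: float = 5.0,
    connect: Callable = socket.create_connection,
) -> Dict[str, bool]:
    """Return a mapping of host -> True if a TCP connection succeeded.

    `connect` is injectable so the logic can be exercised without
    network access; by default it opens a real socket.
    """
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[host] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[host] = False
    return results
```

Calling `check_egress()` with no arguments opens real connections, so run it from a pod or machine that shares the cluster's network path. A successful TCP connection only shows that the route is open; it does not confirm that a proxy or firewall will allow the actual HTTPS traffic.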
Installation Instructions
- Create a values file as given below and replace relevant values:
- Add the Helm repo
- Install the Helm chart with your values file