Generic
Requirements
Requirements for Truefoundry installation on Generic
The following requirements must be met to set up the compute plane in your Generic cluster.
Generic Infra Requirements
Existing Cluster
Requirements | Description | Reason for requirement |
---|---|---|
Kubernetes Version | Kubernetes version 1.30 or higher | Required for latest security features and compatibility |
Worker Nodes | - Minimum 3 worker nodes - Each worker node: 4 vCPUs, 16GB RAM - For GPU workloads: NVIDIA GPU-enabled nodes | Required for running core Truefoundry components and user workloads |
Block storage | Block storage for the worker nodes and persistent volumes | Required so that the worker nodes and deployments have enough storage to run the workloads. Minimum 100GB per worker node, plus persistent volumes as needed |
NFS storage | NFS storage for artifact cache | Required so that ML deployment artifacts have enough storage for caching |
Egress access for Docker registries | - public.ecr.aws - quay.io - ghcr.io - docker.io/truefoundrycloud - docker.io/natsio - nvcr.io - registry.k8s.io | Required to download Docker images for Truefoundry, ArgoCD, NATS, GPU operator, Argo Rollouts, Argo Workflows, Istio, and KEDA. |
Load Balancer | Load Balancer for the services | Required so that the services are accessible from the internet if they need to be exposed. The load balancer should automatically be assigned an IP address from the address pool |
DNS | Domain for service endpoints | Examples: *.internal.example.com, *.external.example.com, tfy.example.com. A wildcard domain is preferred for developer service deployments |
Certificate | Certificate for the domains | Required for terminating TLS traffic to the services. Can be managed through cert-manager or custom certificates. Check here for more details. |
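
The numeric thresholds in the table above (Kubernetes 1.30 or higher, at least 3 worker nodes, 4 vCPUs and 16GB RAM per node, 100GB of block storage per node) can be checked before installation. The following is a minimal Python sketch of such a pre-flight check, not an official Truefoundry tool. The `WorkerNode` class and the `parse_k8s_version` and `check_cluster` functions are hypothetical helpers introduced here for illustration, and node details are assumed to be entered by hand rather than read from the Kubernetes API.

```python
import re
from dataclasses import dataclass

# Thresholds taken from the requirements table.
MIN_K8S_VERSION = (1, 30)
MIN_WORKER_NODES = 3
MIN_VCPUS = 4
MIN_RAM_GB = 16
MIN_DISK_GB = 100


@dataclass
class WorkerNode:
    """Hypothetical description of one worker node (illustration only)."""
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int


def parse_k8s_version(version: str) -> tuple:
    """Extract (major, minor) from strings such as 'v1.30.2' or '1.31'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_cluster(version: str, nodes: list) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if parse_k8s_version(version) < MIN_K8S_VERSION:
        problems.append(f"Kubernetes {version} is older than 1.30")
    if len(nodes) < MIN_WORKER_NODES:
        problems.append(
            f"found {len(nodes)} worker nodes, need at least {MIN_WORKER_NODES}"
        )
    for node in nodes:
        if node.vcpus < MIN_VCPUS:
            problems.append(f"{node.name}: {node.vcpus} vCPUs, need {MIN_VCPUS}")
        if node.ram_gb < MIN_RAM_GB:
            problems.append(f"{node.name}: {node.ram_gb}GB RAM, need {MIN_RAM_GB}GB")
        if node.disk_gb < MIN_DISK_GB:
            problems.append(f"{node.name}: {node.disk_gb}GB disk, need {MIN_DISK_GB}GB")
    return problems


if __name__ == "__main__":
    # Example inventory; replace with the values from your own cluster.
    inventory = [
        WorkerNode("worker-1", vcpus=4, ram_gb=16, disk_gb=100),
        WorkerNode("worker-2", vcpus=8, ram_gb=32, disk_gb=200),
        WorkerNode("worker-3", vcpus=2, ram_gb=8, disk_gb=50),
    ]
    for problem in check_cluster("v1.30.2", inventory):
        print(problem)
```

The sketch covers only the requirements that have numeric thresholds. Registry egress, load balancer address assignment, DNS, and certificates depend on the network environment and would need to be verified separately.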