The following requirements must be met to set up the compute plane in your Generic cluster.

Generic Infra Requirements

Existing Cluster

| Requirement | Description | Reason for requirement |
| --- | --- | --- |
| Kubernetes Version | Kubernetes version 1.30 or higher | Required for the latest security features and compatibility |
| Worker Nodes | Minimum 3 worker nodes; each worker node: 4 vCPUs, 16 GB RAM; for GPU workloads: NVIDIA GPU-enabled nodes | Required for running core Truefoundry components and user workloads |
| Block storage | Block storage for the worker nodes and persistent volumes | Required so that the worker nodes and deployments have enough storage to run the workloads. Minimum 100 GB per worker node, plus persistent volumes as needed |
| NFS storage | NFS storage for artifact cache | Required so that the ML deployment artifacts have enough storage |
| Egress access for Docker registries | public.ecr.aws, quay.io, ghcr.io, docker.io/truefoundrycloud, docker.io/natsio, nvcr.io, registry.k8s.io | Required to download Docker images for Truefoundry, ArgoCD, NATS, GPU operator, ArgoRollouts, ArgoWorkflows, Istio, and Keda |
| Load Balancer | Load balancer for the services | Required to make the services accessible from the internet when they need to be exposed. The load balancer should be assigned an IP address from the address pool |
| DNS | Domain for service endpoints | Examples: `*.internal.example.com`, `*.external.example.com`, `tfy.example.com`. A wildcard domain is preferred for developer service deployments |
| Certificate | Certificate for the domains | Required for terminating TLS traffic to the services. Can be managed through cert-manager or custom certificates. Check here for more details. |
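
Several of the requirements above can be checked before installation. The following is a minimal Python sketch of such a pre-flight check, not part of Truefoundry's tooling. All function and constant names are hypothetical, the node data is assumed to be collected by you (for example from `kubectl get nodes`), and the egress probe only tests a TCP connection to port 443 on the registry hostnames, which shows that a host is reachable but not that image pulls will succeed.

```python
import re
import socket
from typing import Callable, Dict, Iterable, List, Tuple

# Thresholds taken from the requirements table above.
MIN_K8S = (1, 30)
MIN_NODES = 3
MIN_VCPUS = 4
MIN_RAM_GB = 16

# Hostnames of the registries listed in the egress requirement.
REGISTRY_HOSTS = [
    "public.ecr.aws",
    "quay.io",
    "ghcr.io",
    "docker.io",
    "nvcr.io",
    "registry.k8s.io",
]


def k8s_version_ok(version: str) -> bool:
    """Return True if a version string such as 'v1.30.2' is 1.30 or higher."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= MIN_K8S


def enough_nodes(nodes: Dict[str, Tuple[int, float]]) -> bool:
    """Return True if the cluster has at least the minimum number of workers."""
    return len(nodes) >= MIN_NODES


def undersized_nodes(nodes: Dict[str, Tuple[int, float]]) -> List[str]:
    """Return names of nodes below the per-node minimum.

    `nodes` maps a node name to (vCPUs, RAM in GB), as gathered by the caller.
    """
    return [
        name
        for name, (vcpus, ram_gb) in nodes.items()
        if vcpus < MIN_VCPUS or ram_gb < MIN_RAM_GB
    ]


def tcp_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_hosts(
    hosts: Iterable[str], probe: Callable[[str], bool] = tcp_reachable
) -> List[str]:
    """Return the hosts for which the probe fails, preserving input order."""
    return [host for host in hosts if not probe(host)]
```

To use it, run `unreachable_hosts(REGISTRY_HOSTS)` from a machine or pod inside the cluster network and confirm the result is empty. Note that nodes provisioned with 16 GB of RAM may report slightly less capacity, so you may want to relax `MIN_RAM_GB` when feeding in values reported by the cluster.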