This page provides an architecture overview, requirements, and steps to set up a TrueFoundry compute plane cluster in AWS.
Access Policies Overview
Policy | Description |
---|---|
ELBControllerPolicy | Role assumed by load balancer controller to provision ELB when a service of type LoadBalancer is created |
KarpenterPolicy and SQSPolicy | Role assumed by Karpenter to dynamically provision nodes and handle spot node termination |
EFSPolicy | Role assumed by EFS CSI to provision and attach EFS volumes |
EBSPolicy | Role assumed by EBS CSI to provision and attach EBS volumes |
RolePolicy with policies for ECR, S3, SSM, and EKS (use the trust relationship) | Role assumed by TrueFoundry to allow access to ECR, S3, and SSM services. If you are using TrueFoundry’s control plane, the role will be assumed by arn:aws:iam::416964291864:role/tfy-ctl-euwe1-production-truefoundry-deps; otherwise, it will be assumed by your control plane’s IAM role |
ClusterRole with policies: AmazonEKSClusterPolicy, AmazonEKSVPCResourceControllerPolicy, EncryptionPolicy | Role that provides Kubernetes permissions to manage the cluster lifecycle, networking, and encryption |
NodeRole with policies: AmazonEC2ContainerRegistryReadOnlyPolicy, AmazonEKS_CNI_Policy, AmazonEKSWorkerNodePolicy, AmazonSSMManagedInstanceCorePolicy | Role assumed by EKS nodes to work with AWS resources for ECR access, IP assignment, and cluster registration |
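
The trust relationship mentioned in the table can be pictured as a standard IAM trust policy that names the control plane role as the principal. The sketch below only illustrates that general shape. The function name `build_trust_policy` is hypothetical, and the policy generated by the actual Terraform code may include additional statements or conditions not shown here.

```python
import json

# ARN quoted in the table above for TrueFoundry's hosted control plane.
# Assumption: if you run your own control plane, substitute its IAM role ARN.
TFY_CONTROL_PLANE_ROLE_ARN = (
    "arn:aws:iam::416964291864:role/"
    "tfy-ctl-euwe1-production-truefoundry-deps"
)


def build_trust_policy(principal_arn: str) -> dict:
    """Return an IAM trust policy that lets `principal_arn` assume the role.

    Hypothetical helper for illustration; it is not part of any TrueFoundry tooling.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy document as JSON for inspection.
    print(json.dumps(build_trust_policy(TFY_CONTROL_PLANE_ROLE_ARN), indent=2))
```

Comparing this shape with the trust relationship on the role created in your account is a quick way to confirm that the expected principal is the one allowed to assume it.
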
/24 or larger. This is to ensure capacity for ~250 instances and 4096 pods.
public.ecr.aws, quay.io, ghcr.io, tfy.jfrog.io, docker.io/natsio, nvcr.io, registry.k8s.io so that we can download the Docker images for Argo CD, NATS, GPU Operator, Argo Rollouts, Argo Workflows, Istio, KEDA, etc.
services.example.com/tfy/*; however, many frontend applications do not support this.
Choose to create a new cluster or attach an existing cluster
Clusters. You can click on Create New Cluster or Attach Existing Cluster depending on your use case. Read the requirements and, if everything is satisfied, click on Continue.
Fill out the form to generate the Terraform code
Submit when done.
Cluster Name - A name for your cluster.
Region - The region where you want to create the cluster.
Network Configuration - Choose between New VPC or Existing VPC depending on your use case.
Authentication - This is how you are authenticated to AWS on your local machine. It’s used to configure Terraform to authenticate with AWS.
S3 Bucket for Terraform State - Terraform state will be stored in this bucket. It can be a preexisting bucket or a new bucket name. A new bucket will be created automatically by our script.
Load Balancer Configuration - This configures the load balancer for your cluster. You can choose between a Public or Private load balancer; it defaults to Public. You can also add certificate ARNs and domain names for the load balancer, but these are optional.
Platform Features - This decides which features, such as BlobStorage, ClusterIntegration, ParameterStore, DockerRegistry, and SecretsManager, will be enabled for your cluster. To read more on how these integrations are used in the platform, please refer to the platform features page.
Copy the curl command and execute it on your local machine
curl command to download and execute the script. The script will take care of installing the prerequisites, downloading the Terraform code, and running it on your local machine to create the cluster. This will take around 40-50 minutes to complete.
Verify the cluster is showing as connected in the platform
Create DNS Record
Base Domain URL section.
Record Type | Record Name | Record Value |
---|---|---|
CNAME | *.tfy.example.com | LOADBALANCER_IP_ADDRESS |
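
If the domain happens to be hosted in Amazon Route 53 (an assumption; any DNS provider can hold this record), the row above corresponds to a change batch like the one sketched below. The helper name `build_cname_change_batch` and the 300-second TTL are illustrative choices, and `LOADBALANCER_IP_ADDRESS` is the same placeholder used in the table.

```python
import json


def build_cname_change_batch(record_name: str, lb_address: str, ttl: int = 300) -> dict:
    """Build a Route 53 change batch that upserts a single CNAME record.

    Hypothetical helper for illustration; the TTL default is an arbitrary choice.
    """
    return {
        "Comment": "Wildcard record for workloads on the cluster",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": lb_address}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values taken from the table; replace them with your own.
    batch = build_cname_change_batch("*.tfy.example.com", "LOADBALANCER_IP_ADDRESS")
    print(json.dumps(batch, indent=2))
```

The printed JSON could be saved to a file and applied with `aws route53 change-resource-record-sets --hosted-zone-id <your-zone-id> --change-batch file://batch.json`, where the hosted zone ID is yours to supply.
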
Set up routing and TLS for deploying workloads to your cluster
Start deploying workloads to your cluster