Provisioning Control Plane Infrastructure on GCP
Infrastructure requirements:
Requirements | Description | Reason for Requirement |
---|---|---|
Kubernetes Cluster | Any Kubernetes cluster works. The compute-plane cluster itself can also be used. | The TrueFoundry Helm chart is installed on this cluster. |
CloudSQL Postgres | Postgres >= 13 | The TrueFoundry control plane uses this database to store all its metadata. |
GCS bucket | Any GCS bucket reachable from the control plane. | The control plane uses it to store intermediate code while building Docker images. |
Egress access for TrueFoundry Auth | Egress access to https://auth.truefoundry.com | Needed to validate users logging in to TrueFoundry so that licensing can be maintained. |
Egress access for Docker registries | public.ecr.aws, quay.io, ghcr.io, docker.io/truefoundrycloud, docker.io/natsio, nvcr.io, registry.k8s.io | Needed to download Docker images for TrueFoundry, ArgoCD, NATS, Argo Rollouts, Argo Workflows, and Istio. |
DNS with TLS/SSL | One endpoint that points to the control-plane service (for example, platform.example.com, where example.com is your domain). A certificate for that domain is also required so that it can be accessed over TLS. The control-plane URL must be reachable from the compute plane so that the compute-plane cluster can connect to the control plane. | Developers access the TrueFoundry UI at the domain provided here. |
User/ServiceAccount to provision the infrastructure | Cloud SQL Admin, Security Admin, Service Account Admin, Service Account Token Creator, Service Account User, Storage Admin | These are the permissions the GCP IAM user needs to create all the control-plane components. |
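The egress rows in the table above can be sanity-checked from a machine inside the target network before installation. The following is a minimal sketch, not part of the TrueFoundry tooling: it only tries to open a TCP connection on port 443 to each hostname listed in the table, so a successful result does not guarantee that image pulls or authentication will work. The function names `tcp_connect` and `check_egress` are hypothetical.

```python
import socket
from typing import Callable, Dict, Iterable

# Hostnames copied from the requirements table. Repository paths such as
# docker.io/truefoundrycloud are reduced to the host, since a TCP check
# only needs the hostname.
EGRESS_HOSTS = [
    "auth.truefoundry.com",
    "public.ecr.aws",
    "quay.io",
    "ghcr.io",
    "docker.io",
    "nvcr.io",
    "registry.k8s.io",
]


def tcp_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_egress(
    hosts: Iterable[str],
    connect: Callable[[str], bool] = tcp_connect,
) -> Dict[str, bool]:
    """Map each host to whether the connect function succeeded for it."""
    return {host: connect(host) for host in hosts}


# Example usage (performs real network calls, so it is left commented out):
# for host, ok in check_egress(EGRESS_HOSTS).items():
#     print(f"{host}: {'reachable' if ok else 'blocked'}")
```

The `connect` parameter is injectable so the reporting logic can be exercised without network access, for example in a restricted CI environment.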
GCP Infra Architecture
Create the infrastructure:
You can follow either of the approaches below to create the infrastructure:
- Use OCLI, which uses Terraform to spin up the infrastructure
- Do it yourself manually using the steps provided below
Manually Spin up the Infrastructure:
We recommend this process only if you cannot use OCLI for some reason. Please follow the steps below to spin up the infrastructure:
- Create a Kubernetes cluster with NAP (node auto-provisioning) enabled.
- Spin up a CloudSQL Postgres DB with Postgres version >= 13.
- Create an IAM service account with access to the GCS bucket, Artifact Registry, and Secret Manager (optional). For the control plane, grant the role roles/iam.workloadIdentityUser to the following identities: "serviceAccount:${var.project_id}.svc.id.goog[truefoundry/servicefoundry-server]" and "serviceAccount:${var.project_id}.svc.id.goog[truefoundry/mlfoundry-server]".
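The workload identity binding in the last step can be sketched as follows. This is an illustrative helper, not TrueFoundry's own tooling: it expands the member strings from the step above for a given project ID and prints the matching `gcloud iam service-accounts add-iam-policy-binding` commands for review. The project ID `my-project` and the service account email `tfy-control-plane@my-project.iam.gserviceaccount.com` are hypothetical placeholders, and the function names are assumptions introduced for this example.

```python
import shlex
from typing import List

# Values taken from the manual step above.
WORKLOAD_IDENTITY_ROLE = "roles/iam.workloadIdentityUser"
NAMESPACE = "truefoundry"
K8S_SERVICE_ACCOUNTS = ["servicefoundry-server", "mlfoundry-server"]


def workload_identity_members(project_id: str) -> List[str]:
    """Build the IAM member string for each Kubernetes service account."""
    return [
        f"serviceAccount:{project_id}.svc.id.goog[{NAMESPACE}/{ksa}]"
        for ksa in K8S_SERVICE_ACCOUNTS
    ]


def binding_commands(project_id: str, gsa_email: str) -> List[str]:
    """Return one gcloud command per member; nothing is executed here."""
    return [
        "gcloud iam service-accounts add-iam-policy-binding "
        f"{shlex.quote(gsa_email)} "
        f"--role {WORKLOAD_IDENTITY_ROLE} "
        # The member contains [ and ], so it is shell-quoted.
        f"--member {shlex.quote(member)}"
        for member in workload_identity_members(project_id)
    ]


# Hypothetical project ID and service account email, for illustration only.
for command in binding_commands(
    "my-project", "tfy-control-plane@my-project.iam.gserviceaccount.com"
):
    print(command)
```

The helper only prints the commands so they can be inspected before anyone runs them with credentials that hold the Service Account Admin permission listed in the requirements.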
You can then contact the TrueFoundry team to install TrueFoundry on the cluster.