Create a Blob Storage integration on the platform (from the Integrations section).
Skip this step if you have already added the integration.
In the "Clusters" section, edit your cluster and set the "Workflow Storage Integration" to the integration from Step 1.
Install Workflow Propeller (the data plane component of Flyte) in the cluster by following the instructions below for your cloud provider. Note: Workflow Propeller requires access to the blob storage via a service account.
Note: Ideally the Blob Storage and the cluster should be in the same Region.
To install tfy-workflow-propeller, follow these steps:
Create a workspace in your cluster with the name tfy-workflow-propeller [Workspaces -> New Workspace].
Open the deployments page and click on New Deployment
Select tfy-workflow-propeller as the workspace and choose Helm as the deployment type.
Click Next and fill in the form with the following values:
Name: tfy-workflow-propeller
Helm repository URL: https://truefoundry.github.io/infra-charts/
Chart Name: tfy-workflow-propeller
Version: 0.0.2
Values: <cloud-specific, refer to the next section of the docs>
Now you need to fill in the Values section of tfy-workflow-propeller. The values depend on your cloud provider and are described in the next section.
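Both cloud-specific values files ask for the control plane host without the `https://` prefix in the `endpoint` field, while `controlPlaneUrl` takes the full URL. A minimal sketch of deriving one from the other in bash is shown here; the function name `control_plane_endpoint` is hypothetical, and the sketch assumes the URL does not already carry an explicit port.

```shell
#!/bin/bash
# Hypothetical helper (not part of the chart): derive the `endpoint` value
# from the full control plane URL used for `controlPlaneUrl`.
control_plane_endpoint() {
  local url="$1"
  local host="${url#*://}"   # drop the scheme, e.g. "https://"
  host="${host%%/*}"         # drop any trailing slash or path
  printf '%s:443\n' "$host"
}

# Example with the sample URL from the values file.
control_plane_endpoint "https://xyz.truefoundry.tech"
```

The printed value can be pasted into the `endpoint` field of the values file.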
On AWS, create an S3 bucket (or use an existing one). It should be in the same region as your cluster.
Create an IAM role that has admin access to this S3 bucket.
Add a trust policy so that this role can be assumed by the following ServiceAccount: flytepropeller in the tfy-workflow-propeller namespace. [There is no need to create the service account; the Helm chart creates it.]
Fill the values in the file below and deploy the tfy-workflow-propeller.
global:
  tenantName: <Enter your tenant name>
  controlPlaneUrl: <Enter the full control plane url e.g. https://xyz.truefoundry.tech>
flyte-core:
  common:
    ingress:
      enabled: false
  secrets:
    adminOauthClientCredentials:
      enabled: true
  storage:
    type: s3
    limits:
      maxDownloadMBs: 2
    bucketName: <Enter your S3 bucket name>
    connection:
      region: <Enter your AWS region>
      auth-type: iam
    enable-multicontainer: true
  webhook:
    enabled: false
  configmap:
    k8s:
      plugins:
        k8s:
          default-env-vars:
            - TFY_INTERNAL_SIGNED_URL_SERVER_HOST: >-
                http://tfy-signed-url-server.tfy-workflow-propeller.svc.cluster.local:3001
    core:
      propeller:
        leader-election:
          enabled: true
          retry-period: 2s
          lease-duration: 15s
          renew-deadline: 10s
          lock-config-map:
            name: propeller-leader
            namespace: tfy-workflow-propeller
        metadata-prefix: s3://<Enter your S3 bucket name>/tfy-workflow-propeller/metadata
        rawoutput-prefix: s3://<Enter your S3 bucket name>/tfy-workflow-propeller/raw_data
        publish-k8s-events: true
    admin:
      admin:
        Command:
          - echo
          - <Enter your cluster token (can be found in tfy-agent helm chart)>
        AuthType: ExternalCommand
        # endpoint: app.truefoundry.com:443
        endpoint: <Enter your control plane host (without https://)>:443
        insecure: false
        AuthorizationHeader: authorization
      event:
        rate: 500
        type: admin
        capacity: 100
    logger:
      level: 5
      show-source: true
  flyteadmin:
    enabled: false
    serviceAccount:
      alwaysCreate: true
  datacatalog:
    enabled: false
  flyteconsole:
    enabled: false
  flytepropeller:
    enabled: true
    serviceAccount:
      # service account is created with name `flytepropeller`
      create: true
      annotations:
        eks.amazonaws.com/role-arn: >-
          <Enter AWS role ARN with access to storage bucket>
  workflow_scheduler:
    enabled: false
  workflow_notifications:
    enabled: false
  cluster_resource_manager:
    enabled: false
tfySignedURLServer:
  env:
    AWS_REGION: <Enter your AWS region here>
    S3_BUCKET_NAME: s3://<Enter your s3 bucket name here>
    DEFAULT_CLOUD_PROVIDER: aws
  enabled: true
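The AWS steps above require that the IAM role can be assumed by the flytepropeller ServiceAccount in the tfy-workflow-propeller namespace. Because the values file uses the `eks.amazonaws.com/role-arn` annotation, this is typically done through a trust policy for IAM roles for service accounts. The sketch below generates such a policy document. The account ID and OIDC provider are placeholder values, and the file name `trust-policy.json` is an assumption. It also assumes your cluster already has an IAM OIDC provider configured.

```shell
#!/bin/bash
# Placeholder values (assumptions): replace with your own account and OIDC provider.
ACCOUNT_ID="123456789012"
OIDC_PROVIDER="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"  # issuer URL without https://
NAMESPACE="tfy-workflow-propeller"
K8S_SA_NAME="flytepropeller"

# Write a trust policy that lets only this Kubernetes ServiceAccount
# assume the role through the cluster's OIDC provider.
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:${K8S_SA_NAME}",
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF

cat trust-policy.json
```

The resulting file can be supplied as the role's trust relationship, for example with `aws iam update-assume-role-policy --role-name <your role name> --policy-document file://trust-policy.json`.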
On GCP, create a GCS bucket (or use an existing one). It should be in the same region as your cluster.
Create a Google Cloud service account that has admin access to this GCS bucket.
Add a policy binding so that this service account can be used by the following Kubernetes ServiceAccount: flytepropeller in the tfy-workflow-propeller namespace. [There is no need to create the Kubernetes service account; the Helm chart creates it.]
You may use the script below to create the service account and the policy bindings described above:
#!/bin/bash
PROJECT="<Enter your GCP PROJECT ID>"
CLUSTER_NAME="<Enter your GCP Cluster Name>"
REGION="<Enter your gcp region>"
BUCKET_NAME="<Enter your GCP bucket name without gs:// prefix>"
TARGET_NAMESPACE=tfy-workflow-propeller

# Clip bucket name to 20 characters
CLIPPED_BUCKET_NAME="${BUCKET_NAME:0:20}"
K8S_SA_NAME="flytepropeller"
GCP_SA_NAME="${CLIPPED_BUCKET_NAME}-flyte-sa"
GCP_SA_ID="${GCP_SA_NAME}@${PROJECT}.iam.gserviceaccount.com"

# Create the Google Cloud service account
gcloud iam service-accounts create "$GCP_SA_NAME" --project="$PROJECT"

# Grant the service account object admin access to the bucket
gcloud storage buckets add-iam-policy-binding "gs://$BUCKET_NAME" \
  --member "serviceAccount:$GCP_SA_ID" \
  --role "roles/storage.objectAdmin" \
  --project "$PROJECT"

# Allow the service account to create tokens for itself
gcloud iam service-accounts add-iam-policy-binding "$GCP_SA_ID" \
  --role "roles/iam.serviceAccountTokenCreator" \
  --member "serviceAccount:$GCP_SA_ID" \
  --project "$PROJECT"

# Let the Kubernetes ServiceAccount act as the Google Cloud service account
gcloud iam service-accounts add-iam-policy-binding "$GCP_SA_ID" \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$PROJECT.svc.id.goog[$TARGET_NAMESPACE/$K8S_SA_NAME]" \
  --project "$PROJECT"

# Grant the service account legacy bucket reader access to the bucket
gcloud storage buckets add-iam-policy-binding "gs://$BUCKET_NAME" \
  --member "serviceAccount:$GCP_SA_ID" \
  --role "roles/storage.legacyBucketReader" \
  --project "$PROJECT"
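The script clips the bucket name to 20 characters before appending `-flyte-sa`. This is presumably because Google Cloud service account IDs must be 6 to 30 characters long and may contain only lowercase letters, digits, and dashes. The sketch below mirrors that naming logic so you can check the resulting name before running the script. The function names and the example bucket name are hypothetical.

```shell
#!/bin/bash
# Mirror of the naming logic in the script above (function names are hypothetical).
gcp_sa_name() {
  local bucket="$1"
  printf '%s-flyte-sa\n' "${bucket:0:20}"
}

# Check the service account ID rules: 6 to 30 characters, starts with a
# lowercase letter, ends with a letter or digit, and contains only
# lowercase letters, digits, and dashes.
is_valid_sa_id() {
  [[ "$1" =~ ^[a-z][a-z0-9-]{4,28}[a-z0-9]$ ]]
}

# Example bucket name (an assumption for illustration).
name="$(gcp_sa_name "acme-ml-workflow-artifacts-prod")"
if is_valid_sa_id "$name"; then
  echo "ok: $name (${#name} characters)"
else
  echo "invalid service account ID: $name" >&2
fi
```

Bucket names may contain dots or underscores, which are not allowed in service account IDs. If the check fails for your bucket, choose the service account name by hand instead of deriving it.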
Fill the values in the file below and deploy the tfy-workflow-propeller.
global:
  tenantName: <Enter your tenant name>
  controlPlaneUrl: <Enter the full control plane url e.g. https://xyz.truefoundry.tech>
flyte-core:
  common:
    ingress:
      enabled: false
  secrets:
    adminOauthClientCredentials:
      enabled: true
  storage:
    gcs:
      projectId: <Enter GCP Project Id here>
    type: gcs
    limits:
      maxDownloadMBs: 2
    bucketName: <Enter GCS bucket name>
  webhook:
    enabled: false
  configmap:
    k8s:
      plugins:
        k8s:
          default-env-vars:
            - TFY_INTERNAL_SIGNED_URL_SERVER_HOST: >-
                http://tfy-signed-url-server.tfy-workflow-propeller.svc.cluster.local:3001
    core:
      propeller:
        leader-election:
          enabled: true
          retry-period: 2s
          lease-duration: 15s
          renew-deadline: 10s
          lock-config-map:
            name: propeller-leader
            namespace: tfy-workflow-propeller
        metadata-prefix: gs://<Enter GCS bucket name>/tfy-workflow-propeller/metadata
        rawoutput-prefix: gs://<Enter GCS bucket name>/tfy-workflow-propeller/raw_data
        publish-k8s-events: true
    admin:
      admin:
        Command:
          - echo
          - <Enter your cluster token (can be found in tfy-agent helm chart)>
        AuthType: ExternalCommand
        # endpoint: app.truefoundry.com:443
        endpoint: <Enter your control plane host (without https://)>:443
        insecure: false
        AuthorizationHeader: authorization
      event:
        rate: 500
        type: admin
        capacity: 100
    logger:
      level: 5
      show-source: true
  flyteadmin:
    enabled: false
  datacatalog:
    enabled: false
  flyteconsole:
    enabled: false
  flytepropeller:
    enabled: true
    serviceAccount:
      create: true
      annotations:
        iam.gke.io/gcp-service-account: >-
          <Enter your GCP Service Account Email>
  workflow_scheduler:
    enabled: false
  workflow_notifications:
    enabled: false
  cluster_resource_manager:
    enabled: false
tfySignedURLServer:
  env:
    GS_BUCKET_NAME: <Enter your GCS bucket name (without gs:// prefix)>
    DEFAULT_CLOUD_PROVIDER: gcp
  enabled: true
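The values files contain many `<Enter ...>` placeholders, and one left unfilled is easy to miss. As a quick check before deploying, the sketch below lists any lines that still contain a placeholder. The function name `unfilled_placeholders` and the file name `values.yaml` are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print (with line numbers) every line of a values file
# that still contains an "<Enter ...>" placeholder.
# Returns 0 if placeholders remain, 1 if none are found.
unfilled_placeholders() {
  grep -n '<Enter' "$1"
}

# Example usage, assuming the values were saved locally as values.yaml.
if [ -f values.yaml ]; then
  if unfilled_placeholders values.yaml; then
    echo "values.yaml still has unfilled placeholders" >&2
  else
    echo "no placeholders left in values.yaml"
  fi
fi
```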