Infra Set-Up for Workflows

Setting up workflows in a workload cluster that is already connected to TrueFoundry requires the following configuration:

Requirements:

  • Cloud Storage Bucket (S3/GCS/AzureBlob)
  • Service account (or key-based access) with "Admin" permission on the bucket.

Steps

  1. Create a Blob Storage integration on the platform (from the Integrations section).
    Skip this step if you have already added the integration.
  2. In the "Clusters" section, edit your cluster and set "Workflow Storage Integration" to the integration from Step 1.
  3. Install Workflow Propeller (the data plane component of Flyte) in the cluster by following the instructions below for your cloud provider.
  4. Create service accounts (with access to the cloud bucket) in all the workspaces where you want developers to deploy workflows.

Note: Ideally, the blob storage and the cluster should be in the same region.


Setting Up tfy-workflow-propeller

To install tfy-workflow-propeller, follow these steps:

  1. Create a workspace in your cluster with the name tfy-workflow-propeller [Workspaces -> New Workspace]

  2. Open the deployments page and click on New Deployment

  3. Select the workspace tfy-workflow-propeller and choose Helm as the deployment type

  4. Click Next, then fill in the form with the following values:

Name: tfy-workflow-propeller
Helm repository URL: https://truefoundry.github.io/infra-charts/
Chart Name: tfy-workflow-propeller
Version: 0.0.1
Values: <Cloud Specific, refer to the next section of docs>

  5. Fill in the Values section of tfy-workflow-propeller. The values depend on your cloud provider and are described in the next section.

Values of Workflow Propeller

AWS

Steps:

  1. Create an S3 bucket (or use an existing bucket). It should be in the same region as your cluster.
  2. Create a role that has Admin access to the above S3 bucket.
  3. Add a policy to ensure that this role can be assumed by the following ServiceAccounts:
    1. flytepropeller in the tfy-workflow-propeller namespace [No need to create this service account; the helm chart creates it]
    2. tfy-workflows-sa in all namespaces where you want to allow developers to create workflows.
      Note: The ServiceAccount name (tfy-workflows-sa) and the allowed workspaces need to be communicated to the developers.
    3. Create ServiceAccounts named tfy-workflows-sa in all namespaces (as in step 2 above) with the annotation:
      eks.amazonaws.com/role-arn: <Enter AWS role ARN with access to storage bucket> [Can be done at the end]
  4. Fill in the values in the file below and deploy tfy-workflow-propeller.
common:
  ingress:
    enabled: false
secrets:
  adminOauthClientCredentials:
    enabled: true
storage:
  type: s3
  limits:
    maxDownloadMBs: 2
  bucketName: <Enter your S3 bucket name>
  connection:
    region: <Enter your AWS region>
    auth-type: iam
  enable-multicontainer: true
webhook:
  enabled: false
configmap:
  core:
    propeller:
      leader-election:
        enabled: true
        retry-period: 2s
        lease-duration: 15s
        renew-deadline: 10s
        lock-config-map:
          name: propeller-leader
          namespace: tfy-workflow-propeller
      metadata-prefix: s3://<Enter your S3 Bucket Name>/metadata
      rawoutput-prefix: s3://<Enter your S3 Bucket Name>/prop
      publish-k8s-events: true
  admin:
    admin:
      Command:
        - echo
        - <Enter your cluster token (can be found in the tfy-agent helm chart)>
      AuthType: ExternalCommand
      # endpoint: app.truefoundry.com:443
      endpoint: <Enter your control plane host (without https://)>:443
      insecure: false
      AuthorizationHeader: authorization
    event:
      rate: 500
      type: admin
      capacity: 100
  logger:
    level: 5
    show-source: true
flyteadmin:
  enabled: false
  serviceAccount:
    alwaysCreate: true
datacatalog:
  enabled: false
flyteconsole:
  enabled: false
flytepropeller:
  enabled: true
  serviceAccount:
    # service account is created with name `flytepropeller`
    create: true
    annotations:
      eks.amazonaws.com/role-arn: >-
        <Enter AWS role ARN with access to storage bucket>
workflow_scheduler:
  enabled: false
workflow_notifications:
  enabled: false
cluster_resource_manager:
  enabled: false
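
The ServiceAccount from step 3 could be created with a manifest along these lines. This is a minimal sketch rather than a manifest taken from the platform: it assumes the ServiceAccount name tfy-workflows-sa given in step 3, and <workspace-namespace> is a hypothetical placeholder for each namespace where developers deploy workflows.

```yaml
# Sketch only: apply one copy per workspace namespace.
apiVersion: v1
kind: ServiceAccount
metadata:
  # Assumed name, taken from step 3 above
  name: tfy-workflows-sa
  # Hypothetical placeholder: replace with the workspace namespace
  namespace: <workspace-namespace>
  annotations:
    # Same role ARN as used for the flytepropeller ServiceAccount
    eks.amazonaws.com/role-arn: <Enter AWS role ARN with access to storage bucket>
```

For the role to be assumable from this ServiceAccount, the role's trust policy normally has to allow the cluster's OIDC provider with a subject of the form system:serviceaccount:<workspace-namespace>:tfy-workflows-sa, plus system:serviceaccount:tfy-workflow-propeller:flytepropeller for the propeller itself.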

GCP

Steps:

  1. Create a GCS bucket (or use an existing bucket). It should be in the same region as your cluster.
  2. Create a Google Cloud Service Account that has Admin access to the above GCS bucket.
  3. Add a policy to ensure that this Google Cloud Service Account can be used by the following Kubernetes ServiceAccounts:
    1. flytepropeller in the tfy-workflow-propeller namespace [No need to create this service account; the helm chart creates it]
    2. tfy-workflows-sa in all namespaces where you want to allow developers to create workflows.
      Note: The ServiceAccount name (tfy-workflows-sa) and the allowed workspaces need to be communicated to the developers.
    3. Create ServiceAccounts named tfy-workflows-sa in all namespaces (as in step 2 above) with the annotation:
      iam.gke.io/gcp-service-account: <Enter Google Cloud Service Account Email> [Can be done at the end]
  4. Fill in the values in the file below and deploy tfy-workflow-propeller.
flyte-core:
  common:
    ingress:
      enabled: false
  secrets:
    adminOauthClientCredentials:
      enabled: true
  storage:
    gcs:
      projectId: <Enter GCP Project Id here>
    type: gcs
    limits:
      maxDownloadMBs: 2
    bucketName: <Enter GCS bucket name>
  webhook:
    enabled: false
  configmap:
    core:
      propeller:
        leader-election:
          enabled: true
          retry-period: 2s
          lease-duration: 15s
          renew-deadline: 10s
          lock-config-map:
            name: propeller-leader
            namespace: tfy-workflow-propeller
        metadata-prefix: gs://<Enter GCS bucket name>/metadata
        rawoutput-prefix: gs://<Enter GCS bucket name>/raw_data
        publish-k8s-events: true
    admin:
      admin:
        Command:
          - echo
          - <Enter your cluster token (can be found in the tfy-agent helm chart)>
        AuthType: ExternalCommand
        # endpoint: app.truefoundry.com:443
        endpoint: <Enter your control plane host (without https://)>:443
        insecure: false
        AuthorizationHeader: authorization
      event:
        rate: 500
        type: admin
        capacity: 100
    logger:
      level: 5
      show-source: true
  flyteadmin:
    enabled: false
  datacatalog:
    enabled: false
  flyteconsole:
    enabled: false
  flytepropeller:
    enabled: true
    serviceAccount:
      create: true
      annotations:
        iam.gke.io/gcp-service-account: >-
          <Enter your GCP Service Account Email>
  workflow_scheduler:
    enabled: false
  workflow_notifications:
    enabled: false
  cluster_resource_manager:
    enabled: false
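
The Kubernetes ServiceAccount from step 3 could be created with a manifest along these lines. This is a minimal sketch rather than a manifest taken from the platform: it assumes the ServiceAccount name tfy-workflows-sa given in step 3, and <workspace-namespace> is a hypothetical placeholder for each namespace where developers deploy workflows.

```yaml
# Sketch only: apply one copy per workspace namespace.
apiVersion: v1
kind: ServiceAccount
metadata:
  # Assumed name, taken from step 3 above
  name: tfy-workflows-sa
  # Hypothetical placeholder: replace with the workspace namespace
  namespace: <workspace-namespace>
  annotations:
    # Same Google Cloud Service Account as used for flytepropeller
    iam.gke.io/gcp-service-account: <Enter Google Cloud Service Account Email>
```

With GKE Workload Identity, the Google Cloud Service Account usually also needs an IAM binding that grants roles/iam.workloadIdentityUser to the member serviceAccount:<project-id>.svc.id.goog[<workspace-namespace>/tfy-workflows-sa], and likewise for tfy-workflow-propeller/flytepropeller.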


Azure

  1. Create an Azure Blob Storage account and container (or use an existing container). It should be in the same region as your cluster.
  2. Obtain a Storage Account Key (it can be found in the connection string) with Admin access to the storage account.
  3. Fill in the values in the file below and deploy tfy-workflow-propeller.
flyte-core:
  common:
    ingress:
      enabled: false
  secrets:
    adminOauthClientCredentials:
      enabled: true
  storage:
    type: custom
    custom:
      stow:
        kind: azure
        config:
          key: <Enter your Azure Storage Account Key>
          account: <Enter your Storage Account Name>
      type: stow
      container: <Enter name of the Storage Container>
      connection: {}
      enable-multicontainer: true
    limits:
      maxDownloadMBs: 2
  webhook:
    enabled: false
  configmap:
    k8s:
      plugins:
        k8s:
          default-env-vars:
            - AZURE_STORAGE_ACCOUNT_NAME: <Enter your Storage Account Name>
            - AZURE_STORAGE_ACCOUNT_KEY: <Enter your Azure Storage Account Key>
    core:
      propeller:
        leader-election:
          enabled: true
          retry-period: 2s
          lease-duration: 15s
          renew-deadline: 10s
          lock-config-map:
            name: propeller-leader
            namespace: tfy-workflow-propeller
        metadata-prefix: abfs://<Enter name of the Storage Container>/workflows/metadata
        rawoutput-prefix: abfs://<Enter name of the Storage Container>/workflows/raw_data
        publish-k8s-events: true
    admin:
      admin:
        Command:
          - echo
          - <Enter your cluster token (can be found in the tfy-agent helm chart)>
        AuthType: ExternalCommand
        # endpoint: app.truefoundry.com:443
        endpoint: <Enter your control plane host (without https://)>:443
        insecure: false
        AuthorizationHeader: authorization
      event:
        rate: 500
        type: admin
        capacity: 100
    logger:
      level: 5
      show-source: true
  flyteadmin:
    enabled: false
  datacatalog:
    enabled: false
  flyteconsole:
    enabled: false
  flytepropeller:
    enabled: true
    serviceAccount:
      create: false
  workflow_scheduler:
    enabled: false
  workflow_notifications:
    enabled: false
  cluster_resource_manager:
    enabled: false