TrueFoundry Key Concepts
Key Concepts and Information Architecture to get started
Before we get started with deploying on TrueFoundry, it's important to understand a few key concepts. These are primarily Cluster, Workspace, ML Repo, and Environment.
Cluster
A cluster (Kubernetes cluster) is a group of machines that can autoscale up and down. Each cluster belongs to exactly one region, but a region can contain multiple clusters. A cluster represents a physical separation between resources. For example, you could have two clusters in Europe West, one cluster in us-east, and one cluster in Asia.
Workspace
A cluster can have multiple workspaces. Each workspace is a logical separation within a cluster so that different teams, applications, and environments can sit within a cluster. Each workspace is essentially a Kubernetes namespace.
For example, suppose a company has three teams: team1 manages application1 and application2, team2 manages application3, and team3 manages application4 and application5. Each application has three environments: dev, staging, and production. One way to organize the workspaces can be:
You can also place the dev and prod workspaces in the same cluster if you prefer that structure.
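As a rough illustration of the team and environment example above, the following Python sketch plans one workspace per team and environment and places production workspaces in a separate cluster. The cluster names, the `team-environment` naming scheme, and the `plan_workspaces` helper are assumptions made for illustration; they are not part of the TrueFoundry API.

```python
from itertools import product

# Teams and the applications they manage, taken from the example above.
TEAMS = {
    "team1": ["application1", "application2"],
    "team2": ["application3"],
    "team3": ["application4", "application5"],
}
ENVIRONMENTS = ["dev", "staging", "prod"]

# Assumption: production workloads live in their own cluster.
CLUSTER_FOR_ENV = {
    "dev": "nonprod-cluster",
    "staging": "nonprod-cluster",
    "prod": "prod-cluster",
}


def plan_workspaces(teams, environments, cluster_for_env):
    """Return {cluster: [workspace, ...]} with one workspace per team and environment."""
    plan = {}
    for team, env in product(teams, environments):
        plan.setdefault(cluster_for_env[env], []).append(f"{team}-{env}")
    return plan


if __name__ == "__main__":
    plan = plan_workspaces(TEAMS, ENVIRONMENTS, CLUSTER_FOR_ENV)
    for cluster, workspaces in plan.items():
        print(cluster, workspaces)
```

Mapping every environment to the same cluster name in `CLUSTER_FOR_ENV` gives the single-cluster variant of the same layout.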
ML Repo
ML Repos in TrueFoundry are like Git repos for code versioning, except that they are meant for versioning ML models, artifacts, and metadata. You can grant access to specific ML Repos to users, teams, or workspaces. Once a workspace has access to an ML Repo, all applications inside the workspace can read from or write to that ML Repo, depending on whether the workspace has viewer or editor access. Because access is granted at the workspace level, there is no need to inject any keys into applications, which makes the entire system more secure.
In the case above, workspace1 has access to MLRepo1, and hence team1 can access the assets in MLRepo1. This way, you can also divide dataset and model access across teams and workspaces.
To do anything in TrueFoundry, you need at least one ML Repo and one workspace.
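The viewer and editor access levels described in this section can be pictured with a small Python sketch. This is only a mental model: the `GRANTS` table, the `Access` enum, the helper names, and the `workspace2` entry are hypothetical and do not reflect how TrueFoundry stores or checks permissions.

```python
from enum import Enum


class Access(Enum):
    VIEWER = "viewer"  # read-only access
    EDITOR = "editor"  # read and write access


# Hypothetical grants: (workspace, ML Repo) -> access level.
GRANTS = {
    ("workspace1", "MLRepo1"): Access.EDITOR,
    ("workspace2", "MLRepo1"): Access.VIEWER,
}


def can_read(workspace, ml_repo):
    """Any grant (viewer or editor) allows applications in the workspace to read."""
    return (workspace, ml_repo) in GRANTS


def can_write(workspace, ml_repo):
    """Only an editor grant allows applications in the workspace to write."""
    return GRANTS.get((workspace, ml_repo)) is Access.EDITOR


if __name__ == "__main__":
    print(can_read("workspace2", "MLRepo1"), can_write("workspace2", "MLRepo1"))
```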
Environment
An environment is a tag applied to workspaces to categorize them based on factors like development stage, team ownership, etc. For example: dev, staging, production, frontend, backend, etc. Three environments are created initially (dev, staging, and prod), which you can change at any time.
It's essential to assign an environment to the cluster and workspace. A cluster can be associated with multiple environments, but a workspace can be associated with only one environment. For each environment, you can also set the following two metadata fields:
- IsProduction: This tells the TrueFoundry platform that it is a production environment, so that it can later give you insights related to access control, availability, and security.
- OptimizeFor (Cost / Availability): This guides optimizations in the cluster toward either minimizing cost or increasing availability.
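To make the two metadata fields concrete, here is a minimal Python sketch of an environment record. The `Environment` class and the `OptimizeFor` enum are illustrative names, and the metadata values chosen for dev, staging, and prod are assumptions rather than TrueFoundry defaults.

```python
from dataclasses import dataclass
from enum import Enum


class OptimizeFor(Enum):
    COST = "Cost"
    AVAILABILITY = "Availability"


@dataclass(frozen=True)
class Environment:
    name: str
    is_production: bool        # mirrors the IsProduction field
    optimize_for: OptimizeFor  # mirrors the OptimizeFor field


# The three initial environments; the metadata values are illustrative assumptions.
INITIAL_ENVIRONMENTS = [
    Environment("dev", is_production=False, optimize_for=OptimizeFor.COST),
    Environment("staging", is_production=False, optimize_for=OptimizeFor.COST),
    Environment("prod", is_production=True, optimize_for=OptimizeFor.AVAILABILITY),
]

# Names of the environments flagged as production.
production_names = [env.name for env in INITIAL_ENVIRONMENTS if env.is_production]

if __name__ == "__main__":
    print(production_names)
```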
Creating a Workspace
You can create a workspace from the Workspace tab in the platform. Once the workspace is created, you can get its FQN (fully qualified name) from the FQN button.
Getting Workspace FQN
In the Workspace section, locate your workspace and click the FQN button on the right side to copy the FQN to your clipboard.
Creating an ML Repo
Prerequisite - Blob Storage Integration
Before you can create an ML Repo, you need to connect one or more blob storages (S3, GCS, Azure Blob, MinIO, etc.) to store the artifacts and models associated with an ML Repo. If this one-time setup is already done, you can skip to the next section.
You can refer to one of the following pages to connect your blob storage to TrueFoundry.
You can then create an ML Repo from the ML Repos tab in the platform.
Granting a Workspace Access to an ML Repo
Granting a workspace access to an ML Repo ensures that every application in that workspace gains access to the ML Repo. You can grant a workspace access to an ML Repo while creating or editing the workspace.
Creating Environments
To create an environment in TrueFoundry, follow these steps:
How to tag a workspace with an environment
To tag a workspace with an environment, the cluster where the workspace resides must first have that environment added.
To do this, add all environments relevant to your cluster (one cluster can have multiple environments) to the cluster, using the instructions provided below.
Once the workspace is tagged, all applications deployed within that workspace will show the workspace's environment beside them.
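The tagging rule can be summarized in a short Python sketch: a workspace can only take an environment that its cluster already has, and it holds exactly one environment at a time. The `tag_workspace` function and its arguments are hypothetical and exist only to illustrate the rule; in practice the tagging is done through the platform.

```python
from typing import Dict, Set


def tag_workspace(
    cluster_environments: Set[str],
    workspace_environment: Dict[str, str],
    workspace: str,
    environment: str,
) -> None:
    """Tag a workspace with one environment, if the cluster has that environment."""
    if environment not in cluster_environments:
        raise ValueError(
            f"Add environment '{environment}' to the cluster before tagging '{workspace}'."
        )
    # A workspace maps to a single environment, so re-tagging overwrites the old value.
    workspace_environment[workspace] = environment


if __name__ == "__main__":
    tags: Dict[str, str] = {}
    tag_workspace({"dev", "staging"}, tags, "team1-dev", "dev")
    print(tags)
```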