Truefoundry can be installed on your own AWS, GCP or Azure account. Its architecture consists of a central control plane to which you can connect multiple Kubernetes clusters. The Truefoundry architecture looks as follows:

Truefoundry comprises a control plane that helps manage multiple Kubernetes clusters. These clusters can run on any cloud provider and in different VPCs. Data scientists and developers interact with a web UI hosted on the control plane, which manages all deployments and machine learning metadata. A central authentication server hosted on the Truefoundry cloud receives users' login and usage information from the control plane.
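The relationship between the control plane and its clusters can be pictured with a small sketch. This is only an illustrative model of the description above, not Truefoundry code: the `Cluster` and `ControlPlane` classes, their fields, and the cluster names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for a workload Kubernetes cluster."""
    name: str
    provider: str  # e.g. "aws", "gcp" or "azure"
    vpc: str


@dataclass
class ControlPlane:
    """Hypothetical stand-in for the central control plane."""
    clusters: Dict[str, Cluster] = field(default_factory=dict)

    def connect(self, cluster: Cluster) -> None:
        # One control plane can hold many clusters, but each name only once.
        if cluster.name in self.clusters:
            raise ValueError(f"cluster {cluster.name!r} is already connected")
        self.clusters[cluster.name] = cluster

    def providers(self) -> Set[str]:
        # The connected clusters may span several cloud providers.
        return {c.provider for c in self.clusters.values()}


control_plane = ControlPlane()
control_plane.connect(Cluster("prod-us", "aws", "vpc-a"))
control_plane.connect(Cluster("research-eu", "gcp", "vpc-b"))
```

The point of the sketch is the one-to-many shape: a single control plane object tracks clusters that live with different providers and in different VPCs.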

Truefoundry can be deployed on your own cloud in two different ways:

Truefoundry Control Plane on your own cloud

In this case, the Truefoundry control plane is also deployed on your cloud. All the compute, data and UI live in your own cloud account. The authentication server stays in the Truefoundry cloud, and the control plane communicates license and authentication information to it. No user or customer data flows out of your cloud in this case.

Pros:

Complete protection of data: no user or customer data flows out of your cloud.

Cons:

You incur the cost of hosting the control plane, and you must upgrade the control plane manually.

Hosted Truefoundry Control Plane

In this case, the control plane is hosted by Truefoundry and you connect your own Kubernetes clusters to it. Upgrades are managed by the Truefoundry team, and you need no other infrastructure to onboard onto Truefoundry.

Components

Truefoundry comprises the following components:

Experiment Tracking and ML Metadata Store (MLFoundry)

MLFoundry is used to log models, datasets, metrics and parameters related to model training, and helps maintain lineage between runs and artifacts.
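To make the idea of lineage between runs and artifacts concrete, here is a minimal in-memory sketch. It is not the mlfoundry API: `MetadataStore`, `Run`, and every method name below are hypothetical and exist only to illustrate what a metadata store records.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    """Hypothetical record of one training run."""
    name: str
    params: Dict[str, object] = field(default_factory=dict)
    metrics: Dict[str, List[float]] = field(default_factory=dict)
    artifacts: List[str] = field(default_factory=list)

    def log_params(self, params: Dict[str, object]) -> None:
        self.params.update(params)

    def log_metric(self, key: str, value: float) -> None:
        # Keep the full history so a metric can be tracked over time.
        self.metrics.setdefault(key, []).append(value)

    def log_artifact(self, artifact_name: str) -> None:
        self.artifacts.append(artifact_name)


class MetadataStore:
    """Hypothetical store that links artifacts back to the runs that produced them."""

    def __init__(self) -> None:
        self.runs: Dict[str, Run] = {}

    def create_run(self, name: str) -> Run:
        run = Run(name)
        self.runs[name] = run
        return run

    def lineage(self, artifact_name: str) -> List[str]:
        # Names of all runs that logged the given artifact.
        return [r.name for r in self.runs.values() if artifact_name in r.artifacts]


store = MetadataStore()
run = store.create_run("baseline")
run.log_params({"learning_rate": 0.01, "epochs": 2})
run.log_metric("accuracy", 0.81)
run.log_metric("accuracy", 0.86)
run.log_artifact("model-v1")
```

Given an artifact name, `store.lineage("model-v1")` returns the runs that produced it, which is the kind of question a metadata store with lineage is meant to answer.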

ML Training and Inference Deployment (ServiceFoundry)

ServiceFoundry helps data scientists and machine learning engineers deploy jobs and services on Kubernetes. It also provides an internal developer platform to view all the deployed services along with their cost, and to manage permissions and access control.

Truefoundry Dashboard

This dashboard lets you view the data from MLFoundry, ServiceFoundry and ML-Monitoring in one place.

Tfy-agent

This component runs on each workload cluster and communicates with the central control plane.

Client Libraries

We provide two client libraries for data scientists, engineers and DevOps teams to interact with the services mentioned above. The two libraries are:

  1. mlfoundry (pip install mlfoundry)
  2. servicefoundry (pip install servicefoundry)
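After running the pip commands above, a quick way to confirm that both libraries are available in the current environment is sketched below. It uses only the Python standard library and assumes that each package's import name matches its pip name; the helper `installed_packages` is a hypothetical name introduced for this example.

```python
import importlib.util

# Package names taken from the list above; the import names are assumed
# to be identical to the pip names.
PACKAGES = ("mlfoundry", "servicefoundry")


def installed_packages(names=PACKAGES):
    """Map each package name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = installed_packages()
for name, found in status.items():
    hint = "installed" if found else f"missing (pip install {name})"
    print(f"{name}: {hint}")
```

The check only looks the packages up without importing them, so it is safe to run before any credentials or cluster access are configured.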

All of the above components can be installed independently or together, depending on your requirements.