TrueFoundry provides a comprehensive model registry to store and manage models. Models are stored in repositories backed by S3/GCS/Azure Blob Storage/MinIO buckets in
your cloud account. The key functionalities provided by the model registry are:
Upload Model From UI
You can either upload a model file from disk or import a model saved in your S3 bucket.
Log Model In Code
Install the truefoundry library following the instructions here.
You can log models of all frameworks in the registry. Example code for logging scikit-learn and Transformers models is provided below:
Custom Framework | SkLearn | Transformers | PyTorch | TensorFlow

from truefoundry.ml import get_client

client = get_client()
model_version = client.log_model(
    ml_repo="name-of-the-ml-repo",
    name="name-for-the-model",
    model_file_or_folder="path/to/model/file/or/folder/on/disk",
    framework=None,
)

Any file or folder can be saved as a model by passing it in model_file_or_folder, and framework can be set to None.
from truefoundry.ml import get_client, SklearnFramework, infer_signature
import joblib
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
clf = make_pipeline(StandardScaler(), SVC(gamma="auto"))
clf.fit(X, y)
joblib.dump(clf, "sklearn-pipeline.joblib")

client = get_client()
client.create_ml_repo(  # This is only required once
    name="my-classification-project",
    # This controls which bucket is used.
    # You can get this from Integrations > Blob Storage.
    storage_integration_fqn="<storage_integration_fqn>",
)
model_version = client.log_model(
    ml_repo="my-classification-project",
    name="my-sklearn-model",
    description="A simple sklearn pipeline",
    model_file_or_folder="sklearn-pipeline.joblib",
    framework=SklearnFramework(),
    metadata={"accuracy": 0.99, "f1": 0.80},
)
print(model_version.fqn)
from truefoundry.ml import get_client, TransformersFramework
import torch
from transformers import AutoTokenizer, AutoConfig, pipeline, AutoModelForCausalLM

pln = pipeline(
    "text-generation",
    model="EleutherAI/pythia-70m",
    tokenizer="EleutherAI/pythia-70m",
    torch_dtype=torch.float16,
)
pln.model.save_pretrained("my-transformers-model")
pln.tokenizer.save_pretrained("my-transformers-model")

client = get_client()
model_version = client.log_model(
    ml_repo="my-llm-project",
    name="my-transformers-model",
    model_file_or_folder="my-transformers-model/",
    framework=TransformersFramework(pipeline_tag="text-generation"),
)
print(model_version.fqn)
from truefoundry.ml import get_client, PyTorchFramework

client = get_client()
model_version = client.log_model(
    ml_repo="name-of-the-ml-repo",
    name="name-for-the-model",
    model_file_or_folder="path/to/model/file/or/folder/on/disk",
    framework=PyTorchFramework(),
)
from truefoundry.ml import get_client, TensorFlowFramework

client = get_client()
model_version = client.log_model(
    ml_repo="name-of-the-ml-repo",
    name="name-for-the-model",
    model_file_or_folder="path/to/model/file/or/folder/on/disk",
    framework=TensorFlowFramework(),
)
For other frameworks, you can use the following classes in truefoundry.ml: FastAIFramework, GluonFramework, H2OFramework, KerasFramework, LightGBMFramework, ONNXFramework, PaddleFramework, SklearnFramework, SpaCyFramework, StatsModelsFramework, TransformersFramework, XGBoostFramework.
Any subsequent call to log_model with the same name creates a new version of the model: v2, v3, and so on.
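As an illustrative sketch (the repo name, model name, and file paths below are placeholders, and the assumption that the version number appears at the end of the FQN is ours), logging twice under the same name produces successive versions:

from truefoundry.ml import get_client

client = get_client()
# First call under this name creates version 1
first = client.log_model(
    ml_repo="name-of-the-ml-repo",
    name="name-for-the-model",
    model_file_or_folder="path/to/model-v1",
    framework=None,
)
# A later call with the same name creates version 2
second = client.log_model(
    ml_repo="name-of-the-ml-repo",
    name="name-for-the-model",
    model_file_or_folder="path/to/model-v2",
    framework=None,
)
print(first.fqn)
print(second.fqn)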
View and manage versions
The logged model can be found in the Models tab. It can also be accessed from inside the ML Repo.
You can view the details of each model version there.
Once a model version is created, its files are immutable. Only fields like description, framework, and metadata can be updated using the CLI or UI.
Using the Model in your code
For every model version, you can find a code snippet to download it for use in your inference service.
Code Snippet to download model
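The exact snippet is shown in the UI for each model version; as a rough sketch (the FQN below is a placeholder, and the get_model_version_by_fqn/download usage reflects our reading of the SDK), downloading typically looks like:

from truefoundry.ml import get_client

client = get_client()
# The FQN for a version is shown on its details page in the UI
model_version = client.get_model_version_by_fqn(
    "model:<your-fqn-here>"
)
# Download the model version files to a local directory
model_version.download(path="./downloaded-model")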