Log and Get Models
A Model comprises a model file or folder and some metadata. Each Model can have multiple versions. In essence, Models are just Artifacts with the special type model.
Log Model Version
You can automatically save and version model files/folders using the log_model method. The basic usage looks as follows:
from truefoundry.ml import get_client, ModelFramework

client = get_client()
model_version = client.log_model(
    ml_repo="name-of-the-ml-repo",
    name="name-for-the-model",
    model_file_or_folder="path/to/model/file/or/folder/on/disk",
    framework=<None or ModelFramework object>
)
Framework Agnostic
Any file or folder can be saved as a model by passing it in model_file_or_folder, and framework can be set to None.
This is an example of storing an sklearn model. To log the model, we give it a name and pass in the path to the model saved on disk along with the framework.
from truefoundry.ml import get_client, SklearnFramework

import joblib
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
clf = make_pipeline(StandardScaler(), SVC(gamma='auto'))
clf.fit(X, y)
joblib.dump(clf, "sklearn-pipeline.joblib")

client = get_client()
client.create_ml_repo(  # This is only required once
    name="my-classification-project",
    # This controls which bucket is used.
    # You can get this from Integrations > Blob Storage.
    storage_integration_fqn='<storage_integration_fqn>'
)
model_version = client.log_model(
    ml_repo="my-classification-project",
    name="my-sklearn-model",
    description="A simple sklearn pipeline",
    model_file_or_folder="sklearn-pipeline.joblib",
    framework=SklearnFramework(),
    metadata={"accuracy": 0.99, "f1": 0.80},
)
print(model_version.fqn)
from truefoundry.ml import get_client, ModelFramework

import torch
from transformers import pipeline

pln = pipeline(
    "text-generation",
    model="EleutherAI/pythia-70m",
    tokenizer="EleutherAI/pythia-70m",
    torch_dtype=torch.float16
)
pln.model.save_pretrained("my-transformers-model")
pln.tokenizer.save_pretrained("my-transformers-model")

client = get_client()
model_version = client.log_model(
    ml_repo="my-llm-project",
    name="my-transformers-model",
    model_file_or_folder="my-transformers-model/",
    framework=ModelFramework.TRANSFORMERS
)
print(model_version.fqn)
This will create a new model my-sklearn-model under the ml_repo and the first version v1 for my-sklearn-model.
Once created, the model version files are immutable; only fields like description, framework, and metadata can be updated using the CLI or UI.
Once created, a model version has an fqn (fully qualified name) which can be used to retrieve the model later, e.g. model:truefoundry/my-classification-project/my-sklearn-model:1.
Any subsequent calls to log_model with the same name would create a new version of this model: v2, v3, and so on.
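The fqn format shown above is structured text, so it can be taken apart with plain string operations. Below is a minimal, purely illustrative sketch of splitting an fqn into its components; the parse_model_fqn helper is our own invention (not part of the truefoundry SDK), and calling the first path segment the "tenant" is an assumption based on the example fqn:

```python
def parse_model_fqn(fqn: str) -> dict:
    """Split a model fqn like
    'model:truefoundry/my-classification-project/my-sklearn-model:1'
    into its components. Illustrative helper, not part of the SDK."""
    # 'model' prefix before the first colon, version after the last colon
    kind, rest = fqn.split(":", 1)
    path, version = rest.rsplit(":", 1)
    # The middle part is '<tenant>/<ml_repo>/<model-name>' (tenant is assumed)
    tenant, ml_repo, name = path.split("/")
    return {
        "kind": kind,
        "tenant": tenant,
        "ml_repo": ml_repo,
        "name": name,
        "version": int(version),
    }

info = parse_model_fqn("model:truefoundry/my-classification-project/my-sklearn-model:1")
print(info["name"], info["version"])  # my-sklearn-model 1
```

This can be handy when, for example, you want to derive the next version's fqn or group fqns by ml_repo in your own tooling.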
The logged model can be found in the dashboard in the Models
tab under your ml_repo.
You can view the details of each model version there.
Logging Model Version without a Run
It is also possible to log a model without creating a run at all. See MlFoundry.log_model
Get Model Version and Download
You can first get the model version using its fqn and then download the logged files using the download() function. From there, you can access the files at download_info.model_dir
import os
import tempfile

import joblib
from truefoundry.ml import get_client

client = get_client()
model_version = client.get_model_version_by_fqn(
    fqn="model:truefoundry/my-classification-project/my-sklearn-model:1"
)

# Download the model to disk
temp = tempfile.TemporaryDirectory()
download_info = model_version.download(path=temp.name)
print(download_info.model_dir)

# Deserialize and Load
model = joblib.load(
    os.path.join(download_info.model_dir, "sklearn-pipeline.joblib")
)
import tempfile

import torch
from transformers import pipeline
from truefoundry.ml import get_client

client = get_client()
model_version = client.get_model_version_by_fqn(
    fqn="model:truefoundry/my-llm-project/my-transformers-model:1"
)

# Download the model to disk
temp = tempfile.TemporaryDirectory()
download_info = model_version.download(path=temp.name)
print(download_info.model_dir)

# Deserialize and Load
pln = pipeline("text-generation", model=download_info.model_dir, torch_dtype=torch.float16)
FAQs
What are the frameworks supported by the log_model method?
The following values are supported:
- sklearn
- tensorflow
- pytorch
- keras
- xgboost
- lightgbm
- fastai
- h2o
- spacy
- statsmodels
- gluon
- paddle
- transformers
These options are also available as an enum: truefoundry.ml.ModelFramework
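To illustrate how an enum of framework names behaves, here is a stdlib-only sketch; the Framework class below is a stand-in we made up for illustration, not the real truefoundry.ml.ModelFramework:

```python
from enum import Enum

# Illustrative stand-in only: shows how an enum can map the framework
# name strings listed above to named members.
class Framework(str, Enum):
    SKLEARN = "sklearn"
    TENSORFLOW = "tensorflow"
    TRANSFORMERS = "transformers"

# Look up a member from its string value
fw = Framework("sklearn")
print(fw is Framework.SKLEARN)  # True
```

Using an enum instead of raw strings catches typos (e.g. "transfomers") at lookup time rather than silently passing a bad value through.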
Update Model Version
You may want to update fields like description, framework, and metadata on an existing model version.
You can do so with the .update()
call on the Model Version instance.
E.g.
from truefoundry.ml import get_client, SklearnFramework

client = get_client()
model_version = client.get_model_version_by_fqn(
    "model:truefoundry/my-classification-project/my-sklearn-model:1"
)
model_version.description = "This is my updated description"
model_version.metadata = {"accuracy": 0.98, "f1": 0.85}
model_version.framework = SklearnFramework(
    model_filepath="sklearn-pipeline.joblib",
    serialization_format="joblib"
)

# Updates the model fields for the existing model version
model_version.update()