Log and Get Models
Log Models and Download Models
A Model comprises a model file or folder and some metadata. Each Model can have multiple versions. In essence, Models are just Artifacts with the special type model.
Log Model Version
You can automatically save and version model files or folders using the log_model method.
The basic usage looks as follows:
from truefoundry.ml import get_client, ModelFramework

client = get_client()
run = client.create_run(...)
model_version = run.log_model(
    name="name-for-the-model",
    model_file_or_folder="path/to/model/file/or/folder/on/disk",
    framework=None,  # or a ModelFramework member, e.g. ModelFramework.SKLEARN
)
Framework Agnostic
Any file or folder can be saved as a model by passing it in model_file_or_folder, and framework can be set to None.
The following examples log a transformers model and an sklearn model. To log a model, we start a run, give the model a name, and pass in the path to the model saved on disk along with the framework.
# Example: logging a Hugging Face transformers model
from truefoundry.ml import get_client, ModelFramework
import torch
from transformers import pipeline

# Build a text-generation pipeline from a pretrained checkpoint
pln = pipeline(
    "text-generation",
    model="EleutherAI/pythia-70m",
    tokenizer="EleutherAI/pythia-70m",
    torch_dtype=torch.float16,
)
# Save the model and tokenizer into the same local folder
pln.model.save_pretrained("my-transformers-model")
pln.tokenizer.save_pretrained("my-transformers-model")

client = get_client()
run = client.create_run(
    ml_repo="my-llm-project"
)
model_version = run.log_model(
    name="my-transformers-model",
    model_file_or_folder="my-transformers-model/",
    framework=ModelFramework.TRANSFORMERS,
)
print(model_version.fqn)
# Example: logging an sklearn model
from truefoundry.ml import get_client, ModelFramework
import joblib
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Train a small pipeline and serialize it to disk
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])
clf = make_pipeline(StandardScaler(), SVC(gamma='auto'))
clf.fit(X, y)
joblib.dump(clf, "sklearn-pipeline.joblib")

client = get_client()
client.create_ml_repo(  # This is only required once
    ml_repo="my-classification-project",
    # This controls which bucket is used.
    # You can get this from Integrations > Blob Storage. `None` picks the default
    storage_integration_fqn=None,
)
run = client.create_run(
    ml_repo="my-classification-project"
)
model_version = run.log_model(  # You can also directly call client.log_model
    name="my-sklearn-model",
    model_file_or_folder="sklearn-pipeline.joblib",
    framework=ModelFramework.SKLEARN,
    metadata={"accuracy": 0.99, "f1": 0.80},
    step=1,  # step number, useful when using iterative algorithms like SGD
)
print(model_version.fqn)
This will create a new model my-sklearn-model under the ml_repo, along with its first version, v1. Once created, the model version is immutable.
Once created, a model version has an fqn (fully qualified name) which can be used to retrieve the model later, e.g. model:truefoundry/my-classification-project/my-sklearn-model:1
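To make the shape of an fqn concrete, here is a minimal sketch that splits the example string into its parts. The segment names (kind, tenant, ml_repo, name, version) are assumptions read off the example above, and `parse_model_fqn` is a hypothetical helper, not part of the truefoundry SDK.

```python
from typing import NamedTuple


class ModelFqn(NamedTuple):
    # Field names are assumptions inferred from the example fqn.
    kind: str
    tenant: str
    ml_repo: str
    name: str
    version: int


def parse_model_fqn(fqn: str) -> ModelFqn:
    """Split '<kind>:<tenant>/<ml_repo>/<name>:<version>' into its parts."""
    kind, sep, rest = fqn.partition(":")
    if not sep:
        raise ValueError(f"missing ':' after the type prefix in {fqn!r}")
    path, sep, version = rest.rpartition(":")
    if not sep or not version.isdigit():
        raise ValueError(f"missing numeric version suffix in {fqn!r}")
    parts = path.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three '/'-separated segments in {fqn!r}")
    return ModelFqn(kind, parts[0], parts[1], parts[2], int(version))


parsed = parse_model_fqn(
    "model:truefoundry/my-classification-project/my-sklearn-model:1"
)
print(parsed.ml_repo, parsed.name, parsed.version)
```

A helper like this can be handy for logging or for validating an fqn from configuration before passing it to the client.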
Any subsequent calls to log_model with the same name would create a new version of this model: v2, v3, and so on.
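The versioning behavior described here can be pictured with a toy in-memory registry. This is only an illustration of the semantics (same name, next version number, read-only record); `ToyModelRegistry` and `add_version` are hypothetical names and say nothing about how the platform stores models.

```python
from collections import defaultdict
from types import MappingProxyType


class ToyModelRegistry:
    """Toy stand-in: each name maps to an append-only list of versions."""

    def __init__(self):
        self._versions = defaultdict(list)

    def add_version(self, name, metadata=None):
        versions = self._versions[name]
        # MappingProxyType gives a read-only view of the top-level fields.
        record = MappingProxyType(
            {
                "name": name,
                "version": len(versions) + 1,
                "metadata": dict(metadata or {}),
            }
        )
        versions.append(record)
        return record


registry = ToyModelRegistry()
first = registry.add_version("my-sklearn-model", {"accuracy": 0.99})
second = registry.add_version("my-sklearn-model", {"accuracy": 0.995})
other = registry.add_version("another-model")
print(first["version"], second["version"], other["version"])
```

Assigning to a field of a returned record, for example `first["version"] = 5`, raises a `TypeError`, which mirrors the idea that a logged version is not edited in place.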
The logged model can be found in the dashboard in the Models tab under your ml_repo. You can view the details of each model version from there.
Logging Model Version without a Run
It is also possible to log a model without creating a run at all. See MlFoundry.log_model
Get Model Version and Download
You can first get the model version using its fqn and then download the logged model using the download() function. From there, you can access the files at download_info.download_dir
# Example: downloading and loading a transformers model
import tempfile
import torch
from transformers import pipeline
from truefoundry.ml import get_client

client = get_client()
model_version = client.get_model_version_by_fqn(
    fqn="model:truefoundry/my-llm-project/my-transformers-model:1"
)
# Download the model to disk
temp = tempfile.TemporaryDirectory()
download_info = model_version.download(path=temp.name)
print(download_info.model_dir)
# Deserialize and Load
pln = pipeline("text-generation", model=download_info.model_dir, torch_dtype=torch.float16)
# Example: downloading and loading an sklearn model
import os
import tempfile
import joblib
from truefoundry.ml import get_client

client = get_client()
model_version = client.get_model_version_by_fqn(
    fqn="model:truefoundry/my-classification-project/my-sklearn-model:1"
)
# Download the model to disk
temp = tempfile.TemporaryDirectory()
download_info = model_version.download(path=temp.name)
print(download_info.model_dir, download_info.model_filename)
# Deserialize and Load
model = joblib.load(
    os.path.join(download_info.model_dir, download_info.model_filename)
)
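Both download snippets hold a tempfile.TemporaryDirectory object in a variable, and the directory is removed once that object is cleaned up. The sketch below shows the same pattern with a context manager so that loading clearly happens before cleanup. It makes no TrueFoundry call: a locally written pickle file stands in for the downloaded model, and `load_inside_temp_dir` is a hypothetical helper name.

```python
import os
import pickle
import tempfile


def load_inside_temp_dir(payload):
    """Write, then load, a stand-in model file inside a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Stand-in for model_version.download(path=tmp_dir)
        model_path = os.path.join(tmp_dir, "model.pkl")
        with open(model_path, "wb") as f:
            pickle.dump(payload, f)
        # Load while the directory still exists
        with open(model_path, "rb") as f:
            model = pickle.load(f)
    # The directory is gone here, but the loaded object lives on in memory
    return model, os.path.exists(tmp_dir)


model, dir_still_exists = load_inside_temp_dir({"weights": [0.1, 0.2]})
print(model, dir_still_exists)
```

If the files need to outlive the loading step, for example for lazy loading, a regular directory is a safer download target than a temporary one.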
FAQs
What are the frameworks supported by the log_model method?
The following values are supported:
- sklearn
- tensorflow
- pytorch
- keras
- xgboost
- lightgbm
- fastai
- h2o
- spacy
- statsmodels
- gluon
- paddle
- transformers
These options are also available as an enum - truefoundry.ml.ModelFramework
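When the framework name comes from user input or a config file, it can help to check it against the listed values before logging. The following is a minimal sketch under that assumption; the set is copied by hand from the list above, and `check_framework` is a hypothetical helper rather than an SDK function.

```python
# Values copied from the FAQ list above; keep in sync manually.
SUPPORTED_FRAMEWORKS = frozenset(
    {
        "sklearn", "tensorflow", "pytorch", "keras", "xgboost",
        "lightgbm", "fastai", "h2o", "spacy", "statsmodels",
        "gluon", "paddle", "transformers",
    }
)


def check_framework(name):
    """Return a normalized framework name, or None for framework-agnostic."""
    if name is None:
        return None
    key = name.strip().lower()
    if key not in SUPPORTED_FRAMEWORKS:
        raise ValueError(
            f"unsupported framework {name!r}; "
            f"expected one of {sorted(SUPPORTED_FRAMEWORKS)}"
        )
    return key


print(check_framework(" SKLearn "))
print(check_framework(None))
```

Failing early with a clear message is usually cheaper than discovering a misspelled framework name after training has finished.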