module
truefoundry.ml
Global Variables
- TRACKING_HOST_GLOBAL
function
get_client
Initializes and returns the truefoundry client.
Args:
disable_analytics
(bool, optional): Pass True to turn off usage analytics collection. Defaults to False.
Returns:
MlFoundry
: An instance of the MlFoundry client class.
- Get client
class
MlFoundry
MlFoundry.
function
create_data_directory
Creates a DataDirectory to upload files to.
Args:
ml_repo
(str): Name of the ML Repo in which to create the data directory.
name
(str): Name of the DataDirectory to be created.
description
(str): Description of the dataset.
metadata
(Dict[str, Any]): Metadata about the data directory, in dictionary form.
Returns:
DataDirectory
: An instance of the DataDirectory class.
function
create_ml_repo
Creates an ML Repository.
Args:
name
(str, optional): The name of the repository to create. If not given, a name is generated automatically.
storage_integration_fqn
(str): The storage integration FQN to use for saving the experiment's artifacts.
- Create Repository
function
create_run
Initializes a run.
In a machine learning experiment, a run represents a single experiment conducted under a project.
Args:
ml_repo
(str): The name of the project under which the run will be created. ml_repo should only contain alphanumerics (a-z, A-Z, 0-9) or hyphens (-). The user must have ADMIN or WRITE access to this project.
run_name
(Optional[str], optional): The name of the run. If not passed, a randomly generated name is assigned to the run. Under a project, all runs should have a unique name. If the passed run_name is already used under a project, the run_name will be de-duplicated by adding a suffix. The run name should only contain alphanumerics (a-z, A-Z, 0-9) or hyphens (-).
tags
(Optional[Dict[str, Any]], optional): Optional tags to attach to this run. Tags are key-value pairs.
Returns:
MlFoundryRun
: An instance of the MlFoundryRun class, which represents a run.
- Create a run under current user.
- Creating a run using context manager.
- Create a run in a project owned by a different user.
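The naming rule for ml_repo and run_name (alphanumerics or hyphens only) can be checked locally before calling create_run. The following is a minimal sketch of such a check; the helper name is_valid_name is hypothetical and is not part of the library, and the service may apply further rules (for example length limits) that are not documented here.

```python
import re

# Pattern derived from the rule stated above: a-z, A-Z, 0-9, or hyphen only.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` is non-empty and uses only alphanumerics or '-'."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("cat-classifier"))
print(is_valid_name("cat_classifier"))
```

Running the check before creating a run gives an immediate local error for names such as cat_classifier, which contains an underscore.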
function
get_all_runs
Returns the names and IDs of all runs present under a project.
The user must have READ access to the project.
Args:
ml_repo
(str): Name of the project.
Returns:
pd.DataFrame
: A dataframe with two columns: run_id and run_name.
- get all the runs from a ml_repo
function
get_artifact_version
Get the artifact version to download contents or load it in memory.
Args:
ml_repo
(str): ML Repo to which the artifact is logged.
artifact_name
(str): Artifact name.
artifact_type
(str): The type of artifact to fetch (acceptable values: "artifact", "model", "plot", "image").
version
(str | int): Artifact version to fetch (default is the latest version).
Returns:
ArtifactVersion
: An ArtifactVersion instance of the artifact.
function
get_artifact_version_by_fqn
Get the artifact version to download contents
Args:
fqn
(str): Fully qualified name of the artifact version.
Returns:
ArtifactVersion
: An ArtifactVersion instance of the artifact.
function
get_data_directory
Get an existing data directory by name.
Args:
ml_repo
(str): Name of the project that the data directory is part of.
name
(str): Name of the data directory.
Returns:
DataDirectory
: An instance of the DataDirectory class.
function
get_data_directory_by_fqn
Get the DataDirectory by DataDirectory FQN
Args:
fqn
(str): Fully qualified name of the data directory.
Returns:
DataDirectory
: An instance of the DataDirectory class.
function
get_data_directory_by_id
Get the DataDirectory from the DataDirectory ID.
Args:
id
(uuid.UUID): ID of the data directory.
Returns:
DataDirectory
: An instance of the DataDirectory class.
function
get_model_version
Get the model version to download contents or load it in memory
Args:
ml_repo
(str): ML Repo to which the model is logged.
name
(str): Model name.
version
(str | int): Model version to fetch (default is the latest version).
Returns:
ModelVersion
: A ModelVersion instance of the model.
- Sklearn
- Huggingface Transformers
function
get_model_version_by_fqn
Get the model version to download contents or load it in memory
Args:
fqn
(str): Fully qualified name of the model version.
Returns:
ModelVersion
: A ModelVersion instance of the model.
- Sklearn
- Huggingface Transformers
function
get_run_by_fqn
Get an existing run by its fqn.
fqn stands for Fully Qualified Name. A run fqn has the following pattern: tenant_name/ml_repo/run_name
For a run named svm under the project cat-classifier in the truefoundry tenant, the fqn is truefoundry/cat-classifier/svm.
Args:
run_fqn
(str): fqn of an existing run.
Returns:
MlFoundryRun
: An instance of the MlFoundryRun class, which represents a run.
- get run by run fqn
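The tenant_name/ml_repo/run_name pattern described above can be assembled and split with plain string handling. This is a small sketch of that pattern only; the names RunFqn, build_run_fqn, and parse_run_fqn are hypothetical helpers and are not provided by the library.

```python
from typing import NamedTuple


class RunFqn(NamedTuple):
    """The three parts of a run fqn, in the order given by the pattern."""
    tenant_name: str
    ml_repo: str
    run_name: str


def build_run_fqn(tenant_name: str, ml_repo: str, run_name: str) -> str:
    """Join the parts into tenant_name/ml_repo/run_name."""
    return "/".join((tenant_name, ml_repo, run_name))


def parse_run_fqn(fqn: str) -> RunFqn:
    """Split a run fqn into its parts, rejecting anything that is not 3 non-empty segments."""
    parts = fqn.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant_name/ml_repo/run_name, got {fqn!r}")
    return RunFqn(*parts)


print(parse_run_fqn("truefoundry/cat-classifier/svm"))
```

The resulting string is the value that would be passed as run_fqn.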
function
get_run_by_id
Get an existing run by its run_id.
Args:
run_id
(str): run_id or fqn of an existing run.
Returns:
MlFoundryRun
: An instance of the MlFoundryRun class, which represents a run.
- Get run by the run id
function
get_run_by_name
Get an existing run by its run_name.
Args:
ml_repo
(str): Name of the ml_repo that the run is part of.
run_name
(str): Name of the run to fetch.
Returns:
MlFoundryRun
: An instance of the MlFoundryRun class, which represents a run.
- get run by name
function
get_tracking_uri
Get the current tracking URI.
Returns:
The tracking URI.
function
list_artifact_versions
Get all the versions of an artifact to download contents or load them in memory.
Args:
ml_repo
(str): Repository in which the artifact is stored.
name
(str): Name of the artifact whose versions are required.
artifact_type
(ArtifactType): Type of artifact you want, for example model or image.
Returns:
Iterator[ArtifactVersion]
: An iterator that yields non-deleted artifact versions of an artifact under a given ml_repo, sorted in reverse by version number.
function
list_artifact_versions_by_fqn
List versions for a given artifact
Args:
artifact_fqn
: FQN of the artifact to list versions for. An artifact_fqn looks like `{artifact_type}:{org}/{user}/{project}/{artifact_name}` or `{artifact_type}:{user}/{project}/{artifact_name}`.
Returns:
Iterator[ArtifactVersion]
: An iterator that yields non-deleted artifact versions under the given artifact_fqn, sorted in reverse by version number.
ArtifactVersion
: An instance of mlfoundry.ArtifactVersion.
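The two artifact_fqn shapes above differ only in whether an org segment is present. The sketch below splits either shape into named parts; ArtifactFqn and parse_artifact_fqn are hypothetical names, and the parser assumes the colon separates the type from the path and that no version suffix is attached.

```python
from typing import NamedTuple, Optional


class ArtifactFqn(NamedTuple):
    artifact_type: str
    org: Optional[str]  # None for the shorter {user}/{project}/{artifact_name} form
    user: str
    project: str
    artifact_name: str


def parse_artifact_fqn(fqn: str) -> ArtifactFqn:
    """Split an artifact fqn of either documented shape into its parts."""
    artifact_type, sep, path = fqn.partition(":")
    if not sep:
        raise ValueError(f"missing ':' after the artifact type in {fqn!r}")
    # Whitespace around segments is tolerated, since the separator spacing is an assumption.
    parts = [part.strip() for part in path.split("/")]
    if not all(parts) or len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 path segments in {fqn!r}")
    org = parts[0] if len(parts) == 4 else None
    user, project, artifact_name = parts[-3:]
    return ArtifactFqn(artifact_type.strip(), org, user, project, artifact_name)


print(parse_artifact_fqn("model:truefoundry/alice/cat-classifier/svm"))
```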
function
list_data_directories
Get the list of DataDirectory instances in an ml_repo.
Args:
ml_repo
(str): Name of the ML Repository.
max_results
(int): Maximum number of data directories to list.
offset
(int): Number of DataDirectory instances to skip before results are returned.
Returns:
Iterator[DataDirectory]
: An iterator that yields DataDirectory instances.
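The interaction of offset and max_results can be pictured as slicing a sequence: skip offset items, then yield at most max_results. The sketch below illustrates only that semantics on a plain Python iterable; the function name page is hypothetical, and this is not how the service implements pagination.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def page(items: Iterable[T], max_results: int, offset: int = 0) -> List[T]:
    """Skip `offset` items, then return at most `max_results` of what follows."""
    return list(islice(items, offset, offset + max_results))


# Stand-in names for data directories; real calls yield DataDirectory instances.
names = [f"dataset-{i}" for i in range(10)]
print(page(names, max_results=3, offset=4))
```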
function
list_ml_repos
Returns a list of names of ML Repos accessible by the current user.
Returns:
List[str]
: A list of names of ML Repos
function
list_model_versions
Get all the versions of a model to download contents or load them in memory.
Args:
ml_repo
(str): Repository in which the model is stored.
name
(str): Name of the model whose versions are required.
Returns:
Iterator[ModelVersion]
: An iterator that yields non-deleted model versions of a model under a given ml_repo, sorted in reverse by version number.
function
list_model_versions_by_fqn
List versions for a given model
Args:
model_fqn
: FQN of the model to list versions for. A model_fqn looks like `model:{org}/{user}/{project}/{artifact_name}` or `model:{user}/{project}/{artifact_name}`.
Returns:
Iterator[ModelVersion]
: An iterator that yields non-deleted model versions under the given model_fqn, sorted in reverse by version number.
ModelVersion
: An instance of mlfoundry.ModelVersion.
function
log_artifact
Logs an artifact for the current ml_repo.
An artifact is a list of local files and directories. This function packs the files and directories listed in artifact_paths and uploads them to the remote storage linked to the ml_repo.
Args:
ml_repo
(str): Name of the ML Repo to which an artifact is to be logged.
name
(str): Name of the artifact. If an artifact with this name already exists under the current ml_repo, the logged artifact will be added as a new version under that name. If no artifact exists with the given name, the given artifact will be logged as version 1.
artifact_paths
(List[truefoundry.ml.ArtifactPath], optional): A list of pairs of (source path, destination path) to add files and folders to the artifact version contents. The first member of the pair should be a file or directory path and the second member should be the path inside the artifact contents to upload to.
description
(Optional[str], optional): Arbitrary text up to 1024 characters to store as a description. This field can be updated at any time after logging. Defaults to None.
metadata
(Optional[Dict[str, Any]], optional): Arbitrary JSON-serializable dictionary to store metadata. For example, you can use this to store metrics, params, and notes. This field can be updated at any time after logging. Defaults to None.
Returns:
truefoundry.ml.ArtifactVersion
: An instance of ArtifactVersion that can be used to download the files or update attributes like description and metadata.
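The two constraints stated for description (up to 1024 characters) and metadata (JSON serializable) can be verified locally before logging. The following is a sketch under those stated limits; check_description_and_metadata is a hypothetical helper, and the library's own validation and error messages may differ.

```python
import json
from typing import Any, Dict, Optional

MAX_DESCRIPTION_LENGTH = 1024  # limit taken from the description argument above


def check_description_and_metadata(
    description: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> None:
    """Raise ValueError if either field violates the documented constraints."""
    if description is not None and len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError(
            f"description has {len(description)} characters, limit is {MAX_DESCRIPTION_LENGTH}"
        )
    if metadata is not None:
        try:
            json.dumps(metadata)  # fails for sets, arbitrary objects, circular references
        except (TypeError, ValueError) as exc:
            raise ValueError("metadata must be JSON serializable") from exc


check_description_and_metadata("baseline features", {"rows": 1200, "notes": "v1"})
```

The same check applies unchanged to the description and metadata arguments of log_model.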
function
log_model
Serialize and log a versioned model under the current ml_repo. Each logged model generates a new version associated with the given name and linked to the current run. Multiple versions of the model can be logged as separate versions under the same name.
Args:
ml_repo
(str): Name of the ML Repo to which the model is to be logged.
name
(str): Name of the model. If a model with this name already exists under the current ML Repo, the logged model will be added as a new version under that name. If no models exist with the given name, the given model will be logged as version 1.
model_file_or_folder
(str): Path to either a single file or a folder containing model files. This folder is usually created using the serialization methods of libraries or frameworks, e.g. joblib.dump, model.save_pretrained(...), torch.save(...), model.save(...).
framework
(Union[enums.ModelFramework, str]): Model framework, e.g. pytorch, sklearn, tensorflow. The full list of supported frameworks can be found in mlfoundry.enums.ModelFramework. Can also be None when model is None.
description
(Optional[str], optional): Arbitrary text up to 1024 characters to store as a description. This field can be updated at any time after logging. Defaults to None.
metadata
(Optional[Dict[str, Any]], optional): Arbitrary JSON-serializable dictionary to store metadata. For example, you can use this to store metrics, params, and notes. This field can be updated at any time after logging. Defaults to None.
Returns:
truefoundry.ml.ModelVersion
: An instance of ModelVersion that can be used to download the files, load the model, or update attributes like description, metadata, and schema.
- Sklearn
- Huggingface Transformers
class
ArtifactVersion
Represents a version of an artifact in MLFoundry.
Properties
fqn
(str): Fully qualified name of the artifact version
name
(str): Name of the artifact
version
(int): Version number of the artifact
description
(str): Description of the artifact version
metadata
(Dict[str, Any]): Metadata associated with the artifact version
created_at
(datetime): Timestamp when the artifact version was created
updated_at
(datetime): Timestamp when the artifact version was last updated
Methods
download(path: str) -> DownloadInfo
: Download the artifact to a local path
load() -> Any
: Load the artifact into memory (for supported types)
class
ModelVersion
Represents a version of a model in MLFoundry.
Properties
fqn
(str): Fully qualified name of the model version
name
(str): Name of the model
version
(int): Version number of the model
description
(str): Description of the model version
metadata
(Dict[str, Any]): Metadata associated with the model version
framework
(str): Framework used for the model (e.g., sklearn, pytorch)
created_at
(datetime): Timestamp when the model version was created
updated_at
(datetime): Timestamp when the model version was last updated
Methods
download(path: str) -> DownloadInfo
: Download the model to a local path
load() -> Any
: Load the model into memory (for supported frameworks)
class
DataDirectory
Represents a data directory in MLFoundry for storing files and datasets.
Properties
fqn
(str): Fully qualified name of the data directory
name
(str): Name of the data directory
description
(str): Description of the data directory
metadata
(Dict[str, Any]): Metadata associated with the data directory
created_at
(datetime): Timestamp when the data directory was created
updated_at
(datetime): Timestamp when the data directory was last updated
Methods
add_files(artifact_paths: List[DataDirectoryPath])
: Add files to the data directory
list_files() -> Iterator[DataDirectoryFile]
: List all files in the data directory
download(path: str) -> DownloadInfo
: Download the data directory to a local path
class
MlFoundryRun
Represents a run in MLFoundry for tracking experiments.
Properties
fqn
(str): Fully qualified name of the run
name
(str): Name of the run
run_id
(str): Unique identifier for the run
status
(str): Current status of the run (RUNNING, FINISHED, FAILED)
start_time
(datetime): Timestamp when the run started
end_time
(datetime): Timestamp when the run ended (if finished)
tags
(Dict[str, Any]): Tags associated with the run
Methods
log_metrics(metric_dict: Dict[str, float])
: Log metrics for the run
log_params(params_dict: Dict[str, Any])
: Log parameters for the run
log_artifact(name: str, artifact_paths: List[ArtifactPath])
: Log an artifact
log_model(name: str, model_file_or_folder: str, framework: str)
: Log a model
end()
: Mark the run as finished
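A typical sequence with the methods listed above is: create a run, log parameters and metrics, then end the run. The sketch below walks through that sequence against small stand-in classes so it runs without a TrueFoundry login. StubRun and StubClient are hypothetical and only mimic the method names documented here; that leaving the with block ends the run is an assumption of the stub, not something this reference states.

```python
from typing import Any, Dict, List, Optional, Tuple


class StubRun:
    """Stand-in for MlFoundryRun: records calls instead of contacting a server."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = "RUNNING"
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def log_params(self, params_dict: Dict[str, Any]) -> None:
        self.calls.append(("params", dict(params_dict)))

    def log_metrics(self, metric_dict: Dict[str, float]) -> None:
        self.calls.append(("metrics", dict(metric_dict)))

    def end(self) -> None:
        self.status = "FINISHED"

    def __enter__(self) -> "StubRun":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.end()  # assumption: the context manager ends the run on exit
        return False  # do not swallow exceptions


class StubClient:
    """Stand-in for the client returned by get_client."""

    def create_run(self, ml_repo: str, run_name: Optional[str] = None,
                   tags: Optional[Dict[str, Any]] = None) -> StubRun:
        return StubRun(run_name or "generated-name")


def log_training_run(client: Any, ml_repo: str, run_name: str) -> Any:
    """Create a run, log one set of params and metrics, and return the run."""
    with client.create_run(ml_repo=ml_repo, run_name=run_name) as run:
        run.log_params({"learning_rate": 0.01, "epochs": 5})
        run.log_metrics({"accuracy": 0.9})
    return run


finished = log_training_run(StubClient(), "cat-classifier", "svm")
print(finished.status, finished.calls)
```

Because log_training_run only relies on the method names, the stub client can in principle be swapped for a real one once credentials are configured.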
function
search_runs
The user must have READ access to the project. Returns an iterator that yields one MlFoundryRun on each next call. All runs under a project that match the filter string and the run_view_type are returned.
Args:
ml_repo
(str): Name of the project.
filter_string
(str, optional): Filter query string; defaults to searching all runs. An identifier is required on the left-hand side of a search expression and signifies the entity to compare against. An identifier has two parts separated by a period: the type of the entity and the name of the entity. The type of the entity is metrics, params, attributes, or tags. The entity name can contain alphanumeric characters and special characters. You can search using two run attributes: status and artifact_uri. Both attributes have string values. When a metric, parameter, or tag name contains a special character such as a hyphen, space, or period, enclose the entity name in double quotes or backticks, for example params."model-type" or params.`model-type`.
run_view_type
(str, optional): One of "ACTIVE_ONLY", "DELETED_ONLY", or "ALL".
order_by
(List[str], optional): List of columns to order by (e.g., "metrics.rmse"). Currently supported values are metric.key, parameter.key, tag.key, and attribute.key. Each order_by column can carry an optional DESC or ASC value; the default is ASC. The default ordering is start_time DESC.
job_run_name
(str): Name of the job run whose associated runs should be returned.
max_results
(int): Maximum number of runs to yield.
Returns:
Iterator[MlFoundryRun]
: An iterator that yields MlFoundryRun instances matching the search query.
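The identifier and ordering rules above can be applied mechanically when building search arguments. The sketch below quotes an entity name only when it contains special characters and appends ASC or DESC to an order_by column. The helper names are hypothetical, treating underscores as plain characters is an assumption, and the comparison part of a filter expression is left out because its syntax is not described in this reference.

```python
import re

ENTITY_TYPES = ("metrics", "params", "attributes", "tags")

# Assumption: letters, digits, and underscores need no quoting.
_PLAIN_NAME = re.compile(r"[A-Za-z0-9_]+")


def identifier(entity_type: str, name: str) -> str:
    """Build `<type>.<name>`, double-quoting names that contain special characters."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"entity type must be one of {ENTITY_TYPES}")
    if '"' in name:
        raise ValueError("names containing a double quote are not handled by this sketch")
    if _PLAIN_NAME.fullmatch(name):
        return f"{entity_type}.{name}"
    return f'{entity_type}."{name}"'


def order_by_clause(column: str, descending: bool = False) -> str:
    """Append the optional ASC/DESC value to an order_by column."""
    return f"{column} {'DESC' if descending else 'ASC'}"


print(identifier("params", "model-type"))
print(order_by_clause(identifier("metrics", "rmse"), descending=True))
```

A list such as [order_by_clause("metrics.rmse", descending=True)] would then be passed as order_by.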