Data Directory

module dataset.py


class DataDirectory

property created_at

Get the time at which DataDirectory was created


property created_by

Get the information about who created the DataDirectory


property description

Get description of the DataDirectory


property fqn

Get fqn of the DataDirectory


property metadata

Get metadata for the current DataDirectory


property name

Get the name of the DataDirectory


property storage_root

Get storage_root of the DataDirectory


property updated_at

Get the information about the when the DataDirectory was updated


function add_files

Logs File in the DataDirectory.

Args:

  • file_paths (List[mlfoundry.DataDirectoryPath], optional): A list of pairs of (source path, destination path) to add files and folders to the DataDirectory contents. The first member of the pair should be a file or directory path and the second member should be the path inside the artifact contents to upload to.
         E.g. >>> data_directory.add_files(
              ...     file_paths=[
                         mlfoundry.DataDirectoryPath("foo.txt", "foo/bar/foo.txt"),
                         mlfoundry.DataDirectoryPath("tokenizer/", "foo/tokenizer/"),
                         mlfoundry.DataDirectoryPath('bar.text'),
                         ('bar.txt', ),
                         ('foo.txt', 'a/foo.txt')
                      ]
              ... )
         would result in
         .
         └── foo/
             ├── bar/
             │   └── foo.txt
             └── tokenizer/
                 └── # contents of tokenizer/ directory will be uploaded here

Examples:

import os
import mlfoundry

with open("artifact.txt", "w") as f:
     f.write("hello-world")

client = mlfoundry.get_client()
data_directory = client.get_data_directory_by_fqn(fqn="<data_directory-fqn>")

data_directory.add_files(
     file_paths=[mlfoundry.DataDirectoryPath('artifact.txt', 'a/b/')]
)


function download

Download an file or directory to a local directory if applicable, and return a local path for it.

Args:

  • download_path: Relative source path to the desired files.
  • path: Absolute path of the local filesystem destination directory to which to download the specified files. This directory must already exist. If unspecified, the files will either be downloaded to a new uniquely-named directory.
  • overwrite: if to overwrite the files at/inside dst_path if they exist

Returns:

  • str: Absolute path of the local filesystem location containing the desired files or folder.

Examples:

import mlfoundry

client = mlfoundry.get_client()
data_directory = client.get_data_directory_by_fqn(fqn="<your-artifact-fqn>")
data_directory.download(download_path="<your-desired-download-path>")


classmethod from_fqn

Get the DataDirectory to download contents or load them in memory

Args:

  • fqn (str): Fully qualified name of the data directory.

Returns:

  • DataDirectory: An DataDirectory instance of the artifact

Examples:

import mlfoundry

client = mlfoundry.get_client()
data_directory = mlfoundry.DataDirectory.from_fqn(fqn="<data_directory-fqn>")


function list_files

List all the files and folders in the data_directory.

Args:

  • path: The path of directory in data_directory, from which the files are to be listed.

Returns:

  • str: Absolute path of the local filesystem location containing the desired files or folder.

Examples:

import mlfoundry

client = mlfoundry.get_client()
data_directory = client.get_data_directory_by_fqn(fqn="<your-artifact-fqn>")
files = data_directory.list_files()
for file in files:
     print(file.path)


function update

Updates the current instance of the DataDirectory.

Examples:

import mlfoundry

client = mlfoundry.get_client()
data_directory = client.get_data_directory_by_fqn(fqn="<your-data_directory-fqn>")
data_directory.description = 'This is the new description'
data_directory.update()


class DataDirectoryPath

DataDirectoryPath(src, dest)