Truefoundry Docs

In this guide, we’ll deploy a Job to train a machine learning model. The model will learn to predict the species of an iris flower based on its sepal length, sepal width, petal length, and petal width. There are three species: Iris setosa, Iris versicolor, and Iris virginica.

The model takes four inputs (sepal length, sepal width, petal length, and petal width) and outputs a confidence score for each species in the following format:

{
  "predictions": [
    {
      "label": "setosa",
      "score": 2.1184377821359003e-16
    },
    {
      "label": "versicolor",
      "score": 3.264647319023382e-9
    },
    {
      "label": "virginica",
      "score": 0.9999999967353526
    }
  ]
}
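As an illustration of how this output might be consumed, the following minimal sketch parses a response in the format above and picks the species with the highest score. The variable `response_text` and the helper `top_prediction` are hypothetical names used only for this example; they are not part of the example repository.

```python
import json

# Sample response in the format shown above (scores copied from the example).
response_text = """
{
  "predictions": [
    {"label": "setosa", "score": 2.1184377821359003e-16},
    {"label": "versicolor", "score": 3.264647319023382e-9},
    {"label": "virginica", "score": 0.9999999967353526}
  ]
}
"""


def top_prediction(payload: dict) -> dict:
    """Return the prediction entry with the highest confidence score."""
    return max(payload["predictions"], key=lambda p: p["score"])


payload = json.loads(response_text)
best = top_prediction(payload)
print(best["label"], best["score"])
```

The scores behave like probabilities, so the entry with the largest score is the predicted species.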

We’ve already created a training script that trains a model on the Iris dataset, and you can find the code in our GitHub repository. Please visit the repository to familiarise yourself with the code you’ll be deploying. The project files are organized as follows:

Directory Structure

.
├── train.py - Contains the training script code.
└── requirements.txt - Contains the list of all dependencies.

Getting Started With Deployment

To deploy a job, you’ll need a workspace. If you don’t have one, you can create it by following the Creating a Workspace guide, or ask your cluster administrator for help if you don’t have permission to create a workspace.

In TrueFoundry, you can deploy code either from your GitHub repository or, if the code has not been pushed to GitHub, from your local machine.

Deploy from Github
Deploy from Local Machine

In the above walkthrough, we did the following steps:

Selected a workspace to deploy the job to. The workspace determines which cluster and environment the job is deployed to.
Selected the Job option, since we are deploying a job.
Chose the GitHub option, since the code is already pushed to a GitHub repository.

The key fields that we need to fill in on the deployment form are:

Repo Url: This is the URL of the Github repository that contains the code for the job. For this example, the repo url is https://github.com/truefoundry/getting-started-examples
Path to build context: This is the path to the directory in the Github repository that contains the code for the job. For this example, the path to the build context is ./train-model/
Command: This is the command to run the job. For this example, the command is python train.py

After filling in the form, we can press the Submit button to deploy the job.

View your deployed job

Congratulations! You’ve successfully deployed your job. After you click Submit, the deployment completes in a few seconds and the job is displayed as Suspended, which indicates that it is deployed but not running. You can view all the information about your job by following the steps below:

Configure Job Trigger

Manual Trigger
Schedule Trigger

You can trigger your job manually by clicking the Run Job button on the dashboard, or programmatically. You can find more details about how to trigger your job programmatically in the dashboard itself by selecting Run Job Programatically.

Terminate your job

To terminate your job, you can click the Terminate button on the dashboard.

FAQ

How to omit certain files from being built and deployed?

Your repository may contain files that you don’t want to package in the Docker image, such as test files. To exclude specific files from being built and deployed, create a .tfyignore file in the root directory of your project. The .tfyignore file follows the same rules as the .gitignore file.

If your repository already has a .gitignore file, you don’t need to create a .tfyignore file. TrueFoundry will automatically detect the files to ignore.

Where is the Docker image built?

If you are deploying from a GitHub repository, the Docker image is built on the TrueFoundry control plane and pushed to your configured container registry. If you are deploying code from your local machine, the TrueFoundry SDK first checks whether Docker is installed in your local environment. If it is, the SDK tries to build the image with the local Docker and falls back to the remote builder on the TrueFoundry control plane if that build fails. If Docker is not installed, the SDK uses the remote builder on the TrueFoundry control plane directly.
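The decision described above can be summarised as a small function. This is only a sketch of the logic as stated in this answer, not the SDK's actual code; the function name `choose_builder`, its parameters, and the returned strings are hypothetical.

```python
def choose_builder(
    source: str,
    docker_installed: bool = False,
    local_build_succeeded: bool = False,
) -> str:
    """Return where the image ends up being built: 'local' or 'remote'.

    Hypothetical helper that mirrors the prose description only.
    """
    if source == "github":
        # Code in a GitHub repository is always built on the control plane.
        return "remote"
    if source == "local":
        # A local build is used only if Docker exists and the build works;
        # otherwise the remote builder takes over.
        if docker_installed and local_build_succeeded:
            return "local"
        return "remote"
    raise ValueError(f"unknown source: {source!r}")


print(choose_builder("github"))
print(choose_builder("local", docker_installed=True, local_build_succeeded=True))
print(choose_builder("local", docker_installed=True, local_build_succeeded=False))
print(choose_builder("local", docker_installed=False))
```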

How can I run the job at a regular interval?

You need to use the cron string format to specify the job schedule. A cron expression consists of five fields that together describe when the job should run.

* * * * *
| | | | |
| | | | |___ day of week (0-6) (Sunday is 0)
| | | |_____ month (1-12)
| | |_______ day of month (1-31)
| |_________ hour (0-23)
|___________ minute (0-59)

For example, 0 11 1 * * represents “first day of every month at 11:00 AM”, and 30 9 * * 1 represents “every Monday at 9:30 AM”. You can use a site like https://crontab.guru/ to try out a cron expression and get a human-readable description of it.
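To make the five fields concrete, here is a minimal sketch that checks whether a given moment matches a cron expression. It is a simplification for illustration: it assumes each field is either `*` or a single integer (no ranges, lists, or steps), and the helper names `cron_matches` and `_field_matches` are hypothetical. It is not how the platform evaluates schedules.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field that is either '*' or a single integer."""
    return field == "*" or int(field) == value


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` falls on the schedule given by `expression`."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("a cron expression must have exactly five fields")
    minute, hour, day_of_month, month, day_of_week = fields

    # Python counts Monday as 0, while cron counts Sunday as 0.
    cron_weekday = (moment.weekday() + 1) % 7

    if not (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(month, moment.month)
    ):
        return False

    dom_ok = _field_matches(day_of_month, moment.day)
    dow_ok = _field_matches(day_of_week, cron_weekday)
    # In classic cron, when both day fields are restricted, either may match.
    if day_of_month != "*" and day_of_week != "*":
        return dom_ok or dow_ok
    return dom_ok and dow_ok


# 1 March 2024 at 11:00 is the first day of a month.
print(cron_matches("0 11 1 * *", datetime(2024, 3, 1, 11, 0)))
# 4 March 2024 is a Monday.
print(cron_matches("30 9 * * 1", datetime(2024, 3, 4, 9, 30)))
```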

What if a job is still running until the next scheduled run?

Suppose you schedule a daily job that runs at midnight to process and store some data in an S3 bucket, but for some reason it is still running at the next midnight. The previous run has not completed, yet the schedule says it is time for the job to run again. What should happen in this case? Should we:

Skip the new run and continue the current run
Stop the ongoing run and start the new one, or
Run both in parallel

On TrueFoundry, you can choose the behaviour that suits your use case with a setting called the concurrency policy. The possible options are:

Forbid: This is the default. Do not allow concurrent runs.
Allow: Allow jobs to run concurrently.
Replace: Replace the current job with the new one.

The concurrency policy doesn’t apply to manually triggered jobs. A manual trigger always creates a new job run.
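The three policies can be read as a simple decision rule for each scheduled trigger. The sketch below only models the behaviour described above; the enum `ConcurrencyPolicy`, the function `on_scheduled_trigger`, and the returned action strings are hypothetical names for illustration, not platform APIs.

```python
from enum import Enum


class ConcurrencyPolicy(Enum):
    FORBID = "Forbid"    # default: do not allow concurrent runs
    ALLOW = "Allow"      # let runs overlap
    REPLACE = "Replace"  # stop the current run and start the new one


def on_scheduled_trigger(policy: ConcurrencyPolicy, previous_run_active: bool) -> str:
    """Describe what happens when the schedule fires (illustrative only)."""
    if not previous_run_active:
        # Nothing is running, so every policy simply starts the new run.
        return "start_new_run"
    if policy is ConcurrencyPolicy.FORBID:
        return "skip_new_run"
    if policy is ConcurrencyPolicy.ALLOW:
        return "run_in_parallel"
    return "replace_running_run"


for policy in ConcurrencyPolicy:
    print(policy.value, "->", on_scheduled_trigger(policy, previous_run_active=True))
```

The policy only matters when a previous run is still active; otherwise all three options behave the same way.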
