Getting Started
In this guide, we’ll deploy a Job to train a machine learning model. The model will learn to predict the species of an iris flower based on its sepal length, sepal width, petal length, and petal width. There are three species: Iris setosa, Iris versicolor, and Iris virginica.
The model has four inputs (sepal length, sepal width, petal length, and petal width) and outputs the confidence scores for each species in the following format:
We’ve already created a training script that trains a model on the Iris dataset; you can find the code in our GitHub repository.
Please visit the repository to familiarise yourself with the code you’ll be deploying. The project files are organized as follows:
Getting Started With Deployment
To deploy a job, you’ll need a workspace. If you don’t have one, you can create it by following the Creating a Workspace guide, or ask your cluster administrator for help if you don’t have permission to create a workspace.
In TrueFoundry, you can deploy code either from your GitHub repository or, if the code has not been pushed to GitHub, from your local machine.
In the above walkthrough, we did the following steps:
- Select a workspace to deploy the job. The workspace determines which cluster and environment the job is deployed to.
- Select the Job option, since we are deploying a job.
- Select the GitHub option, since the code is already pushed to a GitHub repository.
The key fields that we need to fill in on the deployment form are:
Repo Url
: This is the URL of the GitHub repository that contains the code for the job. For this example, the repo URL is https://github.com/truefoundry/getting-started-examples
Path to build context
: This is the path to the directory in the GitHub repository that contains the code for the job. For this example, the path to the build context is ./train-model/
Command
: This is the command to run the job. For this example, the command is python train.py
After filling in the form, press the Submit button to deploy the job.
Clone the GitHub repository and navigate to the train-model directory.
To deploy from your local machine, you can follow the steps on the UI to get the deployment script.
Once you reach the last step, you will be able to download a deploy.py script that contains the configuration for the deployment.
The deploy.py should be placed in the root of your project.
The deploy.py has a field called project_root_path in the build_source section.
The project_root_path is considered relative to where the command python deploy.py is executed from.
The build_context_path is relative to the project_root_path.
All the code in the build_context_path will be packaged into a Docker image and deployed.
It’s usually recommended to use "./" as the project_root_path, put the deploy.py in that path, and execute the command python deploy.py from the project_root_path directory.
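The path rules above can be illustrated with a small sketch. This is not TrueFoundry code: resolve_build_context is a hypothetical helper and the /home/user paths are made up for the example. It only shows how a build context would resolve if project_root_path is joined to the directory where python deploy.py is run, and build_context_path is then joined to that result.

```python
import posixpath


def resolve_build_context(cwd, project_root_path, build_context_path):
    """Return the directory that would be packaged, under the rules above.

    cwd is the directory where `python deploy.py` is executed.
    project_root_path is taken relative to cwd, and
    build_context_path is taken relative to project_root_path.
    """
    project_root = posixpath.normpath(posixpath.join(cwd, project_root_path))
    return posixpath.normpath(posixpath.join(project_root, build_context_path))


# Recommended setup: run from the project root with project_root_path="./".
print(resolve_build_context("/home/user/train-model", "./", "./"))

# Running the same deploy.py from the parent directory changes what "./" means,
# so a different directory would be packaged.
print(resolve_build_context("/home/user", "./", "./"))
```

Because the first path depends on the current working directory, running the script from the project root keeps the result predictable.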
The directory structure will then appear as follows:
To deploy, execute the command:
An explanation of the deploy.py file
You can find more details about the deploy.py file in the Deploy Job Programatically section. Here’s a brief explanation of the file:
After running the command mentioned above, wait for the deployment process to complete. Monitor the status until it shows "DEPLOY_SUCCESS:", indicating a successful deployment.
Once deployed, you’ll receive a dashboard access link in the output, typically after the text "You can find the application on the dashboard:". Click this link to access the deployment dashboard.
View your deployed job
Congratulations! You’ve successfully deployed your job.
Once you click Submit, your job will be deployed within a few seconds and displayed as Suspended, indicating that it’s successfully deployed but not running.
You can view all the information about your job following the steps below:
Run your job
To run your job, you have to trigger it manually. You can trigger it by clicking the Run button on the dashboard or programmatically.
Run as CLI
FAQ
How to omit certain files from being built and deployed?
Your repository may contain files that you don’t want to package in the Docker image, such as test files.
To exclude specific files from being built and deployed, create a .tfyignore file in the root directory of your project. The .tfyignore file follows the same rules as the .gitignore file.
If your repository already has a .gitignore file, you don’t need to create a .tfyignore file. TrueFoundry will automatically detect the files to ignore.
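As a rough illustration of the behaviour described above, the sketch below picks which ignore file would apply to a project. The helper pick_ignore_file is hypothetical, and the precedence it uses (.tfyignore first, then .gitignore) is an assumption made for this example; the text above does not say how the two files interact when both exist.

```python
import tempfile
from pathlib import Path
from typing import Optional


def pick_ignore_file(project_root: str) -> Optional[Path]:
    """Return the ignore file that would apply, or None if there is none.

    Assumption for this sketch: a .tfyignore takes precedence over
    a .gitignore when both are present.
    """
    root = Path(project_root)
    for name in (".tfyignore", ".gitignore"):
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # A project that only has a .gitignore; its patterns use gitignore syntax.
    (Path(tmp) / ".gitignore").write_text("tests/\n*.pyc\n")
    print(pick_ignore_file(tmp).name)

    # Adding a .tfyignore with the same syntax changes the selection.
    (Path(tmp) / ".tfyignore").write_text("tests/\n")
    print(pick_ignore_file(tmp).name)
```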
Where is the docker image built?
If you are deploying from a GitHub repository, the Docker image is built on the TrueFoundry control plane and pushed to your configured container registry.
If you are deploying code from your local machine, the TrueFoundry SDK first checks if Docker is installed in your local environment. If it is, the SDK tries to build the image with the local Docker installation and falls back to the remote builder on the TrueFoundry control plane if that build fails. If Docker is not installed, the SDK uses the remote builder directly.
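The fallback order described above can be sketched as follows. This is not the SDK’s actual implementation: build_image, local_build, and remote_build are hypothetical names used only to show the decision logic, and the check for Docker is approximated with shutil.which.

```python
import shutil


def build_image(local_build, remote_build, which=shutil.which):
    """Try a local Docker build first and fall back to the remote builder.

    local_build and remote_build are callables supplied by the caller;
    which is injectable so the logic can be exercised without Docker.
    """
    if which("docker") is None:
        # Docker is not installed: go straight to the remote builder.
        return remote_build()
    try:
        return local_build()
    except Exception:
        # The local build failed (for example, the Docker daemon is not running).
        return remote_build()


# Stand-in build functions, for illustration only.
print(build_image(lambda: "built locally", lambda: "built remotely"))
```

The printed result depends on whether a docker executable is found on the machine running the sketch.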
How can I run the job at a regular interval?
You can schedule your job to run at a specific time using the Schedule Job feature.