Training and logging your job metadata
Agenda
In this guide we will:
- Train a scikit-learn model
- Log the model and model-metadata via mlfoundry
- Deploy the training code as a job via servicefoundry
Prerequisites
Before we start, we will need:
- A Workspace FQN - We can use an existing workspace or create one from the Workspaces page. Copy and note down the workspace FQN.
- A Secret FQN - Since we are pushing our model to the Truefoundry Model Registry, we will need to add our Truefoundry API Key as a Secret:
  1. Create and copy an API Key from the Settings page.
  2. Visit the Secrets dashboard and create a new Secret Group.
  3. Create a new Secret in this Secret Group and paste the API Key from Step 1.
  4. Once saved, note down the Secret FQN by clicking the Copy button beside the value. It will look like the following:
     <username>:<secret-group-name>:<secret-name>
     (e.g. user:iris-train-job:MLF_API_KEY)
NOTE: A workspace is a resource-bound (CPU, memory) environment where we deploy jobs and services.
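The Secret FQN format above is easy to mistype, so it can help to check it before using it anywhere else. The following is a minimal sketch: the helper `parse_secret_fqn` and the `SecretFQN` type are hypothetical names, not part of any Truefoundry library, and the check assumes only the three-part, colon-separated format shown above.

```python
from typing import NamedTuple


class SecretFQN(NamedTuple):
    """The three parts of a Secret FQN (hypothetical helper type)."""
    username: str
    secret_group: str
    secret_name: str


def parse_secret_fqn(fqn: str) -> SecretFQN:
    """Split '<username>:<secret-group-name>:<secret-name>' into its parts."""
    parts = fqn.strip().split(":")
    # Exactly three non-empty parts are expected.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected <username>:<secret-group-name>:<secret-name>, got {fqn!r}"
        )
    return SecretFQN(*parts)


if __name__ == "__main__":
    print(parse_secret_fqn("user:iris-train-job:MLF_API_KEY"))
```

A malformed value, such as one with a missing secret name, raises `ValueError` instead of failing later during deployment.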
File Structure
We need to create the following files for this guide:
- train.py: contains our training code
- requirements.txt: contains our dependencies
- deploy.py: contains our deployment code
The final file structure will look like this:
.
├── train.py
├── requirements.txt
└── deploy.py
Training Code
requirements.txt
The file contains our dependencies.
pandas
numpy
scikit-learn
pickle-mixin
# for experiment tracking and model registry
mlfoundry
# for deploying our job
servicefoundry
train.py
This file fetches the data, trains the model, and pushes it to the model registry.
The accompanying recipe walks through train.py step by step.
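As a rough illustration of those three steps, a train.py could look like the sketch below. It is not the guide's original recipe. The Iris dataset, the `LogisticRegression` model, and the names `PROJECT_NAME` and `MODEL_NAME` are assumptions chosen for illustration. The mlfoundry calls (`get_client`, `create_run`, `log_params`, `log_metrics`, `log_model`, `end`) reflect our understanding of the client and should be checked against the installed version. The sketch also assumes the API key reaches the job as the `MLF_API_KEY` environment variable, as the secret name in the prerequisites suggests.

```python
"""train.py (illustrative sketch): train a model and log it via mlfoundry."""
import os

PROJECT_NAME = "iris-train-job"  # hypothetical project name
MODEL_NAME = "iris-classifier"   # hypothetical model name


def train_model(test_size=0.2, random_state=42):
    """Fetch the data, fit a classifier, and return (model, params, metrics)."""
    # Imported here so scikit-learn is loaded only when training actually runs.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    params = {"max_iter": 1000, "test_size": test_size, "random_state": random_state}
    model = LogisticRegression(max_iter=params["max_iter"])
    model.fit(X_train, y_train)
    metrics = {"accuracy": float(accuracy_score(y_test, model.predict(X_test)))}
    return model, params, metrics


def log_run(run, model, params, metrics, model_name=MODEL_NAME):
    """Log hyperparameters, metrics, and the model on an mlfoundry run."""
    run.log_params(params)
    run.log_metrics(metrics)
    run.log_model(name=model_name, model=model, framework="sklearn")


def main():
    # Assumption: the job receives the API key as the MLF_API_KEY variable.
    # Without it nothing can be logged, so stop before training.
    if not os.environ.get("MLF_API_KEY"):
        print("MLF_API_KEY is not set; skipping training and logging.")
        return
    import mlfoundry

    model, params, metrics = train_model()
    client = mlfoundry.get_client()
    run = client.create_run(project_name=PROJECT_NAME)
    try:
        log_run(run, model, params, metrics)
    finally:
        run.end()  # close the run even if logging fails


if __name__ == "__main__":
    main()
```

Passing the run object into `log_run` keeps the logging step separate from client setup, so it can be exercised with a stand-in object before any API key is configured.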
Running the Training as a Job
Now we will deploy the training code as a job.
A job runs the code once and then exits.
Running the training as a job lets us use our workspace's resources instead of our local environment.
This is useful when training needs more compute or memory than we have locally. The resources are released once the job completes, so we don't incur any cost after that.
deploy.py
This file deploys our training code as a job.
The accompanying recipe walks through deploy.py step by step.
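As a rough illustration, a deploy.py could look like the sketch below. It is not the guide's original recipe. The `servicefoundry` names used (`Job`, `Build`, `PythonBuild`, `Job.deploy`) and the `tfy-secret://` reference scheme reflect our understanding of the library and should be checked against the installed version. The job name and the `WORKSPACE_FQN` and `SECRET_FQN` environment variables are hypothetical, introduced here only so that a plain `python deploy.py` works without arguments.

```python
"""deploy.py (illustrative sketch): deploy train.py as a job."""
import os

JOB_NAME = "iris-train-job"  # hypothetical job name


def secret_env(secret_fqn: str) -> dict:
    """Expose the API key to the job as a secret reference.

    Assumption: secrets are referenced with the `tfy-secret://` scheme.
    """
    return {"MLF_API_KEY": f"tfy-secret://{secret_fqn}"}


def build_job(secret_fqn: str):
    """Describe the job: how to build the image and what to run."""
    # Imported here so servicefoundry is loaded only when deploying.
    from servicefoundry import Build, Job, PythonBuild

    return Job(
        name=JOB_NAME,
        image=Build(
            build_spec=PythonBuild(
                command="python train.py",
                requirements_path="requirements.txt",
            )
        ),
        env=secret_env(secret_fqn),
    )


def main():
    # Hypothetical variables holding the values noted in the prerequisites.
    workspace_fqn = os.environ.get("WORKSPACE_FQN")
    secret_fqn = os.environ.get("SECRET_FQN")
    if not (workspace_fqn and secret_fqn):
        print("Set WORKSPACE_FQN and SECRET_FQN before running deploy.py.")
        return
    build_job(secret_fqn).deploy(workspace_fqn=workspace_fqn)


if __name__ == "__main__":
    main()
```

Referencing the secret by its FQN keeps the API key itself out of the deployment code.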
Now run the following command in your terminal:
python deploy.py
This deploys your training code as a job to be executed in your workspace.
On successful deployment, the Job will be created and run immediately.
We can now visit the Applications page to check the build status and build logs, view the run history, and monitor the progress of runs.