Existing Code

A SageMaker deployment typically contains code in the form of the following file tree:

- `inference.py` - The inference handler that implements the SageMaker functions such as `model_fn`, `input_fn`, `predict_fn`, and `output_fn`
- `requirements.txt` - Any additional Python packages needed by the inference handler
- Model artifacts - Generated model files (e.g. `model.pth`). These may reside in your S3 buckets.
- SageMaker deployment code (e.g. `sagemaker_deploy.py`) - Code that calls SageMaker to deploy the model as an endpoint
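For reference, the inference handler typically looks like the sketch below. The four function names are the standard SageMaker handler contract; the JSON payload format and the TorchScript loading are illustrative assumptions, so adapt them to how your model was actually saved:

```python
import json
import os

import torch


def model_fn(model_dir):
    # Load the serialized model from the artifact directory.
    # Assumes the artifact was saved with torch.jit.save; if you saved a
    # state_dict instead, construct the model class and load_state_dict here.
    model = torch.jit.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, content_type):
    # Deserialize the request payload into a tensor
    if content_type == "application/json":
        return torch.tensor(json.loads(request_body)["inputs"])
    raise ValueError(f"Unsupported content type: {content_type}")


def predict_fn(data, model):
    # Run inference without tracking gradients
    with torch.no_grad():
        return model(data)


def output_fn(prediction, accept):
    # Serialize the prediction back to the client
    return json.dumps({"outputs": prediction.tolist()})
```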
Deploying the model on TrueFoundry

Broadly speaking, these are the things we shall do:

- Enclose the inference handler within a Docker container containing `torchserve` to support PyTorch-based models
- Upload the model artifact as a TrueFoundry Artifact to make it accessible from the running container
- Launch a TrueFoundry deployment utilizing the above two pieces
1. Upload the PyTorch model artifacts to the TrueFoundry Model Registry

Upload the model to the TrueFoundry Model Registry either via code (`upload_model.py`) or the UI.
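Below is a sketch of what `upload_model.py` could look like, using the TrueFoundry Python SDK. The `get_client()`/`log_model()` calls follow the SDK's model registry interface, but the ML repo, model name, and file path are placeholders, and the exact arguments may differ across SDK versions, so check the TrueFoundry docs:

```python
from truefoundry.ml import get_client

# Authenticates against the TrueFoundry control plane you are logged in to
client = get_client()

# log_model uploads the local artifact and registers a new model version;
# the ml_repo, name, and path below are placeholders to replace
model_version = client.log_model(
    ml_repo="my-ml-repo",              # placeholder ML repo name
    name="sagemaker-model",            # placeholder model name
    model_file_or_folder="model.pth",  # local path to the artifact (e.g. fetched from S3)
)

# The FQN uniquely identifies this model version; deploy.py will need it
print(model_version.fqn)
```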
2. Create a Python script, `main.py`, to launch the torchserve process at startup
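One possible sketch of `main.py` is below. It assumes the Model Version FQN is injected via a `MODEL_VERSION_FQN` environment variable (set in `deploy.py` later), downloads the artifact, and then hands off to the `sagemaker-pytorch-inference` toolkit, which starts torchserve and routes requests to the handler script named by `SAGEMAKER_PROGRAM`. The SDK download calls and the toolkit entrypoint name are assumptions that may vary by version:

```python
import os

from truefoundry.ml import get_client
# The toolkit wraps torchserve and dispatches requests to the
# model_fn/input_fn/predict_fn/output_fn handlers in the script
# named by the SAGEMAKER_PROGRAM environment variable
from sagemaker_pytorch_serving_container import serving

MODEL_DIR = "/opt/ml/model"


def download_model():
    # MODEL_VERSION_FQN is assumed to be set in the deployment spec
    client = get_client()
    model_version = client.get_model_version_by_fqn(os.environ["MODEL_VERSION_FQN"])
    model_version.download(path=MODEL_DIR)


if __name__ == "__main__":
    download_model()
    # Starts torchserve and keeps serving requests
    serving.main()
```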
3. Next, we'll write a `Dockerfile` that can create the TrueFoundry application
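A possible `Dockerfile` is sketched below. It installs TorchServe (which needs a Java runtime), the SageMaker PyTorch inference toolkit, and the TrueFoundry SDK; the base image, package versions, and port are assumptions to adapt to your model's needs:

```dockerfile
FROM python:3.10-slim

# TorchServe requires a Java runtime
RUN apt-get update && \
    apt-get install -y --no-install-recommends default-jre && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# torchserve plus the SageMaker toolkit that understands the handler
# functions, and the TrueFoundry SDK used by main.py to download the model
COPY requirements.txt .
RUN pip install --no-cache-dir \
    torch torchserve torch-model-archiver \
    sagemaker-pytorch-inference truefoundry \
    -r requirements.txt

COPY inference.py main.py ./

# The SageMaker serving stack listens on 8080 (/ping and /invocations)
EXPOSE 8080

CMD ["python", "main.py"]
```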
4. Now let's go ahead and write a `deploy.py` script that can be used with TrueFoundry to get a service deployed. Here you'll need to change the following:

- Service Name - Name for the service we'll deploy
- Entrypoint Script Name (value for `SAGEMAKER_PROGRAM`) - The code file name containing `model_fn`, `input_fn`, `predict_fn`, and `output_fn`
- Model Version FQN - The FQN obtained from `upload_model.py`
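Below is a sketch of `deploy.py` using TrueFoundry's Python deployment SDK. The service name, host, workspace FQN, and model version FQN are placeholders to replace with your own values, and the `truefoundry.deploy` import path may differ in older SDK versions:

```python
from truefoundry.deploy import Build, DockerFileBuild, Port, Service

service = Service(
    # Service Name: pick a name for the deployment
    name="sagemaker-svc",
    # Build the image from the Dockerfile written above
    image=Build(build_spec=DockerFileBuild()),
    ports=[
        # host is a placeholder; use a domain configured in your cluster
        Port(port=8080, host="sagemaker-svc.example.com"),
    ],
    env={
        # Entrypoint Script Name: the file containing model_fn, input_fn,
        # predict_fn, and output_fn
        "SAGEMAKER_PROGRAM": "inference.py",
        # Model Version FQN: the value printed by upload_model.py
        "MODEL_VERSION_FQN": "<your-model-version-fqn>",
    },
)

# Workspace FQN is a placeholder; copy yours from the TrueFoundry UI
service.deploy(workspace_fqn="<your-workspace-fqn>")
```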
5. Deploy using `truefoundry`
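With the pieces in place, deploying is a matter of running the script with the `truefoundry` package installed and authenticated. The login host below is a placeholder for your control plane URL, and the CLI invocation is a sketch that may differ by CLI version:

```bash
pip install truefoundry
tfy login --host https://<your-org>.truefoundry.cloud
python deploy.py
```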
6. Once the deployment has gone through, it can be tested using this script, `test_endpoint.py`.
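A sketch of `test_endpoint.py` is below. The endpoint URL and payload shape are placeholders; `/invocations` is the standard route exposed by the SageMaker serving stack, and the JSON body must match whatever your `input_fn` expects:

```python
import requests

# Placeholder: copy the actual endpoint URL from the TrueFoundry dashboard
ENDPOINT_URL = "https://sagemaker-svc.example.com/invocations"

# Payload shape depends on what your input_fn expects
payload = {"inputs": [[0.1, 0.2, 0.3]]}

response = requests.post(
    ENDPOINT_URL,
    json=payload,
    headers={"Content-Type": "application/json"},
)
response.raise_for_status()
print(response.json())
```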