Migrate Sagemaker Pytorch Endpoint
This guide describes how to deploy a Pytorch model currently served via a Sagemaker endpoint on TrueFoundry. To do this, we will adapt the existing Sagemaker inference script to the TrueFoundry platform.
Existing Code
A Sagemaker deployment typically contains code in the form of the following file tree -
- `inference.py` - This is the inference handler that implements the Sagemaker functions like `model_fn`, `input_fn`, `predict_fn`, `output_fn`, etc.
- `requirements.txt` - This contains any additional Python packages needed by the inference handler
Apart from these, there are also:
- Model artifacts - Generated model files (e.g. `model.pth`). These may reside in your S3 buckets.
- Sagemaker deployment code (e.g. `sagemaker_deploy.py`) - Code that calls Sagemaker to deploy the model as an endpoint
A sample is shown below:
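The inference handler below is an illustrative sketch rather than a drop-in file: it assumes a TorchScript model saved as `model.pth` and JSON payloads with an `inputs` field, so adapt the loading and (de)serialization logic to your own model.

```python
# inference.py - illustrative Sagemaker inference handler
# Assumptions: TorchScript model saved as model.pth, JSON payloads with an "inputs" field
import json
import os

import torch


def model_fn(model_dir):
    # Load the serialized model from the model directory
    model = torch.jit.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, request_content_type):
    # Deserialize the request payload into a tensor
    if request_content_type == "application/json":
        data = json.loads(request_body)
        return torch.tensor(data["inputs"], dtype=torch.float32)
    raise ValueError(f"Unsupported content type: {request_content_type}")


def predict_fn(input_data, model):
    # Run the forward pass without tracking gradients
    with torch.no_grad():
        return model(input_data)


def output_fn(prediction, response_content_type):
    # Serialize the prediction back to JSON
    return json.dumps({"outputs": prediction.tolist()})
```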
Deploying the model on Truefoundry
Broadly speaking, these are the steps we will follow -
- Package the inference handler in a Docker container that includes `torchserve` to serve Pytorch-based models
- Upload the model artifact as a TrueFoundry Artifact to make it accessible from the running container
- Launch a TrueFoundry deployment utilizing the above two pieces
Upload the Pytorch model artifacts to the TrueFoundry Model Registry
The existing model will look something like this: a `model.tar.gz` archive stored on S3 that contains the serialized weights (e.g. `model.pth`).
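If you need to pull the artifact down before uploading it to TrueFoundry, a minimal sketch using `boto3` is shown below (the bucket name and key are placeholders):

```python
# fetch_model.py - hypothetical helper; replace the bucket and key with your own
import tarfile

import boto3

s3 = boto3.client("s3")
s3.download_file("my-sagemaker-bucket", "models/model.tar.gz", "model.tar.gz")

# Extract model.pth (and any bundled code/) into a local "model/" folder
with tarfile.open("model.tar.gz") as tar:
    tar.extractall(path="model")
```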
Upload the model to the Truefoundry Model registry either via code or UI.
To upload the model via the UI, navigate to the model registry and click on the Upload Model button. You can read the guide to learn more.
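To upload via code, a minimal `upload_model.py` sketch is shown below. It assumes the TrueFoundry Python SDK's `log_model` API and an existing ML repo named `my-ml-repo`; check the SDK documentation for the exact parameters supported by your version.

```python
# upload_model.py - a hypothetical sketch; the ML repo name, model name and
# local folder are placeholders
from truefoundry.ml import get_client

client = get_client()
model_version = client.log_model(
    ml_repo="my-ml-repo",             # an existing ML repo in TrueFoundry
    name="sagemaker-pytorch-model",   # name to register the model under
    model_file_or_folder="model/",    # local folder containing model.pth etc.
)
# This FQN is what deploy.py later refers to as the Model Version FQN
print(model_version.fqn)
```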
Create a Python script to launch the torchserve process at startup
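A minimal sketch of such a startup script is shown below. It assumes the image installs the `sagemaker-pytorch-inference` toolkit (which wraps `torchserve` and dispatches requests to the handler named by `SAGEMAKER_PROGRAM`), and that the Model Version FQN is passed in as an environment variable; the env var names, paths and the toolkit entrypoint should be verified against the versions you install.

```python
# serve.py - hypothetical startup script; env var names and paths are assumptions
import os

from truefoundry.ml import get_client

# Assumption: provided by the sagemaker-pytorch-inference toolkit; it starts
# torchserve with the SageMaker handler service so that model_fn / input_fn /
# predict_fn / output_fn from the script named by SAGEMAKER_PROGRAM are used
from sagemaker_pytorch_serving_container import serving

MODEL_DIR = "/opt/ml/model"  # standard Sagemaker model directory

if __name__ == "__main__":
    # Download the model files uploaded earlier using the Model Version FQN
    client = get_client()
    model_version = client.get_model_version_by_fqn(os.environ["MODEL_VERSION_FQN"])
    model_version.download(path=MODEL_DIR)

    # Note: inference.py must be available where the toolkit expects it
    # (commonly copied into the image or placed under /opt/ml/model/code)

    # Hand over to torchserve (blocks until the serving process exits)
    serving.main()
```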
- Next, we’ll write a `Dockerfile` that can build the image for the TrueFoundry application
- Now let’s go ahead and write a `deploy.py` script that can be used with TrueFoundry to get a service deployed. Here you’ll need to change the following (a sketch of the script is shown after this list):
  - Service Name - Name for the service we’ll deploy
  - Entrypoint Script Name (value for `SAGEMAKER_PROGRAM`) - The code file name containing `model_fn`, `input_fn`, `predict_fn` and `output_fn`
  - Model Version FQN - The FQN obtained from `upload_model.py`
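Below is a minimal sketch of such a `deploy.py`, assuming the `truefoundry` Python SDK's deploy module; the service name, host, port, resources and FQNs are placeholders to replace with your own values.

```python
# deploy.py - hypothetical sketch; all names, the host and the FQNs are placeholders
import argparse

from truefoundry.deploy import Build, DockerFileBuild, Port, Resources, Service

parser = argparse.ArgumentParser()
parser.add_argument("--workspace_fqn", required=True, help="Workspace to deploy to")
args = parser.parse_args()

service = Service(
    # Service Name - name for the service we'll deploy
    name="sagemaker-pytorch-service",
    # Build the Dockerfile in the current directory
    image=Build(build_spec=DockerFileBuild()),
    # torchserve's inference API is assumed to listen on port 8080
    ports=[Port(port=8080, host="sagemaker-pytorch.your-domain.com")],
    env={
        # Entrypoint Script Name - file containing model_fn, input_fn, predict_fn, output_fn
        "SAGEMAKER_PROGRAM": "inference.py",
        # Model Version FQN - obtained from upload_model.py
        "MODEL_VERSION_FQN": "<model-version-fqn-from-upload_model.py>",
    },
    resources=Resources(cpu_request=1, cpu_limit=2, memory_request=2000, memory_limit=4000),
)
service.deploy(workspace_fqn=args.workspace_fqn)
```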
Deploy using truefoundry
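With the Dockerfile, startup script and `deploy.py` in place, the deployment is typically triggered by running the script against your workspace, for example `python deploy.py --workspace_fqn <your-workspace-fqn>` (the flag name here simply matches the argparse argument used in the sketch above).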
Once the deployment has gone through, it can be tested using this script -
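For example, a small test script using `requests` is sketched below; the endpoint URL and payload are placeholders, and the `/invocations` path assumes the SageMaker-style serving stack is handling requests.

```python
# test_endpoint.py - hypothetical sketch; replace the endpoint URL and payload
import requests

# /invocations is the inference path exposed by the SageMaker-style serving stack
ENDPOINT = "https://<your-service-endpoint>/invocations"

payload = {"inputs": [[0.1, 0.2, 0.3, 0.4]]}  # shape/format depends on your model

response = requests.post(ENDPOINT, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```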