Scikit Learn / XGBoost
Deploying Scikit Learn and XGBoost models with FastAPI or PyTriton
TrueFoundry can autogenerate the inference code for SkLearn and XGBoost models. If you have already written the inference code for these models, you can deploy the FastAPI/Flask code as-is to TrueFoundry. This guide covers how to log the models, generate the inference code, and deploy that code to get a model endpoint.
TrueFoundry can generate inference code in two frameworks:
- FastAPI: Simple to understand and use. It works well when your traffic is not very high (less than 20 requests/second).
- Triton: A more performant model server, suitable for high-traffic use cases. It comes with batching support, which helps provide higher throughput.
It also generates a requirements.txt, a Dockerfile, and a README file that will help you get started with the deployment.
This approach gives you the flexibility to change the inference code to add custom business logic and makes it easier to test the code locally. You can also push the code to your git repository.
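For illustration, the FastAPI variant of the generated code is roughly of this shape. This is a minimal sketch, not the exact code TrueFoundry generates: the model path (`model.joblib`), the request schema, and the route are assumptions.

```python
# Minimal FastAPI inference service for a scikit-learn/XGBoost model.
# The model path, request schema, and route are illustrative — the
# autogenerated code may differ.
from typing import List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # deserialized once at startup


class PredictRequest(BaseModel):
    instances: List[List[float]]  # one feature vector per instance


@app.post("/predict")
def predict(request: PredictRequest):
    predictions = model.predict(request.instances)
    return {"predictions": predictions.tolist()}
```

You can run a service like this locally with `uvicorn main:app` to verify the endpoint before deploying.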
Live Demo
You can view an XGBoost example deployed with PyTriton here.
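Under the hood, a PyTriton deployment binds an inference callable to the Triton server. The sketch below shows the general shape using the nvidia-pytriton `bind`/`serve` API; the model name, tensor names, dtypes, and shapes are assumptions for illustration.

```python
# Minimal nvidia-pytriton serving loop for a scikit-learn/XGBoost model.
# Model name, tensor names, dtypes, and shapes are illustrative.
import joblib
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

model = joblib.load("model.joblib")


@batch  # pytriton groups concurrent requests into a single batched call
def infer_fn(features):
    predictions = model.predict(features)
    return {"predictions": predictions.reshape(-1, 1).astype(np.float64)}


with Triton() as triton:
    triton.bind(
        model_name="xgboost-model",
        infer_func=infer_fn,
        inputs=[Tensor(name="features", dtype=np.float64, shape=(-1,))],
        outputs=[Tensor(name="predictions", dtype=np.float64, shape=(1,))],
        config=ModelConfig(max_batch_size=64),
    )
    triton.serve()
```

The `@batch` decorator is where the throughput gain comes from: Triton collects concurrent requests into a single `model.predict` call instead of running them one by one.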
Log the model in the model registry
You will need to set up the TrueFoundry CLI before executing the following steps.
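For example, training a model and logging it to the registry could look like this. The scikit-learn and joblib calls are standard; the `log_model` call is a sketch based on the TrueFoundry Python SDK, and the repo/model names are placeholders — check your SDK version's documentation for the exact signature.

```python
# Train a model, serialize it, and log it to the TrueFoundry model
# registry. The log_model arguments are a sketch — consult the
# TrueFoundry docs for the exact signature in your SDK version.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

from truefoundry.ml import get_client

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100).fit(X, y)
joblib.dump(model, "model.joblib")

client = get_client()
client.log_model(
    ml_repo="my-ml-repo",             # placeholder ML repo name
    name="iris-random-forest",        # placeholder model name
    model_file_or_folder="model.joblib",
    framework="sklearn",              # "xgboost" for XGBoost models
)
```

Once logged, the model appears in the model registry with a Deploy button next to it.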
Generate the inference code
- Locate the model you want to deploy in the model registry and click the Deploy button. Select a workspace for deployment, and copy the command.
- Execute the command in your terminal to generate the model deployment package.
- Follow the instructions in the generated README.md to deploy the code and get an endpoint for the model.
Common Issues and FAQ
Deploy Button is not showing up next to a SkLearn/XGBoost model in the model registry
Python versions < 3.8 or > 3.12 are not supported for Triton deployment
The Triton deployment depends on the nvidia-pytriton library (https://pypi.org/project/nvidia-pytriton/), which supports Python versions >=3.8 and <=3.12. If you need to use a Python version outside this range, consider using FastAPI as an alternative framework for serving the model.
Numpy version must be less than 2.0.0 for Triton deployment
The nvidia-pytriton library specifies in its pyproject.toml file that it does not support numpy versions >= 2.0.0; this limitation has also been confirmed in practice. If you cannot use a numpy version below 2.0.0, consider using FastAPI as an alternative framework for serving the model.
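If you stay on Triton, pin numpy below 2.0.0 in the generated requirements.txt, for example:

```
nvidia-pytriton
numpy<2.0.0
```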