Model Deployment
For some model frameworks, Truefoundry can generate a model deployment package containing the following:
- Inference code wrapped in a general-purpose web framework like FastAPI, or a more specialized model server like Triton (see the sketch after this list).
- A requirements.txt file. For model servers like Triton, which are difficult to set up natively across operating systems, we also generate a Dockerfile.
- A README file with instructions for testing locally and deploying on Truefoundry.
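To give a sense of what the generated inference code can look like, here is a minimal sketch of a FastAPI service for a scikit-learn model. The file name, model path, request schema, and route are assumptions for illustration only; the actual generated code depends on the framework of the model you deploy.

```python
# Hypothetical sketch of generated inference code (FastAPI + scikit-learn).
# Assumes the model artifact is bundled in the package as model.joblib.
from typing import List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the serialized model once at startup.
model = joblib.load("model.joblib")


class PredictionRequest(BaseModel):
    # A batch of feature vectors, e.g. [[5.1, 3.5, 1.4, 0.2]]
    instances: List[List[float]]


@app.post("/predict")
def predict(request: PredictionRequest):
    # Run inference and convert the result to plain Python types
    # so the response is JSON-serializable.
    predictions = model.predict(request.instances)
    return {"predictions": predictions.tolist()}
```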
This approach gives you the flexibility to:
- Deploy the package as-is to Truefoundry or anywhere else.
- Change the inference code to add custom business logic and dependencies.
- Test the code locally (a smoke-test sketch follows this list).
- Maintain the generated code in your version control system (GitHub, GitLab, etc.).
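For example, a local smoke test can be as simple as posting a sample payload to the running service. This sketch assumes the service from the example above is running locally on port 8000 (for instance via uvicorn) and exposes a /predict route; the endpoint path and payload shape are assumptions, not part of every generated package.

```python
# Hypothetical local smoke test against the generated inference service.
import requests

response = requests.post(
    "http://localhost:8000/predict",
    json={"instances": [[5.1, 3.5, 1.4, 0.2]]},
    timeout=10,
)
response.raise_for_status()
print(response.json())  # e.g. {"predictions": [0]}
```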
To create the deployment package:
- Locate the model you want to deploy in the model registry and click the Deploy button.
- Select a workspace for deployment, and copy the command.
- Execute the command in your terminal to generate the model deployment package.
- Follow the instructions in the generated README.md.