Overview
There are several ways to deploy a model as an API, depending on the framework and the type of model.
Model Deployment Options
TrueFoundry doesn’t provide any client-side framework for model deployment. We believe there are already many great open-source frameworks for building inference services, and we don’t want to build yet another one. This also avoids vendor lock-in with TrueFoundry: you don’t need to change your code to deploy models on TrueFoundry or to migrate to another platform.
TrueFoundry supports deploying models from different frameworks. You can deploy models from HuggingFace, models that you have logged in the TrueFoundry model registry, or your own custom inference code written in any framework. All model API deployments in TrueFoundry are abstractions on top of the Service Deployment feature, so it’s highly recommended to get familiar with service deployment first.
Bring your own inference code and model in any framework
TrueFoundry can deploy model inference code in any framework that you are using. Here are a few examples of deploying models with the most commonly used frameworks and model servers:
HuggingFace
Deploy Transformers / Diffusers models with vLLM, SGLang, Nvidia Triton, etc.
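As a rough illustration, the sketch below assumes a HuggingFace model is already being served through vLLM’s OpenAI-compatible server (for example, started with `vllm serve <model-id> --port 8000`); the model ID, port, and base URL are placeholders, not part of any TrueFoundry-specific API.

```python
# Minimal sketch: query a HuggingFace model served by vLLM's OpenAI-compatible API.
# Assumes a server is already running, e.g. `vllm serve <model-id> --port 8000`.
# The model ID and base URL below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="<model-id>",  # must match the model the server was started with
    messages=[{"role": "user", "content": "What does an inference server do?"}],
)
print(response.choices[0].message.content)
```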
Scikit Learn & XGBoost
Deploy Scikit Learn and XGBoost models with FastAPI or Nvidia PyTriton.
FastAPI
Most flexible option that can wrap any inference code.
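For instance, a minimal FastAPI wrapper around a scikit-learn model might look like the sketch below; the model file name and request schema are illustrative and would be adapted to your own model.

```python
# Minimal sketch: wrap a scikit-learn model with FastAPI.
# "model.joblib" and the request schema are illustrative placeholders.
from typing import List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # load the serialized model once at startup


class PredictRequest(BaseModel):
    features: List[List[float]]  # one row of features per prediction


@app.post("/predict")
def predict(request: PredictRequest):
    predictions = model.predict(request.features)
    return {"predictions": predictions.tolist()}
```

Such a service can be started locally with `uvicorn app:app --host 0.0.0.0 --port 8000` and then deployed like any other TrueFoundry Service.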
LitServe
Wrap any model with LitServe and optionally enable features such as dynamic batching.
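As a rough sketch of what a LitServe wrapper looks like (the stand-in model and port are placeholders; see the LitServe docs for batching and other options):

```python
# Minimal sketch of a LitServe API; the "model" here is a stand-in for real inference code.
import litserve as ls


class InferenceAPI(ls.LitAPI):
    def setup(self, device):
        # Load the real model here; a trivial function stands in for it.
        self.model = lambda x: x * 2

    def decode_request(self, request):
        return request["input"]

    def predict(self, x):
        return self.model(x)

    def encode_response(self, output):
        return {"output": output}


if __name__ == "__main__":
    # Dynamic batching can be enabled by passing max_batch_size to LitServer.
    server = ls.LitServer(InferenceAPI(), accelerator="auto")
    server.run(port=8000)
```

A request like `POST /predict` with body `{"input": 4}` would then return `{"output": 8}`.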
AWS Multi Model Server
Deploy models with AWS Multi Model Server.
TorchServe
Deploy PyTorch models with TorchServe.
TensorFlow Serve
Deploy TensorFlow models with TensorFlow Serve.
MLflow Serve
Deploy MLflow models with MLflow Serve.