This feature is in the Beta stage - the input parameters and Python APIs can change.
We also plan to add support for more model formats and frameworks, more model stores, and easier inference and testing tools.

We would love to hear from you if you have any feedback or run into any issues.

In this example, we will train a model on Iris flower classification dataset and then deploy it as a Service that can be used for inference.



We have added sample training code for completeness' sake. If you already have a trained model, skip to the deployment part.

Training a Model

We can also use Truefoundry Jobs to train this model in the cloud :wink:

Our initial code structure looks as follows:

├── requirements.txt

`requirements.txt` contains the list of dependencies we would need:

```text
# For training the model
scikit-learn
# For logging the trained model
mlfoundry
# For deploying it as a Service
servicefoundry>=0.6.0,<0.7.0
```
Install these in a Python environment using `pip install -r requirements.txt`.

Next, let's look at the training code. We are using Truefoundry's Model Registry (via `mlfoundry`) to log the model:

```python
import mlfoundry
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(as_frame=True, return_X_y=True)
X = X.rename(columns={
    "sepal length (cm)": "sepal_length",
    "sepal width (cm)": "sepal_width",
    "petal length (cm)": "petal_length",
    "petal width (cm)": "petal_width",
})

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC(probability=True))])
pipe.fit(X_train, y_train)

client = mlfoundry.get_client()
run = client.create_run(project_name="iris-clf", run_name="iris-svc")

# Optionally, we can log hyperparameters using run.log_params(...)
# Optionally, we can log metrics using run.log_metrics(...)

model_version = run.log_model(name="iris-svc", model=pipe, framework="sklearn", step=0)
print("Model Logged as:", model_version.fqn)
```

We can now run this script with `python`. This will train an SVC model, log it to the Model Registry, and print an FQN (Fully Qualified Name) that we will use later.
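Before logging, it can be worth a quick sanity check of the model on the held-out split. Here is a minimal sketch that mirrors the pipeline from the training script above:

```python
# Optional sanity check before logging: train the same pipeline and
# score it on the held-out 20% split from the training script.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(as_frame=True, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC(probability=True))])
pipe.fit(X_train, y_train)

# Mean accuracy on the test split
accuracy = pipe.score(X_test, y_test)
print("Test accuracy:", round(accuracy, 3))
```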

A Model Version FQN looks like `model:{org_name}/{username}/{project_name}/{model_name}:{version}`
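Since the FQN is just a structured string, its components can be pulled apart with plain string parsing. A small sketch, using the example FQN from this guide:

```python
# A Model Version FQN has the form
# model:{org_name}/{username}/{project_name}/{model_name}:{version}
fqn = "model:truefoundry/user/iris-clf/iris-svc:1"

prefix, rest = fqn.split(":", 1)        # "model" and the path + version
path, version = rest.rsplit(":", 1)     # split off the trailing version
org_name, username, project_name, model_name = path.split("/")

print(org_name, username, project_name, model_name, version)
```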

Deploying the model

Before we proceed, make sure you have completed the setup steps.

In short, you should have
  1. Signed up on Truefoundry Platform
  2. Installed the servicefoundry SDK: pip install -U "servicefoundry>=0.6.0,<0.7.0"
  3. Logged in: sfy login
  4. Have a Workspace FQN you can deploy the model to. You can find all your workspaces on the platform's Workspaces page

We will now deploy the model we logged to Truefoundry's Model Registry, using its Model Version FQN.

A Model Version FQN looks like `model:{org_name}/{username}/{project_name}/{model_name}:{version}`, e.g. `model:truefoundry/user/iris-clf/iris-svc:1`

We can deploy our model using either Python code or a YAML file with the servicefoundry CLI.

Deploy using Python code

Create a file with the following code and replace `<YOUR_MODEL_VERSION_FQN>` and `<YOUR_WORKSPACE_FQN>` with your values.

```python
import logging

from servicefoundry import ModelDeployment, Resources, TruefoundryModelRegistry

logging.basicConfig(level=logging.INFO, format=logging.BASIC_FORMAT)

# Replace these with your values
MODEL_VERSION_FQN = "<YOUR_MODEL_VERSION_FQN>"  # E.g. model:truefoundry/user/iris-clf/iris-svc:1
WORKSPACE = "<YOUR_WORKSPACE_FQN>"  # E.g. tfy-ctl-euwe1:model-deployment-test

model_deployment = ModelDeployment(
    name="iris-svc",
    model_source=TruefoundryModelRegistry(model_version_fqn=MODEL_VERSION_FQN),
    resources=Resources(cpu_request=0.2, cpu_limit=0.5, memory_request=500, memory_limit=1000),
)
deployment = model_deployment.deploy(workspace_fqn=WORKSPACE)
```

Run this with `python` from a shell.

Deploy using CLI

Create a `servicefoundry.yaml` file with the following spec and replace `<YOUR_MODEL_VERSION_FQN>`:

```yaml
# Replace <YOUR_MODEL_VERSION_FQN> with your model version FQN - E.g. model:truefoundry/user/iris-clf/iris-svc:1
name: iris-svc
type: model-deployment
model_source:
  type: tfy-model-registry
  model_version_fqn: <YOUR_MODEL_VERSION_FQN>
resources:
  cpu_request: 0.2
  cpu_limit: 0.5
  memory_request: 500
  memory_limit: 1000
```

Run `sfy deploy --workspace-fqn <YOUR_WORKSPACE_FQN>` from a shell.

Testing the model

On running the above script, your model will be deployed and an Endpoint should be available on the UI Dashboard.


Endpoint URL is available on the Model Deployment Details Page



Behind the scenes, our model is being deployed using MLServer and KServe's V2 Dataplane protocol.

The general format of the Inference URL thus looks like:

```
{ENDPOINT_URL}/v2/models/{MODEL_NAME}/infer
```
We will now send the following examples for prediction:

| sepal_length | sepal_width | petal_length | petal_width |
|---|---|---|---|
| 7.0 | 3.2 | 4.7 | 1.4 |
| 5.0 | 3.6 | 1.4 | 0.2 |
Here is an example Python code snippet to send a request with the above data.

```python
import json
from urllib.parse import urljoin

import requests

# Replace these with the values of your endpoint and model name
ENDPOINT_URL = "<YOUR_ENDPOINT_URL>"
MODEL_NAME = "iris-svc"

response = requests.post(
    urljoin(ENDPOINT_URL, f"v2/models/{MODEL_NAME}/infer"),
    json={
        "inputs": [
            {"data": [7.0, 5.0], "datatype": "FP32", "name": "sepal_length", "shape": [2]},
            {"data": [3.2, 3.6], "datatype": "FP32", "name": "sepal_width", "shape": [2]},
            {"data": [4.7, 1.4], "datatype": "FP32", "name": "petal_length", "shape": [2]},
            {"data": [1.4, 0.2], "datatype": "FP32", "name": "petal_width", "shape": [2]}
        ],
        "parameters": {
            "content_type": "pd"
        }
    }
)
result = response.json()
print(json.dumps(result, indent=4))
print("Predicted Classes:", result["outputs"][0]["data"])
```
Alternatively, using `curl` (replace `<YOUR_ENDPOINT_URL>` with your endpoint URL):

```shell
curl -X POST \
     -H 'Content-Type: application/json' \
     -d '{"inputs":[{"data":[7,5],"datatype":"FP32","name":"sepal_length","shape":[2]},{"data":[3.2,3.6],"datatype":"FP32","name":"sepal_width","shape":[2]},{"data":[4.7,1.4],"datatype":"FP32","name":"petal_length","shape":[2]},{"data":[1.4,0.2],"datatype":"FP32","name":"petal_width","shape":[2]}],"parameters":{"content_type":"pd"}}' \
     '<YOUR_ENDPOINT_URL>/v2/models/iris-svc/infer'
```

This should return something like:

```json
{
    "model_name": "iris-svc",
    "model_version": null,
    "id": "4fc82cea-4fc2-43f6-97e5-5f67a49ad37f",
    "parameters": {
        "content_type": null,
        "headers": null
    },
    "outputs": [
        {
            "name": "output-1",
            "shape": [2],
            "datatype": "INT64",
            "parameters": null,
            "data": [1, 0]
        }
    ]
}
```
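The `INT64` output values are class indices. If you want human-readable labels, they can be mapped back through the iris dataset's `target_names`. A sketch, assuming the response's `data` field was `[1, 0]`:

```python
from sklearn.datasets import load_iris

# Class indices returned by the model map to iris species names
target_names = list(load_iris().target_names)
predicted = [1, 0]  # e.g. the "data" field of the response's first output
labels = [target_names[i] for i in predicted]
print(labels)
```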

Note on content_type

Notice in the request we have:

```json
{
    "inputs": ...,
    "parameters": {
        "content_type": "pd"
    }
}
```
We added this because we trained the model on a pandas DataFrame. As such, our model expects a DataFrame with the correct feature names when `predict` is called.
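With `"content_type": "pd"`, MLServer decodes the named inputs into a DataFrame before calling `predict`: one column per input name, one row per element. Roughly, the request above corresponds to a frame like this (a sketch of the decoded payload, not MLServer's actual decoding code):

```python
import pandas as pd

# One column per named V2 input, one row per element of its "data"
df = pd.DataFrame({
    "sepal_length": [7.0, 5.0],
    "sepal_width": [3.2, 3.6],
    "petal_length": [4.7, 1.4],
    "petal_width": [1.4, 0.2],
})
print(df)
```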

Alternatively, if we had trained on a plain numpy array or list instead of a DataFrame, our request body would look something like:

```json
{
  "inputs": [
    {
      "name": "input-0",
      "shape": [2, 4],
      "datatype": "FP32",
      "data": [
          [7.0, 3.2, 4.7, 1.4],
          [5.0, 3.6, 1.4, 0.2]
      ]
    }
  ]
}
```
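Here the single input is a 2x4 tensor: 2 rows (samples) by 4 columns (features). As a sketch, this is the numpy array the model would receive in that case:

```python
import numpy as np

# shape [2, 4] = 2 samples x 4 features
data = np.array(
    [[7.0, 3.2, 4.7, 1.4],
     [5.0, 3.6, 1.4, 0.2]],
    dtype=np.float32,
)
print(data.shape)
```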