In this guide, we’ll deploy a FastAPI service for solving the Iris classification problem. The problem involves predicting the species of an iris flower based on its sepal length, sepal width, petal length, and petal width. There are three species: Iris setosa, Iris versicolor, and Iris virginica.

We’ve already created a FastAPI service for the Iris classification problem; you can find the code in our GitHub repository.

Please visit the repository to familiarize yourself with the code you’ll be deploying. The project files are organized as follows:

Directory Structure
.
├── app.py - Contains FastAPI code for inference.
├── iris_classifier.joblib - The model file.
└── requirements.txt - Lists dependencies.
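
For a quick sense of what you’ll be deploying, here is a minimal sketch of what such an inference service can look like. This is an illustration, not the exact code in the repository; the endpoint and parameter names match the request example later in this guide.

# app.py - a minimal sketch of a FastAPI inference service for the Iris model.
# Illustrative only; the actual code in the repository may differ.
import joblib
from fastapi import FastAPI

app = FastAPI()
model = joblib.load("iris_classifier.joblib")

@app.post("/predict")
def predict(
    sepal_length: float,
    sepal_width: float,
    petal_length: float,
    petal_width: float,
):
    # FastAPI reads scalar parameters like these from the query string.
    features = [[sepal_length, sepal_width, petal_length, petal_width]]
    prediction = model.predict(features)[0]
    return {"prediction": str(prediction)}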

Getting Started With Deployment

To deploy a service, you’ll need a workspace. If you don’t have one, you can create it by following this guide: Creating a Workspace, or ask your cluster administrator for help if you don’t have permission to create one.

In TrueFoundry, you can deploy code either from your GitHub repository or from your local machine if the code isn’t pushed to a GitHub repository.

Use the following configuration values in the deployment form:

What we did above:

In the example above, we only had Python code and a requirements.txt, with no pre-written Dockerfile. We therefore chose the Python Code option, letting TrueFoundry templatize a Dockerfile from the details we provide about the application and build the Docker image for us.

We give these details in the Build context field, where we specify the directory in the GitHub repository where our service code resides (./deploy-model-with-fastapi/). We also specify the command used to start the service (uvicorn app:app --host 0.0.0.0 --port 8000).

Finally, we specify the port that we want our service to listen on (8000).
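
Everything configured in the form can also be expressed in code. Below is a rough sketch assuming the servicefoundry Python SDK; class names and fields may vary across SDK versions, and the host and workspace FQN are placeholders for your own values.

# deploy.py - a sketch of the same deployment via TrueFoundry's Python SDK.
# Assumes the servicefoundry package; names and fields may differ by version.
import logging

from servicefoundry import Build, Port, PythonBuild, Service

logging.basicConfig(level=logging.INFO)

service = Service(
    name="iris-classifier",
    image=Build(
        build_spec=PythonBuild(
            # Same command and requirements file as in the deployment form above.
            command="uvicorn app:app --host 0.0.0.0 --port 8000",
            requirements_path="requirements.txt",
        )
    ),
    ports=[Port(port=8000, host="<YOUR_ENDPOINT_HOST>")],  # placeholder host
)
service.deploy(workspace_fqn="<YOUR_WORKSPACE_FQN>")  # placeholder workspace FQN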

View your deployed service

Congratulations! You’ve successfully deployed your FastAPI service.

Once you click Submit, the deployment will complete within a few seconds, and your service will be displayed as active (green), indicating that it’s up and running.

You can view all the information about your service by following the steps below:

Copy Endpoint URL

To make a request to the Service, you will need the Endpoint URL.

The endpoint URL is the same one you provided in the Ports section while deploying the service. You can also copy it from the Service UI.

The endpoint will be an internal cluster URL or an external URL, depending on whether you chose Expose in the Ports configuration. The two cases are as follows:

The service port is exposed and mapped to a domain

In this case, the endpoint is public and you can call the service from anywhere: your own laptop or code running on any machine.

If you don’t want everyone to be able to call this API, you can add username- and password-based authentication to it. For this, refer to the section Add Authentication to Endpoint.
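
For example, assuming the endpoint is protected with HTTP basic authentication (the URL and credentials below are placeholders), a request would carry the username and password like this:

# Calling an endpoint protected with HTTP basic auth - a sketch; the
# endpoint URL, username, and password are placeholders for your own values.
import requests

response = requests.post(
    "<YOUR_ENDPOINT_URL>/predict",
    params={
        "sepal_length": 7.0,
        "sepal_width": 3.2,
        "petal_length": 4.7,
        "petal_width": 1.4,
    },
    auth=("<USERNAME>", "<PASSWORD>"),
)
print(response.json())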

The service port is not exposed

If you have not exposed the port, the endpoint is internal to the cluster and can only be called by other services running in the same cluster (including Jupyter Notebooks running in the same cluster). No external client can access the service. This is the recommended mode for APIs that don’t need to be exposed for external usage. These endpoints follow the format servicename-workspacename.svc.cluster.local:port, as in the sketch below.
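
For instance, from a notebook or another service inside the cluster, an internal endpoint can be called directly. The service and workspace names below are hypothetical examples of the format above.

# Calling an internal endpoint from inside the cluster; the service name
# (iris-classifier) and workspace name (my-workspace) are hypothetical.
import requests

INTERNAL_URL = "http://iris-classifier-my-workspace.svc.cluster.local:8000"

response = requests.post(
    f"{INTERNAL_URL}/predict",
    params={
        "sepal_length": 7.0,
        "sepal_width": 3.2,
        "petal_length": 4.7,
        "petal_width": 1.4,
    },
)
print(response.json())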

Sending Requests to your Service

Once you deploy the service, you will want to interact with the API, either from your code or manually using curl, Postman, or Python, as shown below:

from urllib.parse import urljoin

import requests

# Replace this with the value of your endpoint URL
ENDPOINT_URL = "<YOUR_ENDPOINT_URL>"  # e.g., https://your-service-endpoint.com

response = requests.post(
    urljoin(ENDPOINT_URL, "predict"),
    params={
        "sepal_length": 7.0,
        "sepal_width": 3.2,
        "petal_length": 4.7,
        "petal_width": 1.4,
    },
)
response.raise_for_status()  # fail loudly on HTTP errors

result = response.json()
print("Predicted class:", result["prediction"])