Deploy a Gradio Service

👍

What you'll learn

  • Creating a Gradio application to serve your model
  • Deploying your service via servicefoundry

This guide shows you how to deploy a scikit-learn model via Gradio and servicefoundry.

After you complete the guide, you will have a successfully deployed Gradio Service.

Project structure

To complete this guide, you are going to create the following files:

  • app.py: contains our inference and Gradio code
  • iris_classifier.joblib: the model file
  • deploy.py / deploy.yaml: contains our deployment code (if you use our Python SDK) or deployment configuration (if you use a YAML file)
  • requirements.txt: contains our dependencies.

Your final file structure is going to look like this:

.
├── app.py
├── iris_classifier.joblib
├── deploy.py / deploy.yaml
└── requirements.txt

As you can see, all of these files are created in the same directory.

Model details

For this guide, we have already trained a model.
The model has been trained on the Iris dataset and stored as a joblib file on Google Drive.

Attributes:
sepal length in cm, sepal width in cm, petal length in cm, petal width in cm

Predicted attribute:
class of iris plant (one of the following: Iris Setosa, Iris Versicolour, Iris Virginica)
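
For reference, a model like this could have been trained and saved with a few lines of scikit-learn. The sketch below is only illustrative, not the exact script used to produce the provided file; the column names are chosen to match what the inference code in app.py expects:

import joblib
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Load the Iris dataset and name the columns to match the inference code
iris = load_iris()
X = pd.DataFrame(
    iris.data,
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
)
y = iris.target  # 0: Setosa, 1: Versicolour, 2: Virginica

# Train a simple classifier and serialize it with joblib
model = LogisticRegression(max_iter=200)
model.fit(X, y)
joblib.dump(model, "iris_classifier.joblib")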

Step 1: Fetching the model

Download the model from the following link, then move it into your development directory.

Afterwards, your directory should look like this:

.
└── iris_classifier.joblib
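
You can quickly verify that the downloaded model loads correctly, for example from a Python shell in the same directory (the exact prediction shown is only an example):

import joblib
import pandas as pd

# Load the serialized model and run a single test prediction
model = joblib.load("iris_classifier.joblib")
sample = pd.DataFrame([{
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}])
print(model.predict(sample))  # prints the predicted class index, e.g. [0]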

Step 2: Implementing the inference service code

The first step is to create a web interface to serve the model.
For this, we are going to use Gradio. Gradio is a Python library that lets us quickly create a web interface on top of our model inference functions.

Create the app.py and requirements.txt files in the same directory where the model is stored.

.
├── iris_classifier.joblib
├── app.py
└── requirements.txt

app.py

import joblib
import pandas as pd
import gradio as gr

# Load the serialized scikit-learn model once at startup
model = joblib.load("iris_classifier.joblib")

def model_inference(sepal_length: float, sepal_width: float, petal_length: float, petal_width: float) -> int:
    # Assemble the inputs into a single-row DataFrame and return the predicted class index
    data = dict(
        sepal_length=sepal_length,
        sepal_width=sepal_width,
        petal_length=petal_length,
        petal_width=petal_width,
    )
    prediction = int(model.predict(pd.DataFrame([data]))[0])
    return prediction

# One numeric input component per feature
sepal_length_input = gr.Number(label="Enter the sepal length in cm")
sepal_width_input = gr.Number(label="Enter the sepal width in cm")
petal_length_input = gr.Number(label="Enter the petal length in cm")
petal_width_input = gr.Number(label="Enter the petal width in cm")

inputs = [sepal_length_input, sepal_width_input, petal_length_input, petal_width_input]

# The output is the predicted class index
output = gr.Number()

# Build the interface and serve it on port 8080 so the deployed container can expose it
gr.Interface(
    fn=model_inference,
    inputs=inputs,
    outputs=output,
).launch(server_name="0.0.0.0", server_port=8080)
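
Before deploying, you can optionally test the app locally by running python app.py from this directory and opening http://localhost:8080 in your browser. The app listens on 0.0.0.0:8080, which matches the port exposed in the deployment configuration below.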


requirements.txt

pandas
gradio
scikit-learn
joblib
altair

Step 3: Deploying the inference API

You can deploy services on TrueFoundry programmatically using our Python SDK, or declaratively via a YAML file.

You can therefore choose between creating a deploy.py file, which uses our Python SDK, and creating a deploy.yaml configuration file, which you deploy with the servicefoundry deploy command.

Via python SDK

File Structure

.
├── iris_classifier.joblib
├── app.py
├── deploy.py
└── requirements.txt

deploy.py

import argparse
import logging
from servicefoundry import Build, PythonBuild, Service, Resources, Port

logging.basicConfig(level=logging.INFO)

# The workspace FQN identifies the workspace to deploy the service in
parser = argparse.ArgumentParser()
parser.add_argument("--workspace_fqn", required=True, type=str)
args = parser.parse_args()

service = Service(
    name="gradio",
    # Build the image from local source using the Python buildpack
    image=Build(
        build_spec=PythonBuild(
            command="python app.py",
            requirements_path="requirements.txt",
        )
    ),
    # Expose port 8080, the port the Gradio app listens on
    ports=[
        Port(
            port=8080,
            host="<Provide a host value based on your configured domain>"
        )
    ],
    # Memory request and limit for the service
    resources=Resources(memory_limit=1500, memory_request=1000),
)
service.deploy(workspace_fqn=args.workspace_fqn)


To deploy using the Python SDK, run:

python deploy.py --workspace_fqn <YOUR WORKSPACE FQN HERE>

Run the above command from the same directory containing the app.py and requirements.txt files.

Via YAML file

File Structure

.
├── iris_classifier.joblib
├── app.py
├── deploy.yaml
└── requirements.txt

deploy.yaml

name: gradio
type: service
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python app.py
ports:
  - port: 8080
    host: <Provide a host value based on your configured domain>
resources:
  memory_limit: 1500
  memory_request: 1000


With YAML, you can deploy the inference API service using the command below:

servicefoundry deploy --workspace-fqn YOUR_WORKSPACE_FQN --file deploy.yaml

Run the above command from the same directory containing the app.py and requirements.txt files.

Interact with the service

After you run the command given above, you will get a link at the end of the output. The link will take you to your application's dashboard.

Once the build is complete, you will see the endpoint for your service on the dashboard.

Click on the endpoint, and it will open your deployed Gradio service.

Now you can enter your data and get the output.
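
You can also call the deployed service programmatically. Below is a minimal sketch using the gradio_client package that ships with recent Gradio releases; <YOUR_ENDPOINT_URL> is a placeholder for the endpoint shown on the dashboard, and /predict is the default API route for a Gradio Interface:

from gradio_client import Client

# Point the client at the deployed Gradio service
client = Client("<YOUR_ENDPOINT_URL>")

# Pass the four feature values in the same order as the inputs defined in app.py
result = client.predict(5.1, 3.5, 1.4, 0.2, api_name="/predict")
print(result)  # predicted class index, e.g. 0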

Next Steps