Add health checks

πŸ‘

What you'll learn

  • What health checks are
  • How to add liveness and readiness probes to your deployed services

This guide shows how to add health checks to your deployed services.

Health Checks

Health checks let you detect whether a service is healthy. This helps route incoming traffic only to healthy instances and restart or terminate containers that are not healthy. Two types of health checks can currently be configured: the liveness probe and the readiness probe.

Liveness Probe

The liveness probe checks whether the service is currently healthy by making a request to an endpoint of the service. If the endpoint does not respond successfully, the container is terminated and a new one is started in its place. All parameters of the liveness probe can be configured as needed (see the Health Check Configuration section below).
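
Conceptually, the probe behaves like the sketch below. This is only an illustration of the semantics, not the platform's actual implementation; the URL, period, and threshold values are placeholders.

import time

import requests

# Placeholder values; the real ones come from the probe configuration.
LIVEZ_URL = "http://localhost:8000/livez"
PERIOD_SECONDS = 10
FAILURE_THRESHOLD = 3

consecutive_failures = 0
while True:
    try:
        response = requests.get(LIVEZ_URL, timeout=1)
        # Any HTTP status code >= 200 and < 400 counts as a successful probe.
        healthy = 200 <= response.status_code < 400
    except requests.RequestException:
        healthy = False

    consecutive_failures = 0 if healthy else consecutive_failures + 1
    if consecutive_failures >= FAILURE_THRESHOLD:
        print("Liveness probe failed; the container would be restarted here.")
        break

    time.sleep(PERIOD_SECONDS)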

Readiness Probe

The readiness probe checks whether the container is ready to receive traffic. Until the readiness probe succeeds, no incoming traffic is routed to the container. Like the liveness probe, it works by making a request to an endpoint and checking for a successful response (any HTTP status code >= 200 and < 400). This is useful when the service does heavy work at startup, such as loading a model, which can take significant time; during this period we don't want to route any traffic to the container since the model is not loaded yet.
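
As an illustration, a readiness endpoint for such a service could report failure until the model has finished loading. The sketch below is hypothetical; the background-loading thread, the sleep standing in for model loading, and the 503 response are assumptions and not part of the example service built in Step 1.

import threading
import time

from fastapi import FastAPI, Response

app = FastAPI()
model = None  # set once the slow loading step finishes


def load_model():
    global model
    time.sleep(30)    # stand-in for an expensive model-loading step
    model = object()  # stand-in for the loaded model


@app.on_event("startup")
def start_loading():
    # Load in the background so the server can answer probes immediately.
    threading.Thread(target=load_model, daemon=True).start()


@app.get("/readyz")
def readiness(response: Response):
    if model is None:
        # 503 falls outside the 200-399 success range, so the readiness probe
        # fails and no traffic is routed to this container yet.
        response.status_code = 503
        return {"ready": False}
    return {"ready": True}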

Liveness vs Readiness Probe

  • Liveness probes check if an application is still running.
  • Readiness probes check if an application is ready to receive traffic.

Both types of probes are important for ensuring the availability and stability of a machine learning deployment.

Health Check Configuration

Health checks can be configured using the following parameters:

  • HTTP request configuration (a response is considered successful if the HTTP status code is >= 200 and < 400):
    • port: The port to send the HTTP request to.
    • path: The endpoint path to send the request to.
  • initial_delay_seconds: Number of seconds after the container has started before the first probe is initiated. Defaults to 0.
  • period_seconds: How often, in seconds, to execute the probe. Defaults to 10.
  • timeout_seconds: Number of seconds after which the probe times out. Defaults to 1.
  • success_threshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1.
  • failure_threshold: Number of consecutive failures after which the container is considered not alive (liveness probe) or not ready (readiness probe). Defaults to 3.
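
For example, with the defaults above (period_seconds of 10 and failure_threshold of 3), a container that stops responding fails three consecutive probes and is treated as unhealthy roughly 30 seconds after the problem starts; increasing either value makes detection more tolerant but slower.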

Step 1: Implement service code

First, we will create a FastAPI service and add two routes to it:

  • livez
  • readyz

File Structure:

.
└── main.py

main.py

import os

from fastapi import FastAPI

app = FastAPI(root_path=os.getenv("TFY_SERVICE_ROOT_PATH"))


@app.get("/livez")
def liveness():
    return True


@app.get("/readyz")
def readiness():
    return True


@app.get("/")
async def root():
    return {"message": "Hello World"}

Step 2: Adding the Health Checks

Depending on whether you are deploying via the Python SDK or via a YAML configuration file, follow the corresponding steps below:

Via Python SDK

File Structure:

.
β”œβ”€β”€ main.py
└── deploy.py

deploy.py

🚧

The code below takes your workspace FQN as a command-line argument; pass it via --workspace_fqn when running the script.

import argparse
import logging

from servicefoundry import (
    Build,
    PythonBuild,
    Service,
    HttpProbe,
    HealthProbe,
)

logging.basicConfig(level=logging.INFO)

parser = argparse.ArgumentParser()
parser.add_argument("--workspace_fqn", required=True, type=str)
args = parser.parse_args()

image = Build(
       build_spec=PythonBuild(
         command="uvicorn main:app --port 8000 --host 0.0.0.0",
         pip_packages=["fastapi==0.81.0", "uvicorn==0.18.3"],
       )
)

service = Service(
    name="svc-health",
    image=image,
    ports=[{"port": 8000}],
    liveness_probe=HealthProbe(
        config=HttpProbe(path="/livez", port=8000),
        initial_delay_seconds=0,
        period_seconds=10,
        timeout_seconds=1,
        success_threshold=1,
        failure_threshold=3,
    ),
    readiness_probe=HealthProbe(
        config=HttpProbe(path="/readyz", port=8000),
        period_seconds=5,
    ),
)
service.deploy(workspace_fqn=args.workspace_fqn)
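
Note that the readiness probe above only overrides period_seconds; the remaining parameters (initial_delay_seconds, timeout_seconds, success_threshold and failure_threshold) fall back to the defaults listed in the Health Check Configuration section.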


To deploy using the Python SDK, run:

python deploy.py --workspace_fqn <YOUR WORKSPACE FQN HERE>

Run the above command from the directory containing the main.py and deploy.py files.

Via YAML file

File Structure:

.
β”œβ”€β”€ main.py
└── deploy.yaml

deploy.yaml

name: svc-health
type: service
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: uvicorn main:app --port 8000 --host 0.0.0.0
    pip_packages:
      - fastapi==0.81.0
      - uvicorn==0.18.3
ports:
  - port: 8000
liveness_probe:
  config:
    type: http
    path: /livez
    port: 8000
  initial_delay_seconds: 0
  period_seconds: 10
  timeout_seconds: 1
  success_threshold: 1
  failure_threshold: 3
readiness_probe:
  config:
    type: http
    path: /readyz
    port: 8000
  period_seconds: 5


With YAML, you can deploy the service using the command below:

servicefoundry deploy --workspace-fqn YOUR_WORKSPACE_FQN --file deploy.yaml

Run the above command from the directory containing the main.py and deploy.yaml files.

Interact with the service

After you run the command given above, you will get a link at the end of the output. The link will take you to your application's dashboard.

Once the build is complete, you will get the endpoint for your service.

Click on the endpoint, and it will open your deployed service.

You can also go to the Pods tab on the dashboard, and then click on the logs.

There you will be able to see the /livez and /readyz probes being called.

Next Steps