Liveness/Readiness Probe

Truefoundry helps you deploy on Kubernetes, and one of the key advantages is that Kubernetes will automatically restart your process if something goes wrong. It can also stop routing requests to a replica of a service that is not yet ready or healthy enough to serve traffic. This works by telling Kubernetes to periodically call certain API endpoints on your service to detect whether the service is ready and healthy. There are two types of such health checks:

  • Liveness probe: Checks whether the service's replica is currently healthy. If the service is not healthy, the replica will be terminated and another one will be started.
  • Readiness probe: Checks whether the service's replica is ready to receive traffic. Until the readiness probe succeeds, no incoming traffic will be routed to this replica.

The main difference between liveness and readiness probes is that liveness probes check if the service is running, while readiness probes check if the service is ready to serve requests. Readiness probes are also used to prevent traffic from being routed to replicas that are still starting up or that are in the process of being restarted.

For example, a machine learning service may take some time to load its model before it is ready to serve requests. During this time, the service would still be considered to be live (i.e., the liveness probe would return a success response), but it would not be ready to serve requests (i.e., the readiness probe would not return a success response). This would prevent traffic from being routed to that replica until the model has been loaded and initialized.

How to add Liveness and Readiness Probes to your deployed services

To add liveness and readiness probes to your deployed service, you will need to:

  • Create two endpoints in your service, one for the liveness probe and one for the readiness probe. The endpoints should return a successful (2xx) response if the service is healthy and a non-2xx response if it is not.
  • Configure your deployment to use the health check endpoints. You can do this via the user interface (UI) or via the Python SDK.

Example FastAPI health check endpoints

The following example shows two simple FastAPI health check endpoints, /livez and /readyz, that the health check probes will call:

import os
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.responses import JSONResponse

# Flag to indicate whether the model is loaded and ready
model_loaded = False


@asynccontextmanager
async def lifespan(app: FastAPI):
  # Load the model and run any other initialization here,
  # then flip the flag so the readiness probe starts succeeding
  global model_loaded
  model_loaded = True
  yield


app = FastAPI(lifespan=lifespan, root_path=os.getenv("TFY_SERVICE_ROOT_PATH", ""))


@app.get("/livez")
def liveness():
  # The process is up and serving requests, so return a 200 response
  return True


@app.get("/readyz")
def readiness():
  # Ready only once the model is loaded and initialization is complete
  if model_loaded:
    return True
  # A non-2xx status tells the readiness probe this replica cannot take traffic yet
  return JSONResponse(content=False, status_code=503)


@app.get("/")
async def root():
  return {"message": "Hello World"}

Configuring Health Check for your Service

Here are the parameters for liveness and readiness probes:

  • path: The path to the health check endpoint.
  • port: The port on which the health check endpoint is listening.
  • initial_delay_seconds: Number of seconds to wait after the container has started before checking the endpoint.
  • period_seconds: How often (in seconds) to check the endpoint.
  • timeout_seconds: Number of seconds to wait for a response from the endpoint before considering the check to have failed.
  • success_threshold: Number of consecutive successful checks required before the container is considered healthy.
  • failure_threshold: Number of consecutive failed checks allowed before the container is considered unhealthy.

For example, let's say we have the following configuration:

  • path: /health
  • port: 8080
  • initial_delay_seconds: 10
  • period_seconds: 5
  • timeout_seconds: 2
  • success_threshold: 3
  • failure_threshold: 2

With this configuration, the probe waits 10 seconds after the container starts and then checks the /health endpoint on port 8080 every 5 seconds. Each check waits up to 2 seconds for a response before counting as a failure. The container is marked healthy after 3 consecutive successful checks and marked unhealthy after 2 consecutive failed checks.
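
For reference, the same example configuration expressed with the Python SDK's HealthProbe and HttpProbe classes (covered in the SDK section below) would look roughly like this sketch; the /health path and port 8080 are just the illustrative values from the table above:

from truefoundry.deploy import HealthProbe, HttpProbe

# Probe definition matching the example configuration above
probe = HealthProbe(
    config=HttpProbe(path="/health", port=8080),
    initial_delay_seconds=10,
    period_seconds=5,
    timeout_seconds=2,
    success_threshold=3,
    failure_threshold=2,
)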

Via the User Interface (UI)

Via the Python SDK

In your Service deployment code deploy.py, include the following:

from truefoundry.deploy import Build, Service, DockerFileBuild, Port, HealthProbe, HttpProbe

service = Service(
    name="my-service",
    image=Build(build_spec=DockerFileBuild()),
    ports=[
      Port(
        host="your_host",
        port=8501
      )
    ],
    liveness_probe=HealthProbe(
        config=HttpProbe(path="/livez", port=8000),  # Replace /livez with your application-specific path
        initial_delay_seconds=0,
        period_seconds=10,
        timeout_seconds=1,
        success_threshold=1,
        failure_threshold=3,
    ),
    readiness_probe=HealthProbe(
        config=HttpProbe(path="/readyz", port=8000),  # Replace /readyz with your application-specific path
        period_seconds=5,
    ),
)
service.deploy(workspace_fqn="YOUR_WORKSPACE_FQN")

Viewing health check logs

You can go to the Pods tab on the dashboard, and then click on the logs.

You will be able to see the logs of the replica, including the logs of the health check probes.

Log lines showing GET /livez and GET /readyz returning 200 OK indicate that the health checks are passing and the service is healthy.