Add health checks
What you'll learn
- What are Health Checks
- How to add liveness and readiness probes to your deployed services
This guide shows how to add health checks to your deployed services.
Health Checks
Health checks let you detect whether a service instance is healthy. This helps route incoming traffic only to healthy instances and restart or terminate the containers that are not healthy. We can currently configure two types of health checks - the liveness probe and the readiness probe.
Liveness Probe
The liveness probe checks whether the service is currently healthy by making a request to an endpoint of the service. If the service is not healthy, the container is terminated and a new one is started in its place. All parameters of the liveness probe can be configured to suit your needs, as described in the Health Check Configuration section below.
Readiness Probe
The readiness probe checks whether the container is ready to receive traffic. Until the readiness probe succeeds, no incoming traffic is routed to the container. Like the liveness probe, this is done by making a request to an endpoint and checking whether it returns a successful response (any HTTP status code >= 200 and < 400). This is especially useful when the service does heavy work at startup, such as loading a model, which can take significant time - during this period we don't want to route any traffic to the container, since the model is not loaded yet.
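For example, a service that loads a model at startup could implement its readiness endpoint along the lines of the sketch below. This is only an illustration: the model-loading function here is a placeholder that simply sleeps, standing in for your own loading code.

import threading
import time

from fastapi import FastAPI, HTTPException

app = FastAPI()
model = None


def load_heavy_model():
    # Stand-in for your own model-loading code, which may take a while.
    time.sleep(30)
    return object()


def _load_in_background():
    global model
    model = load_heavy_model()


@app.on_event("startup")
def start_loading():
    # Load the model in a background thread so the server can answer probe
    # requests while loading is still in progress.
    threading.Thread(target=_load_in_background, daemon=True).start()


@app.get("/readyz")
def readiness():
    if model is None:
        # A 503 makes the readiness probe fail, so no traffic is routed to
        # this container until the model has finished loading.
        raise HTTPException(status_code=503, detail="Model not loaded yet")
    return True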
Liveness vs Readiness Probe
- Liveness probes check if an application is still running.
- Readiness probes check if an application is ready to receive traffic.
Both types of probes are important for ensuring the availability and stability of machine learning deployments.
Health Check Configuration
Health checks can be configured using the following parameters (see the sketch after this list):
- HTTP request configuration: the probe sends an HTTP request, and the response is considered successful if the status code is >= 200 and < 400.
  - port: The port to send the HTTP request to.
  - path: The endpoint path to send the request to.
- initial_delay_seconds: Number of seconds after the container has started before the first probe is initiated. Defaults to 0.
- period_seconds: How often, in seconds, to execute the probe. Defaults to 10.
- timeout_seconds: Number of seconds after which the probe times out. Defaults to 1.
- success_threshold: Minimum number of consecutive successes for the probe to be considered successful after having failed. Defaults to 1.
- failure_threshold: Number of consecutive failures after which the container is considered not alive (liveness probe) or not ready (readiness probe). Defaults to 3.
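For illustration, here is how these parameters map onto a probe definition with the servicefoundry Python SDK used later in this guide. This is just a sketch using the default values listed above; the full deployment example follows in Step 2.

from servicefoundry import HealthProbe, HttpProbe

# HTTP request configuration: probe GET <container>:8000/livez and treat any
# status code >= 200 and < 400 as success.
liveness_probe = HealthProbe(
    config=HttpProbe(path="/livez", port=8000),
    initial_delay_seconds=0,  # wait 0 seconds before the first probe
    period_seconds=10,        # probe every 10 seconds
    timeout_seconds=1,        # a probe taking longer than 1 second counts as a failure
    success_threshold=1,      # one success marks the probe as passing again
    failure_threshold=3,      # three consecutive failures mark the container unhealthy
)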
Step 1: Implement service code
First, we will create a FastAPI service and add two routes to it:
- /livez
- /readyz
File Structure:
.
└── main.py

main.py
import os

from fastapi import FastAPI

app = FastAPI(root_path=os.getenv("TFY_SERVICE_ROOT_PATH"))

# Endpoint for the liveness probe
@app.get("/livez")
def liveness():
    return True

# Endpoint for the readiness probe
@app.get("/readyz")
def readiness():
    return True

@app.get("/")
async def root():
    return {"message": "Hello World"}
Step 2: Adding the Health Checks
Depending on whether you are deploying via the Python SDK or via a YAML configuration file, follow the corresponding section below:
Via Python SDK
File Structure:
.
├── main.py
└── deploy.py

deploy.py
In the code below, the Workspace FQN is passed to the script as a command-line argument (--workspace_fqn), so you don't need to hardcode it.
import argparse
import logging

from servicefoundry import (
    Build,
    PythonBuild,
    Service,
    HttpProbe,
    HealthProbe,
)

logging.basicConfig(level=logging.INFO)

# The Workspace FQN is passed on the command line
parser = argparse.ArgumentParser()
parser.add_argument("--workspace_fqn", required=True, type=str)
args = parser.parse_args()

# Build the image from the local source code using the Python buildpack
image = Build(
    build_spec=PythonBuild(
        command="uvicorn main:app --port 8000 --host 0.0.0.0",
        pip_packages=["fastapi==0.81.0", "uvicorn==0.18.3"],
    )
)

service = Service(
    name="svc-health",
    image=image,
    ports=[{"port": 8000}],
    # Liveness probe: hit /livez every 10 seconds; restart after 3 consecutive failures
    liveness_probe=HealthProbe(
        config=HttpProbe(path="/livez", port=8000),
        initial_delay_seconds=0,
        period_seconds=10,
        timeout_seconds=1,
        success_threshold=1,
        failure_threshold=3,
    ),
    # Readiness probe: hit /readyz every 5 seconds before routing traffic
    readiness_probe=HealthProbe(
        config=HttpProbe(path="/readyz", port=8000),
        period_seconds=5,
    ),
)
service.deploy(workspace_fqn=args.workspace_fqn)
To deploy using the Python SDK, run:
python deploy.py --workspace_fqn <YOUR WORKSPACE FQN HERE>
Run the above command from the same directory containing the main.py and deploy.py files.
Via YAML file
File Structure:
.
├── main.py
└── deploy.yaml

deploy.yaml
name: svc-health
type: service
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: uvicorn main:app --port 8000 --host 0.0.0.0
    pip_packages:
      - fastapi==0.81.0
      - uvicorn==0.18.3
ports:
  - port: 8000
liveness_probe:
  config:
    type: http
    path: /livez
    port: 8000
  initial_delay_seconds: 0
  period_seconds: 10
  timeout_seconds: 1
  success_threshold: 1
  failure_threshold: 3
readiness_probe:
  config:
    type: http
    path: /readyz
    port: 8000
  period_seconds: 5
With YAML, you can deploy the service using the command below:
servicefoundry deploy --workspace-fqn YOUR_WORKSPACE_FQN --file deploy.yaml
Run the above command from the same directory containing the main.py and deploy.yaml files.
Interact with the service
After you run the command given above, you will get a link at the end of the output. The link will take you to your application's dashboard.
Once the build is complete, you should see the endpoint for your service.
Click on the endpoint to open your deployed service.
You can also go to the Pods tab on the dashboard and then click on the logs.
There you will be able to see the /livez and /readyz probes being called.
Next Steps