Resources
For all deployments, we need to specify the resource constraints for the application so that it can be deployed accordingly on the cluster. The key resources to be specified are:
CPU
CPU represents compute processing and is specified as a number. 1 CPU unit is equivalent to 1 physical CPU core or 1 virtual core, depending on whether the node is a physical host or a virtual machine running inside a physical machine.
We can also specify fractional CPU values like 0.2 or even 0.02.
For CPU, we need to specify the cpu_request and cpu_limit parameters.
cpu_request specifies the amount of CPU that will always be reserved for the application. This also means that the minimum cost you incur for this application is the cost of cpu_request CPUs. If the application's CPU usage always stays below cpu_request, you can expect it to run in a healthy way.
cpu_limit specifies the upper limit on the CPU usage of the application. Beyond this limit, the application is throttled and not allowed any more CPU. This safeguards other applications running on the same node, since one misbehaving application cannot reduce the resources available to the others.
cpu_limit must always be greater than or equal to cpu_request.
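The rule that a limit must be at least as large as its request can be checked before deploying. The following is a minimal sketch in plain Python; `check_request_limit` is a hypothetical helper written for illustration and is not part of the servicefoundry library.

```python
def check_request_limit(request, limit, name="cpu"):
    """Raise ValueError if a request/limit pair breaks request <= limit.

    Hypothetical helper for illustration; not a servicefoundry API.
    """
    if request <= 0:
        raise ValueError(f"{name}_request must be positive, got {request}")
    if limit < request:
        raise ValueError(
            f"{name}_limit ({limit}) must be >= {name}_request ({request})"
        )


# A valid pair passes silently.
check_request_limit(0.2, 0.5)

# An invalid pair is rejected.
try:
    check_request_limit(0.5, 0.2)
except ValueError as err:
    print(err)
```

The same check applies unchanged to the memory and ephemeral storage pairs by passing a different `name`.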
Google Cloud Autopilot GKE
For Google Cloud Autopilot GKE, cpu_request and cpu_limit must be equal; otherwise cpu_limit is ignored and automatically set equal to cpu_request. Check this page for more information.
How do you set cpu_request and cpu_limit?
If your application uses, say, 0.5 CPU in steady state and goes up to 0.8 CPU during peak times, then the request should be 0.5 and the limit can be 0.9 (just to be safe). In general, cpu_request should be around the steady-state usage, and cpu_limit should account for the peak usage.
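As a rough illustration of this rule of thumb, the sketch below derives a request and a limit from a list of observed CPU usage samples. The function name `suggest_cpu`, the use of the median as the steady-state estimate, and the fixed headroom of 0.1 CPU are all assumptions made for this example, not something the platform prescribes.

```python
from statistics import median


def suggest_cpu(samples, headroom=0.1):
    """Suggest (cpu_request, cpu_limit) from observed CPU usage samples.

    Assumptions for illustration: the median approximates steady-state
    usage, and the limit is the observed peak plus a fixed headroom.
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    steady = median(samples)
    peak = max(samples)
    return round(steady, 2), round(peak + headroom, 2)


# Mostly 0.5 CPU, with one peak at 0.8 CPU.
usage = [0.5, 0.5, 0.8, 0.5, 0.5]
print(suggest_cpu(usage))  # (0.5, 0.9)
```

Because the peak is never below the median and the headroom is non-negative, the suggested limit is always at least the suggested request.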
Memory
Memory is defined as an integer and the unit is megabytes (MB). So a value of 1 means 1 MB of memory and 1000 means 1 GB of memory.
Memory also has two fields: memory_request and memory_limit. memory_request defines the minimum amount of memory needed to run the application. If you think that your app requires at least 256 MB of memory to operate, this is the request value.
memory_limit defines the maximum amount of memory that the application can use. If the application tries to use more memory, it will be killed and an OOM (out of memory) error will show up on the pods.
If the memory usage of your application increases during peak or because of some other events, it is advisable to keep the memory limit around the peak memory usage.
Keeping memory limit below the usual memory usage will result in OOM killing of the pods.
memory_limit must always be greater than or equal to memory_request.
Google Cloud Autopilot GKE
For Google Cloud Autopilot GKE, memory_request and memory_limit must be equal; otherwise memory_limit is ignored and automatically set equal to memory_request. Check this page for more information.
Storage
Storage is defined as an integer and the unit is Megabytes. A value of 1 means 1 MB of disk space and 1000 means 1GB of disk space.
Storage has two fields: ephemeral_storage_request and ephemeral_storage_limit. ephemeral_storage_request defines the minimum amount of disk space your application needs to run. If your application requires 1 GB of disk space, then that is the request value.
ephemeral_storage_limit defines the maximum amount of disk space that the application is allowed to use. Going beyond this limit results in the application being killed and its pods being evicted.
The disk space allocated to the application is completely ephemeral and is intended to be used purely as temporary space.
ephemeral_storage_limit must always be greater than or equal to ephemeral_storage_request.
Google Cloud Autopilot GKE
For Google Cloud Autopilot GKE, ephemeral_storage_request and ephemeral_storage_limit must be equal; otherwise ephemeral_storage_limit is ignored and automatically set equal to ephemeral_storage_request. Check this page for more information.
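To make the Autopilot notes in this section concrete, the sketch below mimics the described behaviour on a plain dictionary: wherever a request is set, the matching limit is replaced by the request value. The function `autopilot_effective` is a hypothetical illustration of the rule as stated here and is not code from GKE or servicefoundry.

```python
def autopilot_effective(resources):
    """Return the resources as they would effectively apply on Autopilot.

    Illustration only: every *_limit is set equal to its *_request,
    mirroring the behaviour described in the notes above.
    """
    effective = dict(resources)  # copy, so the input is left untouched
    for prefix in ("cpu", "memory", "ephemeral_storage"):
        request_key = f"{prefix}_request"
        limit_key = f"{prefix}_limit"
        if request_key in effective:
            effective[limit_key] = effective[request_key]
    return effective


configured = {
    "cpu_request": 0.2,
    "cpu_limit": 0.5,
    "memory_request": 128,
    "memory_limit": 512,
}
print(autopilot_effective(configured))
```

In practice this means that on Autopilot the request values are the ones to size carefully, since a higher limit gives no extra burst capacity.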
GPU
See GPUs page for more information
Setting resources for Truefoundry applications
# `Service` and `Job` both have a `resources` argument, to which you can pass either an instance of the `Resources` class or a `dict`.
import logging

from servicefoundry import Build, Service, DockerFileBuild, Resources

logging.basicConfig(level=logging.INFO)

service = Service(
    name="service",
    image=Build(build_spec=DockerFileBuild()),
    ports=[{"port": 8501}],
    resources=Resources(  # You can use this argument in `Job` too.
        cpu_request=0.2,
        cpu_limit=0.5,
        memory_request=128,
        memory_limit=512,
    ),
)
service.deploy(workspace_fqn="YOUR_WORKSPACE_FQN")
# You can define the resource fields as key-value pairs under the `resources` field.
name: service
components:
  - name: service
    type: service
    image:
      type: build
      build_source:
        type: local
      build_spec:
        type: dockerfile
    ports:
      - port: 8501
    resources:  # You can use this block in `job` too.
      cpu_request: 0.2
      cpu_limit: 0.5
      memory_request: 128
      memory_limit: 512
We set the following defaults if you do not configure any resources field.
Field | Default value | Unit |
---|---|---|
cpu_request | 0.2 | - |
cpu_limit | 0.5 | - |
memory_request | 200 | MB |
memory_limit | 500 | MB |
ephemeral_storage_request | 1000 | MB |
ephemeral_storage_limit | 2000 | MB |
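The sketch below shows one way to picture how these defaults could combine with the fields you do set. It assumes a per-field fallback, where any field you leave out takes its default; the text above does not spell out the exact merging behaviour, so treat `DEFAULT_RESOURCES` and `with_defaults` as hypothetical names for illustration rather than the platform's implementation.

```python
# Default values copied from the table above (memory and storage in MB).
DEFAULT_RESOURCES = {
    "cpu_request": 0.2,
    "cpu_limit": 0.5,
    "memory_request": 200,
    "memory_limit": 500,
    "ephemeral_storage_request": 1000,
    "ephemeral_storage_limit": 2000,
}


def with_defaults(overrides=None):
    """Merge user-set fields over the defaults (assumed per-field fallback)."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_RESOURCES)
    if unknown:
        raise ValueError(f"unknown resource fields: {sorted(unknown)}")
    return {**DEFAULT_RESOURCES, **overrides}


# Only memory is configured; CPU and storage fall back to the defaults.
print(with_defaults({"memory_request": 128, "memory_limit": 512}))
```

Under this assumption, overriding only a request can leave it above the default limit (for example memory_request=600 against the default memory_limit of 500), so it is safer to set request and limit together.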