For all deployments, we need to specify the resource constraints for the application so that it can be deployed accordingly on the cluster. The key resources to be specified are:


CPU

CPU represents compute processing power and is specified as a number. 1 CPU unit is equivalent to 1 physical CPU core, or 1 virtual core, depending on whether the node is a physical host or a virtual machine running inside a physical machine. You can also specify fractional CPU values like 0.2 or even 0.02.

For CPU, we need to specify the cpu_request and cpu_limit parameters.

cpu_request specifies the amount of CPU that will always be reserved for the application. This also means that the minimum cost you incur for this application is the cost of cpu_request number of CPUs. If the application's CPU usage always stays below cpu_request, you can assume it is running in a healthy way.

cpu_limit specifies the upper limit on the application's CPU usage, beyond which the application will be throttled and not allowed any more CPU. This safeguards the other applications running on the same node, since one misbehaving application cannot interfere and reduce the resources available to them.

cpu_limit must always be greater than or equal to cpu_request.
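As a quick illustration of this constraint, a small validation helper (hypothetical, not part of the servicefoundry library) could check it up front:

```python
def validate_cpu(cpu_request: float, cpu_limit: float) -> None:
    # Enforce the rule: cpu_limit must be >= cpu_request.
    if cpu_limit < cpu_request:
        raise ValueError(
            f"cpu_limit ({cpu_limit}) must be greater than or equal "
            f"to cpu_request ({cpu_request})"
        )

validate_cpu(0.5, 0.9)  # valid: limit above request, no error raised
```

Passing a limit below the request, e.g. `validate_cpu(0.5, 0.2)`, raises a `ValueError`.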

How do you set cpu_request and cpu_limit?

If your application takes, say, 0.5 CPU in steady state and goes up to 0.8 CPU during peak times, then the request should be 0.5 and the limit can be 0.9 (just to be safe). In general, cpu_request should be around the steady-state usage, and cpu_limit should account for the peak usage.
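Expressed as a resources block, the sizing described above would look like this (the 0.5/0.9 numbers are just the illustrative values from this section):

```yaml
resources:
  cpu_request: 0.5 # around steady-state usage
  cpu_limit: 0.9   # above the observed 0.8 CPU peak, for headroom
```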


Memory

Memory is defined as an integer and the unit is Megabytes. So a value of 1 means 1 MB of memory and 1000 means 1 GB of memory.

Memory also has two fields: memory_request and memory_limit. memory_request defines the minimum amount of memory needed to run the application. If you think your app requires at least 256 MB of memory to operate, that is the request value.

memory_limit defines the maximum amount of memory that the application can use. If the application tries to use more memory, it will be killed and an OOM (Out Of Memory) error will show up on the pods.

If the memory usage of your application increases during peaks or because of other events, it's advisable to set the memory limit around the peak memory usage.

Keeping the memory limit below the usual memory usage will result in the pods being OOM-killed.
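Putting the two fields together, a resources block for the 256 MB example above might look like this (the 512 MB limit is an illustrative peak value):

```yaml
resources:
  memory_request: 256 # minimum needed to operate (MB)
  memory_limit: 512   # set around peak usage; exceeding it gets the pod OOM-killed
```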


Storage

Storage is defined as an integer and the unit is Megabytes. A value of 1 means 1 MB of disk space and 1000 means 1 GB of disk space.

Storage has two fields: ephemeral_storage_request and ephemeral_storage_limit. ephemeral_storage_request defines the minimum amount of disk space your application needs to run. If your application requires 1 GB of disk space, then that is the request value.

ephemeral_storage_limit defines the maximum amount of disk space that the application is allowed to use. Going beyond this limit will result in the application being killed and the pods being evicted.

The disk space allocated to the application is completely ephemeral and is intended to be used purely as scratch space.
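For the 1 GB example above, the corresponding resources block could look like this (the 2000 MB limit is an illustrative cap, not a recommended value):

```yaml
resources:
  ephemeral_storage_request: 1000 # 1 GB of scratch space needed to run
  ephemeral_storage_limit: 2000   # illustrative cap; exceeding it evicts the pod
```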

GPU (Coming soon)

Setting resources for truefoundry applications

`Service` and `Job` both have a `resource` argument where you can either pass an instance of the `Resources` class or a `dict`. The example below reconstructs the service definition using the values from the YAML spec that follows; the `image` portion (building from a local Dockerfile) is a plausible completion, not prescriptive.

```python
import logging

from servicefoundry import Build, Service, DockerFileBuild, Resources

logging.basicConfig(level=logging.INFO)

service = Service(
    name="service",
    image=Build(build_spec=DockerFileBuild()),  # build the image from a local Dockerfile
    ports=[{"port": 8501}],
    resource=Resources(  # You can use this argument in `Job` too.
        cpu_request=0.2,
        cpu_limit=0.5,
        memory_request=128,
        memory_limit=512,
    ),
)
```
You can define the resource fields as key-value pairs under the `resources` field.

```yaml
name: service
components:
  - name: service
    type: service
    image:
      type: build
      build_source:
        type: local
      build_spec:
        type: dockerfile
    ports:
      - port: 8501
    resources: # You can use this block in `job` too.
      cpu_request: 0.2
      cpu_limit: 0.5
      memory_request: 128
      memory_limit: 512
```

We set the following defaults if you do not configure any resources field.

| Field | Default value |
| --- | --- |