Deployments - Additional Configuration

📘

Note

When using the ServiceFoundry Python SDK, type is not a required field in any of the imported classes.
For deployments, the modules below can be used to add the corresponding functionality.

Resources

Description

Describes the resource constraints for the application so that it can be deployed accordingly on the cluster

Schema

{
  "cpu_request": 0.2,
  "cpu_limit": 0.5,
  "memory_request": 200,
  "memory_limit": 500,
  "ephemeral_storage_request": 1000,
  "ephemeral_storage_limit": 2000,
  "instance_family": [
    "string"
  ]
}

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| cpu_request | number | true | Requested CPU, which determines the minimum cost incurred. CPU usage can exceed the requested amount, but not the value specified in the limit. 1 CPU means 1 CPU core. Fractional CPU can be requested, such as 0.5 or 0.05. |
| cpu_limit | number | true | CPU limit that usage cannot exceed. 1 CPU means 1 CPU core. Fractional CPU can be requested, such as 0.5. cpu_limit should be >= cpu_request. |
| memory_request | number | true | Requested memory, which determines the minimum cost incurred. The unit is megabytes (MB), so 1 means 1 MB and 2000 means 2 GB. |
| memory_limit | number | true | Memory limit beyond which the application is killed with an OOM error. The unit is megabytes (MB), so 1 means 1 MB and 2000 means 2 GB. memory_limit should be greater than memory_request. |
| ephemeral_storage_request | number | true | Requested disk storage. The unit is megabytes (MB). This is ephemeral storage and is wiped on pod restart or eviction. |
| ephemeral_storage_limit | number | true | Disk storage limit. The unit is megabytes (MB). Exceeding this limit results in eviction. It should be greater than the request. This is ephemeral storage and is wiped on pod restart or eviction. |
| instance_family | [string] | false | Instance family of the underlying machine to use. Multiple instance families can be supplied. The workload is guaranteed to be scheduled on one of them. |
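The request/limit rules in the table can be checked before submitting a deployment. The following is a minimal sketch in plain Python, not part of the SDK: ResourceSpec and check_resources are hypothetical names introduced here for illustration, and the check treats a limit equal to its request as acceptable, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceSpec:
    """Hypothetical stand-in mirroring the Resources schema (not the SDK class)."""
    cpu_request: float
    cpu_limit: float
    memory_request: int             # megabytes
    memory_limit: int               # megabytes
    ephemeral_storage_request: int  # megabytes
    ephemeral_storage_limit: int    # megabytes
    instance_family: List[str] = field(default_factory=list)


def check_resources(spec: ResourceSpec) -> List[str]:
    """Return a list of problems; an empty list means the spec looks consistent."""
    problems = []
    pairs = [
        ("cpu", spec.cpu_request, spec.cpu_limit),
        ("memory", spec.memory_request, spec.memory_limit),
        ("ephemeral_storage", spec.ephemeral_storage_request,
         spec.ephemeral_storage_limit),
    ]
    for name, request, limit in pairs:
        if request <= 0:
            problems.append(f"{name}_request must be positive")
        if limit < request:
            problems.append(
                f"{name}_limit ({limit}) is below {name}_request ({request})"
            )
    return problems


# Values from the schema example above: consistent
ok = ResourceSpec(0.2, 0.5, 200, 500, 1000, 2000)
# cpu_limit lower than cpu_request: inconsistent
bad = ResourceSpec(1, 0.5, 200, 500, 1000, 2000)

ok_problems = check_resources(ok)
bad_problems = check_resources(bad)
print(ok_problems)
print(bad_problems)
```

Running a check like this locally surfaces an inverted request/limit pair before the platform rejects or misbehaves on the deployment.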

Python Examples

from servicefoundry import Service, Resources

service = Service(  # or Job or ModelDeployment
    ...
    resources=Resources(
        cpu_request=1,  # in CPU cores
        memory_request=1000,  # in megabytes
        ephemeral_storage_request=1000,  # in megabytes
        cpu_limit=4,
        memory_limit=4000,
        ephemeral_storage_limit=10000,
        instance_family=["c6i", "t3", "m4"],
    )
)

FileMount

Description

Describes a set of files to be mounted into the application under a given directory.

Schema

{
  "mount_dir": "string",
  "data": {
    "property1": "string",
    "property2": "string"
  }
}

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| mount_dir | string | true | Dir at which data is to be mounted |
| data | object | true | Data to be mounted, the key will be the filename, and the value will be the file content. Files will be mounted under mount_dir |
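To make the key/value semantics of data concrete, here is a small sketch that writes a FileMount-shaped dictionary to a temporary directory, one file per key. It only illustrates the mapping described above; materialize_file_mount is a hypothetical helper, the sample paths and contents are made up, and this is not how the platform itself performs the mount.

```python
import tempfile
from pathlib import Path


def materialize_file_mount(file_mount: dict, root: Path) -> list:
    """Write each entry of file_mount["data"] as a file under root/mount_dir."""
    # Strip the leading "/" so the mount directory is created inside `root`
    target = root / file_mount["mount_dir"].lstrip("/")
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, content in file_mount["data"].items():
        path = target / filename  # key is the filename
        path.write_text(content)  # value is the file content
        written.append(path)
    return written


# Illustrative values only
file_mount = {
    "mount_dir": "/etc/config",
    "data": {"settings.json": '{"debug": false}', "motd.txt": "hello"},
}

with tempfile.TemporaryDirectory() as tmp:
    paths = materialize_file_mount(file_mount, Path(tmp))
    names = sorted(p.name for p in paths)
    contents = (Path(tmp) / "etc/config/motd.txt").read_text()

print(names)
print(contents)
```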

Autoscaling

Description

Describes how the number of replicas is scaled between min_replicas and max_replicas according to a metric.

Schema

{
  "min_replicas": 1,
  "max_replicas": 1,
  "metrics": {},
  "polling_interval": 30,
  "cooldown_period": 300
}

Properties

| Name | Type | Required | Restrictions | Description |
| --- | --- | --- | --- | --- |
| min_replicas | integer | true | none | Minimum number of replicas to keep available. |
| max_replicas | integer | true | none | Maximum number of replicas allowed for the component. |
| metrics | CPUUtilizationMetric, RPSMetric, or CronMetric | true | none | Metric used for autoscaling; one of the metric types described below. |
| polling_interval | integer | true | none | Interval at which each trigger is checked. |
| cooldown_period | integer | true | none | The period to wait after the last trigger reported active before scaling the resource back to 0. |
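A configuration matching this schema can be assembled and sanity-checked in plain Python before it is handed to the SDK. The sketch below is illustrative only: make_autoscaling is a hypothetical helper, the defaults of 30 and 300 are simply the values shown in the schema example, and the rule that max_replicas must not be below min_replicas is an assumption rather than a documented restriction.

```python
ALLOWED_METRIC_TYPES = {"cpu_utilization", "rps", "cron"}


def make_autoscaling(min_replicas, max_replicas, metrics,
                     polling_interval=30, cooldown_period=300):
    """Build an Autoscaling-shaped dict after basic consistency checks."""
    if min_replicas < 0:
        raise ValueError("min_replicas must not be negative")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    if metrics.get("type") not in ALLOWED_METRIC_TYPES:
        raise ValueError(f"unknown metric type: {metrics.get('type')!r}")
    return {
        "min_replicas": min_replicas,
        "max_replicas": max_replicas,
        "metrics": metrics,
        "polling_interval": polling_interval,
        "cooldown_period": cooldown_period,
    }


# Scale between 1 and 5 replicas on CPU utilization
config = make_autoscaling(1, 5, {"type": "cpu_utilization", "value": 60})
print(config)
```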

CPUUtilizationMetric

Schema

{
  "type": "cpu_utilization",
  "value": 0
}

Properties

| Name | Type | Required | Restrictions | Description |
| --- | --- | --- | --- | --- |
| type | string | true | none | none |
| value | integer | true | none | Percentage of cpu request averaged over all replicas which the autoscaler should try to maintain |
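To get a feel for what "try to maintain" means for a utilization target, the sketch below applies the proportional rule used by the Kubernetes Horizontal Pod Autoscaler and clamps the result to the replica bounds. Whether this platform uses exactly this formula is an assumption; desired_replicas_for_cpu is a hypothetical function written for illustration.

```python
import math


def desired_replicas_for_cpu(current_replicas, current_utilization,
                             target_utilization, min_replicas, max_replicas):
    """Proportional scaling: more replicas when utilization is above target."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Keep the result within the configured replica bounds
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% of their CPU request, target value of 60
scaled_up = desired_replicas_for_cpu(4, 90, 60, min_replicas=1, max_replicas=10)
# Same load, but max_replicas caps the result
capped = desired_replicas_for_cpu(4, 90, 60, min_replicas=1, max_replicas=5)
# Light load: 2 replicas at 10%, floor set by min_replicas
scaled_down = desired_replicas_for_cpu(2, 10, 60, min_replicas=1, max_replicas=10)

print(scaled_up, capped, scaled_down)
```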

RPSMetric

Schema

{
  "type": "rps",
  "value": 1
}

Properties

| Name | Type | Required | Restrictions | Description |
| --- | --- | --- | --- | --- |
| type | string | true | none | none |
| value | integer | true | none | Requests per second, averaged over all replicas, that the autoscaler should try to maintain. |
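For a requests-per-second target, a rough way to estimate the replica count is to divide the total incoming rate by the per-replica target. The sketch below does that under the assumption that the autoscaler behaves this way; desired_replicas_for_rps and the sample rates are hypothetical.

```python
import math


def desired_replicas_for_rps(per_replica_rps, target_rps,
                             min_replicas, max_replicas):
    """Estimate replicas needed so the average rate per replica meets the target."""
    total_rps = sum(per_replica_rps)
    raw = math.ceil(total_rps / target_rps)
    # Keep the result within the configured replica bounds
    return max(min_replicas, min(max_replicas, raw))


# Three replicas currently serving 120, 80 and 100 requests per second,
# with a target value of 50 requests per second per replica
needed = desired_replicas_for_rps([120, 80, 100], 50,
                                  min_replicas=1, max_replicas=10)
print(needed)
```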

CronMetric

Schema

{
  "type": "cron",
  "desired_replicas": 1,
  "start": "string",
  "end": "string",
  "timezone": "UTC"
}

Properties

| Name | Type | Required | Restrictions | Description |
| --- | --- | --- | --- | --- |
| type | string | true | none | none |
| desired_replicas | integer | false | none | Desired number of replicas during the given interval. Default value is max_replicas. |
| start | string | true | none | Cron expression indicating the start of the cron schedule. |
| end | string | true | none | Cron expression indicating the end of the cron schedule. |
| timezone | string | true | none | Timezone against which the cron schedule is calculated, e.g. "Asia/Tokyo". Default is the machine's local time. Supported timezones are listed at https://docs.truefoundry.com/docs/list-of-supported-timezones |
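The timezone field changes when a schedule actually fires. The sketch below shows a CronMetric-shaped dictionary and converts its intended local window to UTC. The cron strings assume standard five-field cron syntax, which this page does not confirm, and a fixed UTC+9 offset stands in for "Asia/Tokyo" so the example needs no timezone database; all values are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Illustrative CronMetric: 3 replicas on weekdays, 09:00 to 18:00 Tokyo time
cron_metric = {
    "type": "cron",
    "desired_replicas": 3,
    "start": "0 9 * * 1-5",   # assumed five-field cron: 09:00, Monday to Friday
    "end": "0 18 * * 1-5",    # assumed five-field cron: 18:00, Monday to Friday
    "timezone": "Asia/Tokyo",
}

# Fixed offset standing in for Asia/Tokyo (UTC+9, no daylight saving)
JST = timezone(timedelta(hours=9), name="JST")

# One concrete occurrence of the window: Monday 8 January 2024
start_local = datetime(2024, 1, 8, 9, 0, tzinfo=JST)
end_local = datetime(2024, 1, 8, 18, 0, tzinfo=JST)

# The same instants expressed in UTC
start_utc = start_local.astimezone(timezone.utc)
end_utc = end_local.astimezone(timezone.utc)

print(start_utc.isoformat())
print(end_utc.isoformat())
```

The same cron strings evaluated against UTC instead of Tokyo time would open the window nine hours later, which is why the timezone should be set explicitly.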