Deployments - Additional Configuration
Note
When using the ServiceFoundry Python SDK, "type" is not a required field in any of the imported classes.
For deployments, the modules below add the necessary functionality.
Resources
Description
Describes the resource constraints for the application so that it can be deployed accordingly on the cluster
To learn more you can go here
Schema
{
  "cpu_request": 0.2,
  "cpu_limit": 0.5,
  "memory_request": 200,
  "memory_limit": 500,
  "ephemeral_storage_request": 1000,
  "ephemeral_storage_limit": 2000,
  "instance_family": [
    "string"
  ]
}
Properties
Name | Type | Required | Description |
---|---|---|---|
cpu_request | number | true | Requested CPU, which determines the minimum cost incurred. CPU usage can exceed the request but not the limit. 1 CPU means 1 CPU core; fractional values such as 0.5 or 0.05 are allowed. |
cpu_limit | number | true | Maximum CPU the application can use. 1 CPU means 1 CPU core; fractional values such as 0.5 are allowed. Must be >= cpu_request. |
memory_request | number | true | Requested memory, which determines the minimum cost incurred. The unit is megabytes (MB): 1 means 1 MB and 2000 means 2 GB. |
memory_limit | number | true | Memory limit beyond which the application is killed with an OOM error. The unit is megabytes (MB): 1 means 1 MB and 2000 means 2 GB. Should be greater than memory_request. |
ephemeral_storage_request | number | true | Requested disk storage, in megabytes (MB). This storage is ephemeral and is wiped on pod restart or eviction. |
ephemeral_storage_limit | number | true | Disk storage limit, in megabytes (MB). Exceeding it results in eviction. Should be greater than ephemeral_storage_request. This storage is ephemeral and is wiped on pod restart or eviction. |
instance_family | [string] | false | Instance family of the underlying machine. Multiple instance families can be supplied; the workload is guaranteed to be scheduled on one of them. |
Python Examples
from servicefoundry import Service, Resources

service = Service(  # or Job or ModelDeployment
    # ... (other arguments elided)
    resources=Resources(
        cpu_request=1,
        memory_request=1000,  # in megabytes
        ephemeral_storage_request=1000,  # in megabytes
        cpu_limit=4,
        memory_limit=4000,  # in megabytes
        ephemeral_storage_limit=10000,  # in megabytes
        instance_family=["c6i", "t3", "m4"],
    ),
)
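
The rules in the properties table (every request and limit is required, and a limit must not be lower than its request) can be checked before a deployment is submitted. The following is a minimal sketch in plain Python that works on a dict shaped like the schema above. `validate_resources` is a hypothetical helper and not part of the ServiceFoundry SDK. It accepts a limit equal to its request, which is an assumption for memory and ephemeral storage, where the table says "greater than".

```python
def validate_resources(resources: dict) -> list:
    """Return a list of problems in a Resources-shaped dict (empty if none)."""
    problems = []
    for kind in ("cpu", "memory", "ephemeral_storage"):
        request = resources.get(f"{kind}_request")
        limit = resources.get(f"{kind}_limit")
        if request is None or limit is None:
            problems.append(f"{kind}: both request and limit are required")
        elif limit < request:
            # Assumption: limit == request is treated as acceptable.
            problems.append(
                f"{kind}_limit ({limit}) is below {kind}_request ({request})"
            )
    return problems


# Same values as the Python example above.
resources = {
    "cpu_request": 1,
    "cpu_limit": 4,
    "memory_request": 1000,  # in megabytes
    "memory_limit": 4000,
    "ephemeral_storage_request": 1000,  # in megabytes
    "ephemeral_storage_limit": 10000,
}
print(validate_resources(resources))
```

An empty list means the values are consistent with the table; each string otherwise names the field pair that needs attention.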
FileMount
Description
Describes the configuration for FileMount
Schema
{
  "mount_dir": "string",
  "data": {
    "property1": "string",
    "property2": "string"
  }
}
Properties
Name | Type | Required | Description |
---|---|---|---|
mount_dir | string | true | Directory at which the data is mounted |
data | object | true | Data to be mounted. Each key is a filename and each value is the file content. Files are mounted under mount_dir. |
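
To illustrate how the keys of data map to files under mount_dir, here is a small sketch in plain Python. It only computes the resulting paths from a dict shaped like the schema and does not call the SDK. The helper name `mounted_paths`, the directory, and the file names are all hypothetical.

```python
import posixpath


def mounted_paths(file_mount: dict) -> list:
    """List the file paths that a FileMount-shaped dict describes."""
    mount_dir = file_mount["mount_dir"]
    # Each key of "data" is a filename placed under mount_dir.
    return sorted(posixpath.join(mount_dir, name) for name in file_mount["data"])


file_mount = {
    "mount_dir": "/etc/app",  # hypothetical directory
    "data": {
        "config.yaml": "log_level: info\n",  # filename -> file content
        "motd.txt": "hello\n",
    },
}
print(mounted_paths(file_mount))
# ['/etc/app/config.yaml', '/etc/app/motd.txt']
```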
Autoscaling
Description
Describes the configuration for Autoscaling
Schema
{
  "min_replicas": 1,
  "max_replicas": 1,
  "metrics": {},
  "polling_interval": 30,
  "cooldown_period": 300
}
Properties
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
min_replicas | integer | true | none | Minimum number of replicas to keep available |
max_replicas | integer | true | none | Maximum number of replicas allowed for the component. |
metrics | CPUUtilizationMetric or RPSMetric or CronMetric | true | none | Metric that drives autoscaling; one of the metric types described in this section. |
polling_interval | integer | true | none | Interval at which each trigger is checked. |
cooldown_period | integer | true | none | Period to wait after the last trigger reported active before scaling the resource back to 0. |
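
A configuration shaped like the schema above can be sanity-checked before use. The sketch below is plain Python on a dict; `validate_autoscaling` is a hypothetical helper, not an SDK function. The check that min_replicas does not exceed max_replicas is an assumption, since the table does not state it explicitly.

```python
REQUIRED_FIELDS = (
    "min_replicas",
    "max_replicas",
    "metrics",
    "polling_interval",
    "cooldown_period",
)


def validate_autoscaling(config: dict) -> list:
    """Return a list of problems in an Autoscaling-shaped dict (empty if none)."""
    problems = [
        f"missing required field: {key}"
        for key in REQUIRED_FIELDS
        if key not in config
    ]
    low, high = config.get("min_replicas"), config.get("max_replicas")
    # Assumption: the replica range must be non-empty.
    if low is not None and high is not None and low > high:
        problems.append("min_replicas must not exceed max_replicas")
    return problems


autoscaling = {
    "min_replicas": 1,
    "max_replicas": 5,
    "metrics": {"type": "rps", "value": 10},
    "polling_interval": 30,
    "cooldown_period": 300,
}
print(validate_autoscaling(autoscaling))
```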
CPUUtilizationMetric
Schema
{
  "type": "cpu_utilization",
  "value": 0
}
Properties
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
type | string | true | none | Metric type; "cpu_utilization" for this metric. |
value | integer | true | none | Target CPU utilization, as a percentage of the CPU request averaged over all replicas, that the autoscaler tries to maintain. |
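
To give a feel for how a target value translates into a replica count, here is a sketch that assumes the autoscaler applies the proportional rule used by the Kubernetes Horizontal Pod Autoscaler. The source does not state which rule is actually used, so treat this as an illustration only. The function name and its parameters are hypothetical.

```python
import math


def replicas_for_cpu_target(
    current_replicas: int,
    current_utilization: float,  # observed average, % of cpu_request
    target_utilization: float,   # the metric's "value"
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Estimate the replica count that brings average CPU back to the target."""
    # Assumed rule: scale replicas in proportion to observed / target utilization.
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Keep the result inside the configured replica range.
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% against a 60% target suggests 6 replicas.
print(replicas_for_cpu_target(4, 90, 60, min_replicas=1, max_replicas=10))
```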
RPSMetric
Schema
{
  "type": "rps",
  "value": 1
}
Properties
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
type | string | true | none | Metric type; "rps" for this metric. |
value | integer | true | none | Target requests per second, averaged over all replicas, that the autoscaler tries to maintain. |
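
Since value is a per-replica average, one way to reason about it is to divide the total incoming request rate by the target and round up. The sketch below makes that arithmetic explicit. It is an assumption about how such a target is typically met, not a description of the platform's internal algorithm, and the function name is hypothetical.

```python
import math


def replicas_for_rps_target(
    total_rps: float,     # total requests per second across the service
    target_rps: float,    # the metric's "value" (per-replica average)
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Estimate how many replicas keep the per-replica average at the target."""
    raw = math.ceil(total_rps / target_rps)
    # Keep the result inside the configured replica range.
    return max(min_replicas, min(max_replicas, raw))


# 95 requests per second with a target of 10 per replica suggests 10 replicas.
print(replicas_for_rps_target(95, 10, min_replicas=1, max_replicas=20))
```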
CronMetric
Schema
{
  "type": "cron",
  "desired_replicas": 1,
  "start": "string",
  "end": "string",
  "timezone": "UTC"
}
Properties
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
type | string | true | none | Metric type; "cron" for this metric. |
desired_replicas | integer | false | none | Desired number of replicas during the given interval. Default value is max_replicas. |
start | string | true | none | Cron expression indicating the start of the cron schedule. |
end | string | true | none | Cron expression indicating the end of the cron schedule. |
timezone | string | true | none | Timezone against which the cron schedule is calculated, e.g. "Asia/Tokyo". Defaults to the machine's local time. Supported timezones: https://docs.truefoundry.com/docs/list-of-supported-timezones |
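
As an illustration of the fields above, the following sketch builds a CronMetric-shaped dict that scales up on weekday working hours. `make_cron_metric` is a hypothetical helper, not an SDK function. It assumes the standard 5-field cron format (minute, hour, day of month, month, day of week), which the source does not confirm.

```python
def make_cron_metric(start, end, timezone, desired_replicas=None):
    """Build a CronMetric-shaped dict after a basic cron format check."""
    for label, expr in (("start", start), ("end", end)):
        # Assumption: expressions use the standard 5-field cron format.
        if len(expr.split()) != 5:
            raise ValueError(f"{label} must have 5 cron fields, got: {expr!r}")
    metric = {"type": "cron", "start": start, "end": end, "timezone": timezone}
    if desired_replicas is not None:
        # Optional; per the table, it defaults to max_replicas when omitted.
        metric["desired_replicas"] = desired_replicas
    return metric


# Scale to 3 replicas from 09:00 to 18:00, Monday to Friday, Tokyo time.
metric = make_cron_metric(
    start="0 9 * * 1-5",
    end="0 18 * * 1-5",
    timezone="Asia/Tokyo",
    desired_replicas=3,
)
print(metric)
```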