Deployment Additional Configuration
Note
While using the Truefoundry Python SDK, the type field is not required in any of the imported classes.
For deployments, the modules below can be used to add the necessary functionality.
Resources
Resources
Description
Describes the resource constraints for the application so that it can be deployed accordingly on the cluster
To learn more you can go here
Schema
{
"cpu_request": 0.2,
"cpu_limit": 0.5,
"memory_request": 200,
"memory_limit": 500,
"ephemeral_storage_request": 1000,
"ephemeral_storage_limit": 2000,
"gpu_count": 0,
"shared_memory_size": 64,
"node": {}
}
Properties
Name | Type | Required | Description |
---|---|---|---|
cpu_request | number | true | Requested CPU, which determines the minimum cost incurred. CPU usage can exceed the requested amount, but not the value specified in the limit. 1 CPU means 1 CPU core. Fractional values such as 0.5 or 0.05 can be requested. |
cpu_limit | number | true | Maximum CPU the application can use. 1 CPU means 1 CPU core. Fractional values such as 0.5 can be specified. The CPU limit must be >= the CPU request. |
memory_request | integer | true | Requested memory, which determines the minimum cost incurred. The unit is megabytes (MB), so 1 means 1 MB and 2000 means 2 GB. |
memory_limit | integer | true | Memory limit beyond which the application is killed with an OOM error. The unit is megabytes (MB), so 1 means 1 MB and 2000 means 2 GB. The memory limit should be greater than the memory request. |
ephemeral_storage_request | integer | true | Requested disk storage. The unit is megabytes (MB). This is ephemeral storage and is wiped on pod restart or eviction. |
ephemeral_storage_limit | integer | true | Disk storage limit. The unit is megabytes (MB). Exceeding this limit results in eviction. It should be greater than the request. This is ephemeral storage and is wiped on pod restart or eviction. |
gpu_count | integer | true | Number of GPUs to provide to the application. Note that the exact count and maximum count available for a given GPU type depend on the cloud provider and cluster type. |
shared_memory_size | integer | false | Shared memory required by the workload. Machine learning libraries such as PyTorch can use shared memory for inter-process communication. If set, a tmpfs-backed volume is mounted at the /dev/shm directory. Any usage counts against the workload's memory limit (resources.memory_limit) along with the workload's own memory usage. If the overall usage goes above resources.memory_limit, the user process may get killed. The shared memory size cannot exceed the memory limit defined for the workload. |
node | object | false | This field determines how the underlying node resource is to be utilized |
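The rules in the table above (limits not below requests, shared memory within the memory limit) can be checked before a deployment is submitted. The following is a minimal sketch in plain Python. ResourceSpec and check_resources are hypothetical names introduced for illustration and are not part of the Truefoundry SDK. It also assumes that a limit equal to its request is acceptable, which the table states explicitly only for CPU.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResourceSpec:
    """Hypothetical stand-in for the Resources schema (not the SDK class)."""
    cpu_request: float
    cpu_limit: float
    memory_request: int             # MB
    memory_limit: int               # MB
    ephemeral_storage_request: int  # MB
    ephemeral_storage_limit: int    # MB
    gpu_count: int = 0
    shared_memory_size: Optional[int] = None  # optional field


def check_resources(spec: ResourceSpec) -> List[str]:
    """Return readable rule violations; an empty list means none were found."""
    problems = []
    pairs = [
        ("cpu", spec.cpu_request, spec.cpu_limit),
        ("memory", spec.memory_request, spec.memory_limit),
        ("ephemeral_storage", spec.ephemeral_storage_request,
         spec.ephemeral_storage_limit),
    ]
    for name, request, limit in pairs:
        # Assumption: equality is allowed, so only limit < request is flagged.
        if limit < request:
            problems.append(f"{name}_limit ({limit}) is below {name}_request ({request})")
    if spec.gpu_count < 0:
        problems.append("gpu_count cannot be negative")
    # Shared memory counts against, and cannot exceed, the memory limit.
    if spec.shared_memory_size is not None and spec.shared_memory_size > spec.memory_limit:
        problems.append("shared_memory_size exceeds memory_limit")
    return problems


# Values taken from the schema example above.
example = ResourceSpec(0.2, 0.5, 200, 500, 1000, 2000, gpu_count=0, shared_memory_size=64)
print(check_resources(example))  # prints []
```

Returning a list of messages rather than raising on the first failure lets a caller report every inconsistent field at once.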
NodeSelector
Description
Constraints to select a Node - Specific GPU / Instance Families, On-Demand/Spot.
Schema
{
"type": "string",
"gpu_type": "string",
"instance_families": [
"string"
],
"capacity_type": "spot_fallback_on_demand"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | Fixed value: node_selector |
gpu_type | string | false | Name of the Nvidia GPU. One of [K80, P4, P100, V100, T4, A10G, A100_40GB, A100_80GB]. One instance of the card contains the following amount of memory: K80: 12 GB, P4: 8 GB, P100: 16 GB, V100: 16 GB, T4: 16 GB, A10G: 24 GB, A100_40GB: 40 GB, A100_80GB: 80 GB. |
instance_families | [string] | false | Instance family of the underlying machine to use. Multiple instance families can be supplied. The workload is guaranteed to be scheduled on one of them. |
capacity_type | string | false | Type of nodes on which to run the app. By default, no placement logic is applied. "spot_fallback_on_demand" tries to place the application on spot nodes but falls back to on-demand nodes when spot nodes are not available. "spot" strictly places the application on spot nodes. "on_demand" strictly places the application on on-demand nodes. |
Enumerated Values
Property | Value |
---|---|
capacity_type | spot_fallback_on_demand |
capacity_type | spot |
capacity_type | on_demand |
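As an illustration of the NodeSelector schema, the sketch below builds a selector as a plain dictionary and rejects values outside the lists given above. The helper names build_node_selector and total_gpu_memory_gb are hypothetical and not part of the Truefoundry SDK. The GPU memory figures are copied from the gpu_type description.

```python
from typing import List, Optional

# Per-card memory in GB, as listed in the gpu_type description.
GPU_MEMORY_GB = {
    "K80": 12, "P4": 8, "P100": 16, "V100": 16,
    "T4": 16, "A10G": 24, "A100_40GB": 40, "A100_80GB": 80,
}
CAPACITY_TYPES = ("spot_fallback_on_demand", "spot", "on_demand")


def build_node_selector(
    gpu_type: Optional[str] = None,
    instance_families: Optional[List[str]] = None,
    capacity_type: Optional[str] = None,
) -> dict:
    """Hypothetical helper: return a dict shaped like the NodeSelector schema."""
    if gpu_type is not None and gpu_type not in GPU_MEMORY_GB:
        raise ValueError(f"unknown gpu_type: {gpu_type!r}")
    if capacity_type is not None and capacity_type not in CAPACITY_TYPES:
        raise ValueError(f"unknown capacity_type: {capacity_type!r}")
    selector = {"type": "node_selector"}
    # Optional fields are included only when the caller supplies them.
    if gpu_type is not None:
        selector["gpu_type"] = gpu_type
    if instance_families:
        selector["instance_families"] = list(instance_families)
    if capacity_type is not None:
        selector["capacity_type"] = capacity_type
    return selector


def total_gpu_memory_gb(gpu_type: str, gpu_count: int) -> int:
    """Total GPU memory across gpu_count cards of the given type."""
    return GPU_MEMORY_GB[gpu_type] * gpu_count


print(build_node_selector(gpu_type="T4", capacity_type="spot"))
print(total_gpu_memory_gb("A10G", 2))  # prints 48
```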
NodepoolSelector
Description
Specify one or more nodepools to run your application on.
Schema
{
"type": "string",
"nodepools": [
"string"
]
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | Fixed value: nodepool_selector |
nodepools | [string] | false | Nodepools where you want to run your workload. Multiple nodepools can be selected. The workload is guaranteed to be scheduled on one of the nodepools. |
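A NodepoolSelector can be sketched the same way. The helper below is hypothetical (not an SDK function), and the nodepool names are placeholders. It trims whitespace, drops duplicates while preserving order, and omits the optional nodepools key when nothing was supplied.

```python
import json
from typing import Iterable


def build_nodepool_selector(nodepools: Iterable[str]) -> dict:
    """Hypothetical helper: return a dict shaped like the NodepoolSelector schema."""
    names = []
    for name in nodepools:
        name = name.strip()
        # Keep the first occurrence of each non-empty name.
        if name and name not in names:
            names.append(name)
    selector = {"type": "nodepool_selector"}
    if names:
        selector["nodepools"] = names
    return selector


# "pool-a" and "pool-b" are placeholder nodepool names.
print(json.dumps(build_nodepool_selector(["pool-a", "pool-b", "pool-a"]), indent=2))
```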
Mounts
VolumeMount
Description
Describes a volume mount
Schema
{
"type": "string",
"mount_path": "string",
"volume_fqn": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | Fixed value: volume |
mount_path | string | true | Absolute file path where the volume will be mounted. |
volume_fqn | string | true | The Truefoundry volume that needs to be mounted. |
SecretMount
Description
Describes a secret mount
Schema
{
"type": "string",
"mount_path": "string",
"secret_fqn": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | Fixed value: secret |
mount_path | string | true | Absolute file path where the file will be created. |
secret_fqn | string | true | The Truefoundry secret whose value will be the file content. |
StringDataMount
Description
Describes a string data mount
Schema
{
"type": "string",
"mount_path": "string",
"data": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | Fixed value: string |
mount_path | string | true | Absolute file path where the file will be created. |
data | string | true | The file content. |
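The three mount types share a mount_path that must be absolute and differ only in their third field. The sketch below builds each as a plain dictionary and checks the path. The function names are hypothetical and not part of the Truefoundry SDK, and the FQN strings are placeholders, since the FQN format is not described here.

```python
from pathlib import PurePosixPath


def _mount(kind: str, mount_path: str, **fields: str) -> dict:
    """Build one mount entry after checking that the path is absolute."""
    if not PurePosixPath(mount_path).is_absolute():
        raise ValueError(f"mount_path must be absolute, got {mount_path!r}")
    return {"type": kind, "mount_path": mount_path, **fields}


def volume_mount(mount_path: str, volume_fqn: str) -> dict:
    """Mount a Truefoundry volume at mount_path."""
    return _mount("volume", mount_path, volume_fqn=volume_fqn)


def secret_mount(mount_path: str, secret_fqn: str) -> dict:
    """Create a file at mount_path whose content is the secret's value."""
    return _mount("secret", mount_path, secret_fqn=secret_fqn)


def string_data_mount(mount_path: str, data: str) -> dict:
    """Create a file at mount_path with the given literal content."""
    return _mount("string", mount_path, data=data)


mounts = [
    volume_mount("/data", "<volume-fqn>"),               # placeholder FQN
    secret_mount("/etc/app/api_key", "<secret-fqn>"),    # placeholder FQN
    string_data_mount("/etc/app/config.yaml", "debug: false\n"),
]
for entry in mounts:
    print(entry)
```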