Set Resources
When we deploy a service, job, or any other application, we need to define the resources that the application requires. This ensures that Kubernetes can allocate the correct set of resources and the application continues to run smoothly.
There are a few key resources you will need to configure:
| Resource | Unit | Description |
| --- | --- | --- |
| CPU | CPU cores | This defines the processing power required by the application in terms of CPU cores. 1 CPU unit is equivalent to 1 virtual core. It’s possible to ask for fractional CPU resources like 0.1 or even 0.01 (the minimum is 0.01). While defining CPU, you will need to specify requests and limits. The amount you define in the CPU request will always be reserved for the application. Your application can occasionally use more CPU than what is requested, up to the CPU limit, but not beyond that. Any value above the request is opportunistic and not guaranteed to be available. For example, if the CPU request is 0.5 and the limit is 1, it means the application has 0.5 CPU reserved for itself. CPU usage can go up to 1 if there is spare CPU available on the machine - otherwise it will be throttled. |
| Memory | Megabytes (MB) | Defined as an integer with the unit Megabytes. So a value of 1 means 1 MB of memory and 1000 means 1 GB of memory. You need to specify memory requests and limits. The amount you define in the memory request will always be reserved for the application. Your application can occasionally use more memory than what is requested, up to the memory limit. Any value above the request is opportunistic and not guaranteed to be available. If the application uses more memory than the limit, the application will be killed and restarted. For example, if the memory request is 500 MB and the limit is 1000 MB, it means your application will always have 500 MB of RAM. Memory usage can spike up to 1 GB, beyond which the application will be killed and restarted. |
| Ephemeral Storage | Megabytes (MB) | Temporary disk space to keep code, artifacts, etc., which is lost when the pod is terminated. Defined as an integer with the unit Megabytes. A value of 1 means 1 MB of disk space and 1000 means 1 GB of disk space. You need to specify ephemeral storage requests and limits. If you specify 1 GB as the request and 5 GB as the limit, you will have guaranteed access to 1 GB of disk space. Usage can go up to 5 GB if there is disk space left on the machine, but you shouldn’t rely on this. If the application tries to use more than 5 GB, the application will be killed. |
| GPU | GPU type and count | This allows you to specify which GPU you want to provision for your application (GPUs can be of the following types: K80, P4, P100, V100, T4, A10G, A100_40GB, A100_80GB, etc.). You also need to specify the GPU count. Please note that if you ask for a GPU count of 2 and type A100, you will get a machine with at least 2 A100 GPU cards. In some cloud providers, one machine may have 4 A100 GPU cards; in this case, your application will use 2 of the 4 GPU cards and another application can use the remaining 2 cards. |
| Shared Memory | Megabytes (MB) | Shared memory is needed for data sharing between processes. This is useful in certain scenarios - for example, Torch Dataloaders rely on shared memory to efficiently share data between multiple workers during training. Defined as an integer with the unit Megabytes. A value of 1 means 1 MB of shared memory and 1000 means 1 GB of shared memory. If your use case requires shared memory and the usage exceeds the shared memory size, your application’s replica will be killed. |
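These settings correspond to the standard Kubernetes container `resources` block, with shared memory typically provided as a memory-backed `emptyDir` volume mounted at `/dev/shm`. A minimal, illustrative pod spec is sketched below - the names, image, and values are hypothetical, and the deployment form generates the equivalent configuration for you:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                # hypothetical name
spec:
  containers:
    - name: app
      image: example/app:latest    # hypothetical image
      resources:
        requests:
          cpu: "0.5"               # 0.5 CPU cores reserved
          memory: 500M             # 500 MB reserved
          ephemeral-storage: 1000M # 1 GB of temporary disk guaranteed
        limits:
          cpu: "1"                 # can burst to 1 core if spare CPU exists, else throttled
          memory: 1000M            # killed and restarted above this
          ephemeral-storage: 5000M # killed above this
          nvidia.com/gpu: 2        # 2 GPU cards; the GPU type is chosen via node selection
      volumeMounts:
        - name: shm
          mountPath: /dev/shm      # shared memory, e.g. for Torch Dataloaders
  volumes:
    - name: shm
      emptyDir:
        medium: Memory
        sizeLimit: 1000M           # replica is killed if shared memory usage exceeds this
```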
Setting Resources for your application
You can select the resources from the Resources section in the deployment form.
The key things to fill in here are:
- Choose CPU or GPU along with its type.
- Fill in the CPU, Memory, and Ephemeral Storage request and limit fields.
Requests should be set based on the minimum resources your application needs to function normally under load. Requests are guaranteed to be allocated to your application.
Limits should be set to prevent the application from using more resources than is acceptable. Setting the limits higher than requests allows the application to burst if it requires more resources for a short duration. Limits define the maximum amount of resources a container is allowed to consume if there are extra resources available.
If you are unsure about the limit, it’s usually a good idea to set it to twice the request. So if your CPU request is 0.5, you can set the limit to 1. If you are requesting 1 GB RAM, you can set the limit to 2 GB.
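In Kubernetes resource notation, this rule of thumb looks like the following sketch (the values are just the examples from above):

```yaml
resources:
  requests:
    cpu: "0.5"
    memory: 1000M   # 1 GB
  limits:
    cpu: "1"        # 2x the CPU request
    memory: 2000M   # 2 GB, 2x the memory request
```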
- Select the capacity type - we have the following options here:
  - On-demand - This will deploy the application on on-demand instances.
  - Spot - This will deploy the application on spot instances.
  - Prefer Spot (with fallback on On-demand) - This will try to deploy the application on spot instances. If spot instances are not available, it will bring up the application on on-demand instances. If a spot instance is preempted, we will try to replace it with another spot instance; failing that, the replica will be shifted to an on-demand instance.
We usually recommend using the Prefer Spot (with fallback on On-demand) or Any (in Azure) option for all development workloads.
For production services, if you are running more than 5 replicas of a service, we recommend using the Prefer Spot (with fallback on On-demand) option. Otherwise, we recommend using the On-demand option.
For jobs, we recommend using the On-demand option if your jobs are short (less than 3 hours). For long-running jobs, you can use the Prefer Spot (with fallback on On-demand) option only if you implement checkpointing and can resume from the last checkpoint after the earlier run was killed because of a spot instance preemption.
You almost never need to use the Spot option.
Spot instances are a good option when:
- You are running multiple replicas of a service - hence the chances of all the spot machines going down at the same time are extremely low.
- You are running a long-running training job and saving checkpoints, which allows you to resume the training from the last checkpoint in case the machine is taken away.
- Your service can handle going down and restarting on a different node.
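For reference, steering a workload onto spot capacity in plain Kubernetes usually comes down to node selection on a capacity-type label; the exact label and any taints depend on your cloud provider and autoscaler, so treat this as a sketch rather than a definitive configuration:

```yaml
# Sketch only - karpenter.sh/capacity-type applies to Karpenter-managed clusters;
# other provisioners use different labels (e.g. cloud.google.com/gke-spot on GKE).
spec:
  nodeSelector:
    karpenter.sh/capacity-type: spot
  tolerations:
    - key: spot              # hypothetical taint key - many clusters taint spot nodes
      operator: Exists
      effect: NoSchedule
```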
Advanced Options
Selecting a specific instance type
You can select an instance type by clicking on the Advanced Options button in the Resources section and then selecting the Instance Family option.
Selecting a Kubernetes nodepool
If you are using nodepools in Kubernetes and want to deploy your application to a specific nodepool, you can do so by clicking on the Nodepool Selector option and then selecting the nodepool. We automatically parse the available nodepools to show you a filtered set of options - for example, if you select Spot, we will only show the nodepools that have spot instances.