When a scaled-down service receives its first request, there is a delay (cold start time) because the node might first need to download the image and then start the service. This feature is most commonly used in, and best suited for, dev environments where occasional cold starts are acceptable.
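Because the first request after a scale-down can take noticeably longer than usual, clients should allow a generous timeout when calling such a service. Below is a minimal illustrative sketch; the endpoint URL and the 120-second timeout are assumptions, not values prescribed by the platform.

```python
import requests

# Hypothetical endpoint of a service with Scale to 0 enabled.
SERVICE_URL = "https://my-service.example.com/health"

# Allow a generous timeout so the request survives a cold start,
# during which the image may be pulled and the service started.
try:
    response = requests.get(SERVICE_URL, timeout=120)
    response.raise_for_status()
    print("Service responded:", response.status_code)
except requests.exceptions.Timeout:
    print("Cold start took longer than 120s; consider retrying.")
```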
Configuring Scale to 0
In the Service deployment form, enable the Advanced Fields toggle and you will then see the Auto Shutdown section, as shown below.
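The same setting can typically also be captured in the service's deployment spec. The sketch below is illustrative only: the auto_shutdown key and wait_time parameter are assumptions about the schema, so refer to the spec reference for the exact field names.

```python
# Illustrative sketch of a service spec with auto shutdown enabled.
# The "auto_shutdown" key and "wait_time" (idle seconds before scaling
# down to 0) are assumptions; check the spec reference for exact fields.
service_spec = {
    "name": "my-dev-service",
    "image": {"image_uri": "my-registry/my-app:latest"},
    "ports": [{"port": 8080, "expose": True}],
    "auto_shutdown": {
        "wait_time": 900,  # assumed: idle time in seconds before scale to 0
    },
}
```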

How does Scale to 0 work?
Scale to 0 is powered by Elasti, an open-source, Kubernetes-native solution built by Truefoundry that scales a service down to zero when there is no traffic and automatically scales it back up when traffic arrives. A brief summary of Elasti:

- Most Kubernetes autoscaling solutions such as HPA or KEDA can scale from 1 to n replicas based on CPU utilization or memory usage, but they do not offer a way to scale to 0 when there is no traffic. Elasti solves this by dynamically managing service replicas based on real-time traffic conditions: it only handles scaling the application down to 0 replicas and scaling it back up to 1 replica when traffic is detected again. Scaling beyond 1 replica is handled by the autoscaler, such as HPA or KEDA.
- Elasti uses a proxy mechanism that queues and holds requests for scaled-down services, bringing them up only when needed. The proxy is used only while the service is scaled down to 0; once the service is scaled back up to 1, the proxy is bypassed and requests are processed directly by the service's pods.
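To make the proxy behavior concrete, here is a simplified, purely illustrative sketch of the request flow. This is not Elasti's actual code; the class, callbacks, and polling behavior are assumptions used only to mirror the control flow described above.

```python
import queue
import threading

# Purely illustrative model of a scale-to-zero proxy (not Elasti's actual
# implementation). While the service has 0 replicas, incoming requests are
# queued; the proxy triggers a scale-up, waits until the service is ready,
# then flushes the queue and steps out of the data path.

class ScaleToZeroProxy:
    def __init__(self, scale_up_and_wait, forward):
        # Assumed callbacks: scale_up_and_wait() blocks until 1 replica is
        # ready; forward(request) sends a request directly to the service.
        self.scale_up_and_wait = scale_up_and_wait
        self.forward = forward
        self.ready = threading.Event()   # set once the service is serving
        self.pending = queue.Queue()     # requests held during cold start
        self._lock = threading.Lock()
        self._scaling = False

    def handle(self, request):
        if self.ready.is_set():
            # Service already running: requests go straight to the pods.
            return self.forward(request)
        # Service is at 0 replicas: hold the request and start one scale-up.
        self.pending.put(request)
        with self._lock:
            if not self._scaling:
                self._scaling = True
                threading.Thread(target=self._cold_start, daemon=True).start()

    def _cold_start(self):
        self.scale_up_and_wait()              # cold start: image pull + startup
        self.ready.set()                      # from now on, bypass the queue
        while not self.pending.empty():
            self.forward(self.pending.get())  # replay the held requests
```

In the real system this interception happens in the network path rather than in application code, but the control flow is the same: hold requests while the service is at 0, trigger a scale-up, replay what was queued, and then let traffic flow directly to the pods.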
