Additional Configurations
Customize your job with advanced options
Adding environment variables and secrets
The Environment Variables and Secrets concept pages cover how to create and use them. Here we just show the usage for quick reference.
from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    env={
        "DB_USER": "postgres",
        # Secrets are referenced by their FQN, using the same tfy-secret:// prefix as the YAML spec
        "DB_PASSWORD": "tfy-secret://<YOUR_SECRET_FQN>",
    },
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
env:
  DB_USER: "postgres"
  DB_PASSWORD: "tfy-secret://<YOUR_SECRET_FQN>"
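Inside the job, these values are typically read from the process environment. The following is a minimal sketch of how train.py might do that, assuming the platform resolves the secret reference and exposes its value as DB_PASSWORD at runtime. The helper name read_db_config is hypothetical and not part of servicefoundry.

```python
import os


def read_db_config(environ=None):
    """Return the database settings defined in the job's `env` block.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to exercise locally.
    """
    environ = os.environ if environ is None else environ
    required = ("DB_USER", "DB_PASSWORD")
    missing = [name for name in required if name not in environ]
    if missing:
        # Fail early with a clear message instead of a KeyError mid-training.
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"user": environ["DB_USER"], "password": environ["DB_PASSWORD"]}
```

Calling read_db_config() at the top of the training script surfaces a misconfigured job before any work starts.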
Configure concurrency_limit
Requires servicefoundry>=0.9.8
You can limit how many runs of a Job execute concurrently (provided sufficient resources are available in the cluster). By default, concurrency_limit is set to None, which allows an unlimited number of concurrent runs.
For example, to allow no more than 3 runs at the same time:
from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    concurrency_limit=3,
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
concurrency_limit: 3
Configure retries
You can use retries to specify the maximum number of attempts to run a Job before it is marked as failed. By default, this is set to 1.
from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    retries=3,
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
retries: 3
Configure timeout
You can specify, in seconds, the maximum amount of time a job is allowed to run, whether or not it has failed. The timeout takes precedence over the retries limit. By default, it is set to 1000 seconds.
For example, if you set retries to 6 and timeout to 480 seconds, the job terminates after 8 minutes regardless of how many times it has attempted to run.
from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    retries=6,
    timeout=480,
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
retries: 6
timeout: 480
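To make the interaction between the two settings concrete, here is a rough back-of-the-envelope sketch in plain Python. It is a simplified model, not the platform's scheduler: it assumes every attempt fails after a fixed duration, attempts run back to back with no delay, and retries is the maximum number of attempts as described above. The function name attempts_before_timeout is hypothetical.

```python
def attempts_before_timeout(retries, timeout_seconds, seconds_per_attempt):
    """Estimate how many attempts start before the job-level timeout.

    Simplified model (assumptions, not documented platform behavior):
    each attempt fails after `seconds_per_attempt`, attempts run back to back,
    and `retries` caps the total number of attempts.
    """
    if seconds_per_attempt <= 0:
        raise ValueError("seconds_per_attempt must be positive")
    # Ceiling division: an attempt that starts before the deadline still
    # counts as started, even if the timeout cuts it short.
    started = -(-timeout_seconds // seconds_per_attempt)
    return min(retries, started)


# With retries=6 and timeout=480 from the example above:
print(attempts_before_timeout(6, 480, 60))   # 6: all attempts fit within 8 minutes
print(attempts_before_timeout(6, 480, 120))  # 4: the timeout ends the job first
```

Under this model, a long-running script can exhaust the timeout well before it exhausts its attempts, so it may be worth sizing timeout against the expected duration of a single run.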
Set resource limits
You can configure the CPU and Memory resources to be allocated to each job. To understand the resources configuration in more detail, please read the Resources concepts page.
For example, here we request 0.2 CPU and 128 MiB of memory, with a hard limit of 0.5 CPU and 512 MiB of memory.
from servicefoundry import Job, Build, PythonBuild, Resources

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    resources=Resources(
        cpu_request=0.2,
        cpu_limit=0.5,
        memory_request=128,  # MiB
        memory_limit=512,  # MiB
    ),
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
resources:
  cpu_request: 0.2
  cpu_limit: 0.5
  memory_request: 128
  memory_limit: 512
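Before deploying, it can help to sanity-check the four values locally. The sketch below uses a plain dataclass as a stand-in for the fields shown above; ResourceSpec and problems are hypothetical names, not part of servicefoundry, and the rule that a request should not exceed its limit is assumed here from the usual Kubernetes convention rather than taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical stand-in for the four resource fields shown above."""

    cpu_request: float
    cpu_limit: float
    memory_request: int  # MiB
    memory_limit: int  # MiB

    def problems(self):
        """Return a list of human-readable issues; empty when the spec looks sane."""
        issues = []
        values = (self.cpu_request, self.cpu_limit, self.memory_request, self.memory_limit)
        if min(values) <= 0:
            issues.append("all values must be positive")
        if self.cpu_request > self.cpu_limit:
            issues.append("cpu_request exceeds cpu_limit")
        if self.memory_request > self.memory_limit:
            issues.append("memory_request exceeds memory_limit")
        return issues


spec = ResourceSpec(cpu_request=0.2, cpu_limit=0.5, memory_request=128, memory_limit=512)
print(spec.problems())  # [] for the values used in the example above
```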