Additional Configurations

Customize your job with advanced options

Adding environment variables and secrets

The Environment Variables and Secrets concept pages cover how to create and use them. Here we just show the usage for quick reference.

from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    env={
        "DB_USER": "postgres",
        "DB_PASSWORD": "<YOUR_SECRET_FQN>"
    }
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
env:
  DB_USER: "postgres"
  DB_PASSWORD: "tfy-secret://<YOUR_SECRET_FQN>"
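
For quick reference, here is a minimal sketch of how train.py might read these values at runtime. It assumes both entries reach the process as ordinary environment variables, with the secret reference already resolved to its value. The helper name load_db_config is hypothetical and not part of servicefoundry.

```python
import os


def load_db_config(environ=os.environ):
    """Read the database credentials configured under `env` in the job spec."""
    # Fail early with a clear message if a variable was not configured.
    missing = [key for key in ("DB_USER", "DB_PASSWORD") if key not in environ]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"user": environ["DB_USER"], "password": environ["DB_PASSWORD"]}
```

Passing the mapping as an argument keeps the function easy to test without touching the real process environment.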

Configure concurrency_limit

Requires servicefoundry>=0.9.8

You can limit how many runs of a Job execute at the same time (provided sufficient resources are available in the cluster). By default, concurrency_limit is set to None, which allows an unlimited number of concurrent runs.

For example, to allow no more than 3 concurrent runs:

from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    concurrency_limit=3
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
concurrency_limit: 3

Configure retries

You can specify the maximum number of attempts to run a Job before it is marked as failed. By default, this is set to 1.

from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    retries=3
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
retries: 3
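
Retries only help if a failed run is actually reported as failed. The sketch below shows one way a training script might do that. It assumes a run counts as failed when the process exits with a non-zero status, which is the usual container convention but is not stated on this page. The functions train and main are hypothetical placeholders.

```python
import sys


def train():
    """Placeholder for the real training logic (hypothetical)."""
    raise ValueError("training data not found")


def main(train_fn=train):
    """Run training and translate the outcome into a process exit code."""
    try:
        train_fn()
    except Exception as exc:
        # Log the error and return non-zero so the run is reported as failed.
        print(f"Training failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

In train.py, the last line would then be sys.exit(main()), so the exit code reaches whatever is supervising the run.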

Configure timeout

You can specify the maximum time, in seconds, that a job may run, counted across all attempts whether they fail or not. The timeout takes precedence over retries. By default, this is set to 1000 seconds.

For example, if you set retries to 6 and timeout to 480 seconds, the job will terminate after 8 minutes regardless of how many attempts have been made.

from servicefoundry import Job, Build, PythonBuild

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    retries=6,
    timeout=480,
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
retries: 6
timeout: 480
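
To make the interaction between retries and timeout concrete, here is a small illustrative model. It is not the platform's scheduler: it assumes every attempt fails after a fixed number of seconds, that attempts run back to back with no delay between them, and that retries is the maximum number of attempts as described above. The function name simulate_failing_job is hypothetical.

```python
def simulate_failing_job(attempt_seconds, retries, timeout):
    """Return (attempts_started, elapsed_seconds, reason) for a job that always fails.

    attempt_seconds: how long each failing attempt runs if not interrupted.
    retries: maximum number of attempts.
    timeout: overall time budget in seconds, shared by all attempts.
    """
    elapsed = 0
    for attempt in range(1, retries + 1):
        if elapsed + attempt_seconds >= timeout:
            # The time budget runs out during this attempt.
            return attempt, timeout, "timeout"
        elapsed += attempt_seconds
    return retries, elapsed, "retries exhausted"


# Attempts of 100 s each: the fifth attempt is cut off at the 480 s mark.
print(simulate_failing_job(100, retries=6, timeout=480))  # (5, 480, 'timeout')
# Attempts of 60 s each: all six attempts finish within the budget.
print(simulate_failing_job(60, retries=6, timeout=480))   # (6, 360, 'retries exhausted')
```

Under these assumptions, the job stops at whichever bound is reached first: the attempt count or the time budget.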

Set resource limits

You can configure the CPU and Memory resources to be allocated to each job. To understand the resources configuration in more detail, please read the Resources concepts page.

For example, here we request 0.2 CPU and 128 MiB of memory, with a hard limit of 0.5 CPU and 512 MiB of memory.

from servicefoundry import Job, Build, PythonBuild, Resources

job = Job(
    name="iris-train-job",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",
            requirements_path="requirements.txt",
        )
    ),
    resources=Resources(
        cpu_request=0.2,
        cpu_limit=0.5,
        memory_request=128,
        memory_limit=512,
    ),
)

name: iris-train-job
type: job
image:
  type: build
  build_source:
    type: local
  build_spec:
    type: tfy-python-buildpack
    command: python train.py
    requirements_path: requirements.txt
resources:
  cpu_request: 0.2
  cpu_limit: 0.5
  memory_request: 128
  memory_limit: 512
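
Before deploying, it can help to sanity-check the numbers locally. The sketch below assumes the usual rule that a request may not exceed its corresponding limit and that all values must be positive. The helper check_resources is hypothetical and not part of servicefoundry.

```python
def check_resources(cpu_request, cpu_limit, memory_request, memory_limit):
    """Return a list of problems found in a requests/limits combination."""
    problems = []
    if min(cpu_request, cpu_limit, memory_request, memory_limit) <= 0:
        problems.append("all values must be positive")
    if cpu_request > cpu_limit:
        problems.append("cpu_request exceeds cpu_limit")
    if memory_request > memory_limit:
        problems.append("memory_request exceeds memory_limit")
    return problems


# The values from the example above pass the check.
print(check_resources(0.2, 0.5, 128, 512))  # []
```

An empty list means the combination is consistent. Any returned message names the field to adjust.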