Creating a Map Task

A Map Task is a specialized task that runs a single task in parallel across multiple inputs. It applies the task to each element of an input collection (such as a list) and executes those invocations in parallel. This is particularly useful when you need to perform the same operation on every item of a large dataset or collection and want to distribute the workload across multiple task executions.
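To make the "apply one function to every element, in parallel" idea concrete, here is a minimal plain-Python sketch. It is only an analogy: it uses the standard library's thread pool rather than TrueFoundry, and the helper name run_mapped is hypothetical. A real map task distributes the work across separate task executions, not local threads.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """The single operation applied to every element."""
    return x * x


def run_mapped(fn, items, max_workers=4):
    """Hypothetical helper: apply fn to each item concurrently.

    Results come back in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order in its results.
        return list(pool.map(fn, items))


result = run_mapped(square, [1, 2, 3, 4])
print(result)  # [1, 4, 9, 16]
```

The point to take away is the shape of the operation: one function, one input collection, one output collection of the same length and order.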

Here’s a simple example of how you might define and use a map task. Create a file named workflow.py and place it in your project directory.

from functools import partial

from truefoundry.workflow import (
    PythonTaskConfig,
    TaskPythonBuild,
    map_task,
    task,
    workflow,
)

# Build configuration for the task's Python environment.
task_config = PythonTaskConfig(
    image=TaskPythonBuild(
        python_version="3.11",
        pip_packages=["truefoundry[workflow]==0.4.8"],
    )
)


@task(task_config=task_config)
def square(x: int) -> int:
    """Return the square of a single integer."""
    return x * x


@workflow
def my_map_workflow(numbers: list[int]) -> list[int]:
    # partial() binds no arguments here, so square_task behaves like square.
    square_task = partial(square)
    # Run square once for each element of numbers.
    squared_array = map_task(square_task)(x=numbers)
    print(f"Square of {numbers} is {squared_array}")
    return squared_array

Now run the command below in the terminal to deploy your workflow. Replace the value of --workspace_fqn with your workspace FQN, which you can find in the UI.

tfy deploy workflow \
  --name map-task-test \
  --file workflow.py \
  --workspace_fqn "Paste your workspace FQN here"

In this example:

  • square: This is a simple task that squares a given integer.
  • map_task: The map_task function is used to apply the square task to each element of the numbers list.
  • my_map_workflow: This workflow demonstrates how to use the map task to process a list of numbers in parallel.
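
The example workflow wraps square in functools.partial without binding any arguments. In general, partial is useful when a function takes several parameters and only one of them should vary per element. The sketch below shows that binding in plain Python only. The function power and its exponent parameter are hypothetical and not part of the workflow above, and whether map_task accepts a partial with bound arguments in your truefoundry version is an assumption to verify against the library documentation.

```python
from functools import partial


def power(x: int, exponent: int) -> int:
    """Hypothetical two-argument function used only for illustration."""
    return x ** exponent


# Bind exponent once; the resulting callable only needs x.
cube = partial(power, exponent=3)

# Each call supplies the varying argument, as a map would per element.
cubes = [cube(x=n) for n in [1, 2, 3]]
print(cubes)  # [1, 8, 27]
```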

Map task in the UI