Reference tasks

Tags: Intermediate

A flytekit.reference_task() references a Flyte task that has already been defined, serialized, and registered. You can reference tasks from other projects and build workflows that use tasks declared by others. These tasks can run in their own containers, Python runtimes, and flytekit versions, and can even be written in different languages.

The following example illustrates how to use reference tasks.

Note

Reference tasks cannot be run locally. You must mock them out.

Note

To clone and run the example code on this page, see the Flytesnacks repo.

productionizing/reference_task.py
from typing import List

from flytekit import reference_task, workflow
from flytekit.types.file import FlyteFile

# A `flytekit.reference_task` references a Flyte task that has already been defined, serialized, and registered.
# You can reference tasks from other projects and build workflows that use tasks declared by others.
# These tasks can run in their own containers, Python runtimes, and flytekit versions, and can even be
# written in different languages.

# The following example illustrates how to use reference tasks.
# Note that reference tasks cannot be run locally. You must mock them out.


@reference_task(
    project="flytesnacks",
    domain="development",
    name="data_types_and_io.file.normalize_columns",
    version="{{ registration.version }}",
)
def normalize_columns(
    csv_url: FlyteFile,
    column_names: List[str],
    columns_to_normalize: List[str],
    output_location: str,
) -> FlyteFile:
    ...


@workflow
def wf() -> FlyteFile:
    return normalize_columns(
        csv_url="https://people.sc.fsu.edu/~jburkardt/data/csv/biostats.csv",
        column_names=["Name", "Sex", "Age", "Heights (in)", "Weight (lbs)"],
        columns_to_normalize=["Age"],
        output_location="",
    )

Note

The macro {{ registration.version }} is populated by flytectl register during registration. Reference tasks generally do not need it, because it is preferable to bind to a specific version of the task or launch plan. In this example, however, both the task data_types_and_io.file.normalize_columns and the workflow that references it are registered together, so the macro is needed to resolve to the version of the specific Flytesnacks release being registered.

A typical reference task would resemble the following:

@reference_task(
    project="flytesnacks",
    domain="development",
    name="core.flyte_basics.files.normalize_columns",
    # Pin an explicit version: a Git commit SHA as here, or a semver string such as "0.16.0".
    version="d06cebcfbeabc02b545eefa13a01c6ca992940c8",
)
def normalize_columns(...):
    ...
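To make the difference between a pinned version and the registration macro concrete, here is a small sketch of a helper that classifies a version string before it is passed to a reference task. The helper `classify_version` and its categories are hypothetical and not part of flytekit; the patterns below are only illustrative conventions, and a version that matches none of them is reported as "other" rather than rejected.

```python
import re

# Hypothetical helper, not part of flytekit.
_GIT_SHA = re.compile(r"[0-9a-f]{40}")       # full Git commit SHA
_SEMVER = re.compile(r"v?\d+\.\d+\.\d+")     # e.g. 0.16.0 or v0.16.0
_MACRO = re.compile(r"\{\{.*\}\}")           # e.g. {{ registration.version }}


def classify_version(version: str) -> str:
    """Return 'macro', 'git-sha', 'semver', or 'other' for a version string."""
    stripped = version.strip()
    if _MACRO.fullmatch(stripped):
        return "macro"
    if _GIT_SHA.fullmatch(stripped):
        return "git-sha"
    if _SEMVER.fullmatch(stripped):
        return "semver"
    return "other"


if __name__ == "__main__":
    for candidate in (
        "d06cebcfbeabc02b545eefa13a01c6ca992940c8",
        "0.16.0",
        "{{ registration.version }}",
    ):
        print(candidate, "->", classify_version(candidate))
```

A check like this could run in a test suite to flag reference tasks that still carry the macro when a pinned version was intended.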