Configuration

Flytekit Configuration Sources

There are multiple ways to configure flytekit settings:

Command Line Arguments: This is the recommended way of setting configuration values in many cases. For an example, see the pyflyte package command.

Python Config Object: A Config object can be used directly, e.g. when initializing a FlyteRemote object. See here for examples on how to specify a Config object.

Environment Variables: Users can specify these at compile time, but when your task is run, Flyte Propeller will also set configuration to ensure correct interaction with the platform. The environment variables must be specified with the format FLYTE_{SECTION}_{OPTION}, all in upper case, where the section and option names are those used in the INI configuration file. For example, the PlatformConfig.endpoint setting corresponds to the url option in the [platform] section, so the environment variable would be FLYTE_PLATFORM_URL.
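The naming rule above can be sketched in a few lines of Python. This is only an illustration of the FLYTE_{SECTION}_{OPTION} format; the helper flyte_env_var is a hypothetical name introduced here and is not part of flytekit.

```python
import os


def flyte_env_var(section: str, option: str) -> str:
    """Build a variable name in the FLYTE_{SECTION}_{OPTION} format (hypothetical helper)."""
    return f"FLYTE_{section}_{option}".upper()


# The [platform] section's url option maps to FLYTE_PLATFORM_URL.
name = flyte_env_var("platform", "url")
print(name)

# Copy the current environment and add the setting, e.g. to pass to a subprocess.
env = dict(os.environ)
env[name] = "flyte.mycorp.io"
```

Building a copy of the environment, rather than mutating os.environ, keeps the setting scoped to whatever process the copy is handed to.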

Note

Environment variables won’t work for image configuration, which needs to be specified with the pyflyte package --image … option or in a configuration file.

YAML Format Configuration File: A configuration file that contains settings for both flytectl and flytekit. This is the recommended configuration file format. Run the flytectl config init command to create a boilerplate ~/.flyte/config.yaml file, and flytectl --help to learn about all of the YAML configuration options.

See example config.yaml file
config.yaml
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///flyte.mycorp.io
  authType: Pkce
  insecure: true
  clientId: propeller
  scopes:
    - all
storage:
  connection:
    access-key: minio
    endpoint: http://localhost:30084
    secret-key: miniostorage
images:
  xyz: docker.io/xyz:latest
  abc: docker.io/abc
  bcd: docker.io/bcd@sha256:26c68657ccce2cb0a31b330cb0hu3b5e108d467f641c62e13ab40cbec258c68d

INI Format Configuration File: A configuration file for flytekit. By default, flytekit will look for a file in two places:

  1. First, a file named flytekit.config in the Python interpreter’s working directory.

  2. A file in ~/.flyte/config in the home directory as detected by Python.
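The lookup order described above can be sketched as follows. This is a minimal illustration under the assumption that the first existing file wins; find_ini_config is a hypothetical helper, not flytekit's actual implementation.

```python
from pathlib import Path
from typing import Optional


def find_ini_config(cwd: Optional[Path] = None, home: Optional[Path] = None) -> Optional[Path]:
    """Return the first INI config file found, or None (hypothetical helper)."""
    if cwd is None:
        cwd = Path.cwd()
    if home is None:
        home = Path.home()
    candidates = [
        cwd / "flytekit.config",    # 1. the working directory
        home / ".flyte" / "config",  # 2. the home directory
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Prints the path that would be picked up on this machine, or None.
print(find_ini_config())
```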

See example flytekit.config file
flytekit.config
[sdk]
workflow_packages=module1,module2

[platform]
url=flyte.mycorp.io
insecure=true

[auth]
kubernetes_service_account=demo
raw_output_data_prefix=s3://my-bucket

[images]
xyz=docker.io/xyz:latest
abc=docker.io/abc
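
To see how the sections and options in the example file above are laid out, the file can be read with Python's standard configparser module. This is only a sketch for inspecting the file; it is not how flytekit itself loads the configuration, and the variable names are illustrative.

```python
import configparser

# The example flytekit.config contents, inlined so the sketch is self-contained.
INI_TEXT = """
[sdk]
workflow_packages=module1,module2

[platform]
url=flyte.mycorp.io
insecure=true

[auth]
kubernetes_service_account=demo
raw_output_data_prefix=s3://my-bucket

[images]
xyz=docker.io/xyz:latest
abc=docker.io/abc
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Comma-separated options are plain strings and must be split by the reader.
packages = parser.get("sdk", "workflow_packages").split(",")
# getboolean understands values such as true/false, yes/no and on/off.
insecure = parser.getboolean("platform", "insecure")
# Each entry in [images] maps an image name to an image reference.
images = dict(parser["images"])

print(packages, insecure, images)
```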

Warning

The INI format is considered a legacy configuration format. If you’re using a configuration file, we recommend the YAML format instead.

How is configuration used?

Configuration usage can be roughly grouped into the following areas:

  • Compile-time settings: these are settings like the default image and named images, where to look for Flyte code, etc.

  • Platform settings: Where to find the Flyte backend (Admin DNS, whether to use SSL)

  • Registration Run-time settings: these are things like the K8s service account to use, a specific S3/GCS bucket to write off-loaded data (dataframes and files) to, notifications, labels & annotations, etc.

  • Data access settings: Is there a custom S3 endpoint in use? Backoff/retry behavior for accessing S3/GCS, key and password, etc.

  • Other settings: statsd configuration, which applies at run time but is not necessarily relevant to the Flyte platform.

Configuration Objects

The following objects are encapsulated in a parent object called Config.

Config

This is the parent configuration object and holds all the underlying configuration object types.

Serialization Time Settings

These are serialization/compile-time settings that are used by commands like pyflyte package or pyflyte register. They are typically passed in as flags to these CLI commands.

The image configurations are typically either passed in via an --image flag or specified in the YAML or INI configuration files (see examples above).

Image

Image is a structured wrapper for task container images used in object serialization.
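
As a rough illustration of what a structured wrapper for an image reference might hold, the sketch below splits a reference such as docker.io/xyz:latest into its parts. ImageRef and parse_image are hypothetical names introduced for this example; they are not flytekit's Image class and make no claim to match its parsing rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageRef:
    """Hypothetical stand-in for a structured image reference."""
    name: str                     # the short name used in the config, e.g. "xyz"
    fqn: str                      # registry and repository, e.g. "docker.io/xyz"
    tag: Optional[str] = None     # e.g. "latest"
    digest: Optional[str] = None  # e.g. "sha256:..."


def parse_image(name: str, reference: str) -> ImageRef:
    """Split an image reference into repository plus tag or digest."""
    if "@" in reference:
        fqn, digest = reference.split("@", 1)
        return ImageRef(name=name, fqn=fqn, digest=digest)
    # Only a colon after the last "/" separates a tag; an earlier colon
    # belongs to a registry port such as localhost:5000.
    colon = reference.find(":", reference.rfind("/") + 1)
    if colon == -1:
        return ImageRef(name=name, fqn=reference)
    return ImageRef(name=name, fqn=reference[:colon], tag=reference[colon + 1:])


print(parse_image("xyz", "docker.io/xyz:latest"))
print(parse_image("abc", "docker.io/abc"))
```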

ImageConfig

We recommend using ImageConfig.auto(img_name=None) to create an ImageConfig.

SerializationSettings

These settings are provided when serializing workflows and tasks, before registration.

FastSerializationSettings

This object holds the settings necessary to serialize an object so that it can be fast-registered.

Execution Time Settings

Users typically shouldn’t need to be concerned with these configurations, as they are usually set by FlytePropeller or FlyteAdmin. The configurations below are useful for authenticating to a Flyte backend and for configuring data access credentials, secrets, and statsd metrics.

PlatformConfig

This object contains the settings needed to talk to a Flyte backend (essentially, the DNS location of your Admin server).

StatsConfig

Configuration for sending statsd metrics.

SecretsConfig

Configuration for secrets.

S3Config

S3-specific configuration.

GCSConfig

Any GCS-specific configuration.

DataConfig

Any data-storage-specific configuration.