Protocol Documentation

flyteidl/plugins/array_job.proto

ArrayJob

Describes a job that can process independent pieces of data concurrently. Multiple copies of the runnable component will be executed concurrently.

.. csv-table:: ArrayJob type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "parallelism", "int64", "", "Defines the maximum number of instances to bring up concurrently at any given point. Note that this is an optimistic restriction and that, due to network partitioning or other failures, the actual number of currently running instances might be more. This has to be a positive number if assigned. Default value is size."
   "size", "int64", "", "Defines the maximum number of instances to launch. This number should match the size of the input if the job requires processing of all input data. This has to be a positive number. If this is not defined, the back-end determines the size at run time by reading the inputs."
   "min_successes", "int64", "", "The minimum absolute number of subtasks that must complete successfully. As soon as this criterion is met, the array job is marked as successful and outputs are computed. This has to be a non-negative number if assigned. Default value is size (if specified)."
   "min_success_ratio", "float", "", "If the array job size is not known beforehand, the min_success_ratio can instead be used to determine when an array job can be marked successful."
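
The defaults described for parallelism and min_successes can be sketched in plain Python. This is only an illustration, not the Flyte back-end logic: the helper names `effective_parallelism` and `required_successes` are hypothetical, `None` stands in for "not assigned", and rounding the ratio up with `math.ceil` is an assumption.

```python
import math
from typing import Optional


def effective_parallelism(size: int, parallelism: Optional[int] = None) -> int:
    """Return the concurrency cap: `parallelism` if assigned, otherwise `size`."""
    if size <= 0:
        raise ValueError("size has to be a positive number")
    if parallelism is None:
        return size  # documented default: parallelism == size
    if parallelism <= 0:
        raise ValueError("parallelism has to be a positive number if assigned")
    return parallelism


def required_successes(
    size: int,
    min_successes: Optional[int] = None,
    min_success_ratio: Optional[float] = None,
) -> int:
    """Return how many subtasks must succeed before the job counts as successful."""
    if size <= 0:
        raise ValueError("size has to be a positive number")
    if min_successes is not None:
        if min_successes < 0:
            raise ValueError("min_successes has to be non-negative if assigned")
        return min_successes
    if min_success_ratio is not None:
        # Assumption: round up so the requested ratio is never undershot.
        return math.ceil(min_success_ratio * size)
    return size  # documented default: min_successes == size


print(effective_parallelism(size=100, parallelism=10))     # 10
print(required_successes(size=100, min_successes=90))      # 90
print(required_successes(size=8, min_success_ratio=0.25))  # 2
```

With neither threshold assigned, every subtask has to succeed, which matches the documented default of size.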

flyteidl/plugins/mpi.proto

DistributedMPITrainingTask

Custom proto for the plugin that enables distributed training using https://github.com/kubeflow/mpi-operator. See the MPI operator proposal: https://github.com/kubeflow/community/blob/master/proposals/mpi-operator-proposal.md

.. csv-table:: DistributedMPITrainingTask type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "num_workers", "int32", "", "Number of workers spawned in the cluster for this job."
   "num_launcher_replicas", "int32", "", "Number of launcher replicas spawned in the cluster for this job. The launcher pod invokes mpirun and communicates with worker pods through MPI."
   "slots", "int32", "", "Number of slots per worker used in the hostfile, that is, the available slots (GPUs) in each pod."
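
How num_workers and slots combine can be sketched as follows. The `MPITaskConfig` class and both helpers are hypothetical stand-ins rather than the generated protobuf classes, the worker host names are invented for illustration, and the `host slots=N` line format is the one Open MPI hostfiles use. The hostfile the operator actually generates may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MPITaskConfig:
    """Hypothetical plain-Python mirror of DistributedMPITrainingTask."""
    num_workers: int
    num_launcher_replicas: int
    slots: int


def total_slots(cfg: MPITaskConfig) -> int:
    """Total MPI slots available across all workers."""
    return cfg.num_workers * cfg.slots


def hostfile_lines(cfg: MPITaskConfig, job_name: str) -> List[str]:
    """One hostfile line per worker; the host naming scheme is an assumption."""
    return [f"{job_name}-worker-{i} slots={cfg.slots}" for i in range(cfg.num_workers)]


cfg = MPITaskConfig(num_workers=2, num_launcher_replicas=1, slots=4)
print(total_slots(cfg))  # 8
print("\n".join(hostfile_lines(cfg, "train")))
```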

flyteidl/plugins/presto.proto

PrestoQuery

This message works with the ‘presto’ task type in the SDK and is the object that will be in the ‘custom’ field of a Presto task’s TaskTemplate.

.. csv-table:: PrestoQuery type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "routing_group", "string", "", ""
   "catalog", "string", "", ""
   "schema", "string", "", ""
   "statement", "string", "", ""
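
As a rough illustration of the shape of this message, the sketch below builds a plain dict holding the four string fields. The helper `presto_query_custom` is hypothetical, and because the field table gives no descriptions, all values shown are placeholders.

```python
import json


def presto_query_custom(routing_group: str, catalog: str, schema: str, statement: str) -> dict:
    """Build a dict with the four PrestoQuery fields, all of which are strings."""
    fields = {
        "routing_group": routing_group,
        "catalog": catalog,
        "schema": schema,
        "statement": statement,
    }
    for name, value in fields.items():
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string")
    return fields


# Placeholder values, chosen only for illustration.
custom = presto_query_custom(
    routing_group="adhoc",
    catalog="hive",
    schema="default",
    statement="SELECT 1",
)
print(json.dumps(custom, indent=2))
```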

flyteidl/plugins/pytorch.proto

DistributedPyTorchTrainingTask

Custom proto for plugin that enables distributed training using https://github.com/kubeflow/pytorch-operator

.. csv-table:: DistributedPyTorchTrainingTask type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "workers", "int32", "", "Number of worker replicas spawned in the cluster for this job."

flyteidl/plugins/qubole.proto

HiveQuery

Defines a query to execute on a Hive cluster.

.. csv-table:: HiveQuery type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "query", "string", "", ""
   "timeout_sec", "uint32", "", ""
   "retryCount", "uint32", "", ""

HiveQueryCollection

Defines a collection of Hive queries.

.. csv-table:: HiveQueryCollection type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "queries", "HiveQuery", "repeated", ""

QuboleHiveJob

This message works with the ‘hive’ task type in the SDK and is the object that will be in the ‘custom’ field of a Hive task’s TaskTemplate.

.. csv-table:: QuboleHiveJob type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "cluster_label", "string", "", ""
   "query_collection", "HiveQueryCollection", "", "Deprecated."
   "tags", "string", "repeated", ""
   "query", "HiveQuery", "", ""
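
The nesting of HiveQuery inside QuboleHiveJob can be sketched with plain dicts. The helpers `hive_query` and `qubole_hive_job` are hypothetical, the dict keys simply copy the field names from the tables, and the example values are placeholders. The sketch populates `query` and leaves out `query_collection`, since the latter is marked deprecated.

```python
from typing import Iterable

UINT32_MAX = 2**32 - 1  # timeout_sec and retryCount are uint32 fields


def hive_query(query: str, timeout_sec: int, retry_count: int) -> dict:
    """Build a dict mirroring the HiveQuery fields, range-checking the uint32 values."""
    for name, value in (("timeout_sec", timeout_sec), ("retryCount", retry_count)):
        if not 0 <= value <= UINT32_MAX:
            raise ValueError(f"{name} must fit in a uint32")
    return {"query": query, "timeout_sec": timeout_sec, "retryCount": retry_count}


def qubole_hive_job(cluster_label: str, query: dict, tags: Iterable[str] = ()) -> dict:
    """Build a dict mirroring QuboleHiveJob without the deprecated query_collection."""
    return {"cluster_label": cluster_label, "tags": list(tags), "query": query}


job = qubole_hive_job(
    cluster_label="default",  # placeholder value
    query=hive_query("SELECT COUNT(*) FROM events", timeout_sec=600, retry_count=2),
    tags=["team:analytics"],  # placeholder value
)
print(job)
```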

flyteidl/plugins/sidecar.proto

SidecarJob

A sidecar job brings up the desired pod_spec. The plugin executor is responsible for keeping the pod alive until the primary container terminates or the task itself times out.

.. csv-table:: SidecarJob type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "pod_spec", "k8s.io.api.core.v1.PodSpec", "", ""
   "primary_container_name", "string", "", ""
   "annotations", "SidecarJob.AnnotationsEntry", "repeated", "Pod annotations"
   "labels", "SidecarJob.LabelsEntry", "repeated", "Pod labels"
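
Since the executor watches the primary container, primary_container_name presumably has to name one of the containers in pod_spec. The sketch below makes that check explicit using plain dicts. The helper `sidecar_job` is hypothetical, the validation is an assumption rather than documented plugin behavior, and the image names are placeholders.

```python
from typing import Dict, Optional


def sidecar_job(
    pod_spec: dict,
    primary_container_name: str,
    annotations: Optional[Dict[str, str]] = None,
    labels: Optional[Dict[str, str]] = None,
) -> dict:
    """Build a dict mirroring SidecarJob; the map fields become plain dicts."""
    names = [c.get("name") for c in pod_spec.get("containers", [])]
    # Assumption: the primary container must be one of the pod's containers.
    if primary_container_name not in names:
        raise ValueError(f"no container named {primary_container_name!r} in pod_spec")
    return {
        "pod_spec": pod_spec,
        "primary_container_name": primary_container_name,
        "annotations": dict(annotations or {}),
        "labels": dict(labels or {}),
    }


pod_spec = {
    "containers": [
        {"name": "primary", "image": "example/task:latest"},    # placeholder image
        {"name": "sidecar", "image": "example/helper:latest"},  # placeholder image
    ]
}
job = sidecar_job(pod_spec, "primary", labels={"team": "ml"})
print(job["primary_container_name"], job["labels"])
```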

SidecarJob.AnnotationsEntry

.. csv-table:: SidecarJob.AnnotationsEntry type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "key", "string", "", ""
   "value", "string", "", ""

SidecarJob.LabelsEntry

.. csv-table:: SidecarJob.LabelsEntry type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "key", "string", "", ""
   "value", "string", "", ""

flyteidl/plugins/spark.proto

SparkApplication

SparkJob

Custom Proto for Spark Plugin.

.. csv-table:: SparkJob type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "applicationType", "SparkApplication.Type", "", ""
   "mainApplicationFile", "string", "", ""
   "mainClass", "string", "", ""
   "sparkConf", "SparkJob.SparkConfEntry", "repeated", ""
   "hadoopConf", "SparkJob.HadoopConfEntry", "repeated", ""
   "executorPath", "string", "", "Executor path for Python jobs."

SparkJob.HadoopConfEntry

.. csv-table:: SparkJob.HadoopConfEntry type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "key", "string", "", ""
   "value", "string", "", ""

SparkJob.SparkConfEntry

.. csv-table:: SparkJob.SparkConfEntry type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "key", "string", "", ""
   "value", "string", "", ""

SparkApplication.Type

.. csv-table:: Enum SparkApplication.Type values
   :header: "Name", "Number", "Description"
   :widths: auto

   "PYTHON", "0", ""
   "JAVA", "1", ""
   "SCALA", "2", ""
   "R", "3", ""
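
To show how the SparkJob fields and the SparkApplication.Type enum fit together, here is a minimal sketch in plain Python. The `SparkApplicationType` class mirrors the enum numbers listed above. The helper `spark_job`, the file path, and the interpreter path are hypothetical placeholders, and the sparkConf and hadoopConf map entries are modeled as ordinary dicts.

```python
from enum import IntEnum
from typing import Dict, Optional


class SparkApplicationType(IntEnum):
    """Mirror of the SparkApplication.Type enum values."""
    PYTHON = 0
    JAVA = 1
    SCALA = 2
    R = 3


def spark_job(
    application_type: int,
    main_application_file: str,
    main_class: str = "",
    spark_conf: Optional[Dict[str, str]] = None,
    hadoop_conf: Optional[Dict[str, str]] = None,
    executor_path: str = "",
) -> dict:
    """Build a dict mirroring SparkJob; an unknown enum number raises ValueError."""
    return {
        "applicationType": SparkApplicationType(application_type).name,
        "mainApplicationFile": main_application_file,
        "mainClass": main_class,
        "sparkConf": dict(spark_conf or {}),
        "hadoopConf": dict(hadoop_conf or {}),
        "executorPath": executor_path,
    }


job = spark_job(
    SparkApplicationType.PYTHON,
    main_application_file="local:///opt/app/main.py",  # placeholder path
    spark_conf={"spark.executor.instances": "2", "spark.executor.memory": "2g"},
    executor_path="/usr/bin/python3",                  # placeholder path
)
print(job["applicationType"], job["sparkConf"])
```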

flyteidl/plugins/tensorflow.proto

DistributedTensorflowTrainingTask

Custom proto for plugin that enables distributed training using https://github.com/kubeflow/tf-operator

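.. csv-table:: DistributedTensorflowTrainingTask type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "workers", "int32", "", "Number of worker replicas spawned in the cluster for this job."
   "ps_replicas", "int32", "", "Number of parameter server (PS) replicas spawned in the cluster for this job."
   "chief_replicas", "int32", "", "Number of chief replicas spawned in the cluster for this job."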

flyteidl/plugins/waitable.proto

Waitable

Represents an Execution that was launched and could be waited on.

.. csv-table:: Waitable type fields
   :header: "Field", "Type", "Label", "Description"
   :widths: auto

   "wf_exec_id", "WorkflowExecutionIdentifier", "", ""
   "phase", "WorkflowExecution.Phase", "", ""
   "workflow_id", "string", "", ""
