Protocol Documentation#

flyteidl/event/event.proto#

DynamicWorkflowNodeMetadata#

For dynamic workflow nodes we send information about the dynamic workflow definition that gets generated.

DynamicWorkflowNodeMetadata type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | Identifier | | id represents the unique identifier of the workflow. |
| compiled_workflow | CompiledWorkflowClosure | | Represents the compiled representation of the embedded dynamic workflow. |

ExternalResourceInfo#

This message contains metadata about external resources produced or used by a specific task execution.

ExternalResourceInfo type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| external_id | string | | Identifier for an external resource created by this task execution, for example a Qubole query ID or Presto query IDs. |
| index | uint32 | | A unique index for the external resource with respect to all external resources for this task. Although the identifier may change between task reporting events or retries, this will remain the same to enable aggregating information from multiple reports. |
| retry_attempt | uint32 | | Retry attempt number for this external resource, e.g., 2 for the second attempt. |
| phase | TaskExecution.Phase | | Phase associated with the external resource. |
| cache_status | CatalogCacheStatus | | Captures the status of caching for this external resource execution. |
| logs | TaskLog | repeated | Log information for the external resource execution. |
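
The index field is described as stable across reports so that a consumer can aggregate them. A minimal sketch of that idea follows, using a hypothetical plain-Python stand-in for ExternalResourceInfo rather than the generated flyteidl classes. The phase values are plain strings here, and the merge policy (later reports overwrite earlier ones, logs are accumulated) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class ExternalResource:
    """Hypothetical stand-in for ExternalResourceInfo (not the generated class)."""
    external_id: str
    index: int
    retry_attempt: int
    phase: str  # assumption: phase shown as a string instead of TaskExecution.Phase
    logs: List[str] = field(default_factory=list)


def merge_reports(reports: Iterable[ExternalResource]) -> Dict[int, ExternalResource]:
    """Fold successive reports into one record per index; later reports win."""
    merged: Dict[int, ExternalResource] = {}
    for report in reports:
        previous = merged.get(report.index)
        # Start from the logs already seen for this index, then add new ones.
        logs = list(previous.logs) if previous else []
        for log in report.logs:
            if log not in logs:
                logs.append(log)
        merged[report.index] = ExternalResource(
            report.external_id, report.index, report.retry_attempt, report.phase, logs
        )
    return merged


reports = [
    ExternalResource("query-a", 0, 0, "RUNNING", ["log-1"]),
    ExternalResource("query-b", 1, 0, "RUNNING"),
    # A retry changes external_id but keeps the same index.
    ExternalResource("query-c", 0, 1, "SUCCEEDED", ["log-2"]),
]
merged = merge_reports(reports)
```

Keying on index rather than external_id is what lets the retry with a new identifier update the existing record instead of creating a third one.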

NodeExecutionEvent#

NodeExecutionEvent type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | NodeExecutionIdentifier | | Unique identifier for this node execution. |
| producer_id | string | | The ID of the originator (Propeller) of the event. |
| phase | NodeExecution.Phase | | |
| occurred_at | Timestamp | | This timestamp represents when the original event occurred; it is generated by the executor of the node. |
| input_uri | string | | |
| output_uri | string | | URL to the output of the execution. It encodes all the information, including the cloud source provider, e.g., s3://… |
| error | ExecutionError | | Error information for the execution. |
| output_data | LiteralMap | | Raw output data produced by this node execution. |
| workflow_node_metadata | WorkflowNodeMetadata | | |
| task_node_metadata | TaskNodeMetadata | | |
| parent_task_metadata | ParentTaskExecutionMetadata | | [To be deprecated] Specifies which task (if any) launched this node. |
| parent_node_metadata | ParentNodeExecutionMetadata | | Specifies the parent node of the current node execution. Node executions at level zero will not have a parent node. |
| retry_group | string | | Retry group to indicate grouping of nodes by retries. |
| spec_node_id | string | | Identifier of the node in the original workflow/graph. This maps to the value of WorkflowTemplate.nodes[X].id. |
| node_name | string | | Friendly readable name for the node. |
| event_version | int32 | | |
| is_parent | bool | | Whether this node launched a subworkflow. |
| is_dynamic | bool | | Whether this node yielded a dynamic workflow. |
| deck_uri | string | | String location uniquely identifying where the deck HTML file is. NativeUrl specifies the URL in the format of the configured storage provider (e.g. s3://my-bucket/randomstring/suffix.tar). |

ParentNodeExecutionMetadata#

ParentNodeExecutionMetadata type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| node_id | string | | Unique identifier of the parent node within the execution. This is the value of core.NodeExecutionIdentifier.node_id of the parent node. |

ParentTaskExecutionMetadata#

ParentTaskExecutionMetadata type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | TaskExecutionIdentifier | | |

ResourcePoolInfo#

This message holds task execution metadata specific to resource allocation used to manage concurrent executions for a project namespace.

ResourcePoolInfo type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| allocation_token | string | | Unique resource ID used to identify this execution when allocating a token. |
| namespace | string | | Namespace under which this task execution requested an allocation token. |

TaskExecutionEvent#

Plugin-specific execution event information, for tasks such as Python, Hive, Spark, and DynamicJob.

TaskExecutionEvent type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| task_id | Identifier | | ID of the task. In combination with retry_attempt, this uniquely identifies the task execution for a given parent node execution. |
| parent_node_execution_id | NodeExecutionIdentifier | | A task execution is always kicked off by a node execution; the event consumer uses parent_node_execution_id to relate the task to its parent node execution. |
| retry_attempt | uint32 | | Retry attempt number for this task, e.g., 2 for the second attempt. |
| phase | TaskExecution.Phase | | Phase associated with the event. |
| producer_id | string | | ID of the process that sent this event, mainly for trace debugging. |
| logs | TaskLog | repeated | Log information for the task execution. |
| occurred_at | Timestamp | | This timestamp represents when the original event occurred; it is generated by the executor of the task. |
| input_uri | string | | URI of the input file. It encodes all the information, including the cloud source provider, e.g., s3://… |
| output_uri | string | | URI to the output of the execution. It will be in a format that encodes all the information, including the cloud source provider, e.g., s3://… |
| error | ExecutionError | | Error information for the execution. |
| output_data | LiteralMap | | Raw output data produced by this task execution. |
| custom_info | Struct | | Custom data that the task plugin sends back. This is extensible to allow various plugins in the system. |
| phase_version | uint32 | | Some phases, like RUNNING, can send multiple events with changed metadata (new logs, additional custom_info, etc.) that should be recorded even though the phase has not changed. The version field should be incremented whenever metadata changes during an individual phase. |
| reason | string | | An optional explanation for the phase transition. |
| task_type | string | | A predefined yet extensible task type identifier. If the task definition is already registered in Flyte Admin, this type will be identical, but not all task executions necessarily use pre-registered definitions, and this type is useful to render the task in the UI, filter task executions, etc. |
| metadata | TaskExecutionMetadata | | Metadata around how a task was executed. |
| event_version | int32 | | The event version is used to indicate versioned changes in how data is reported using this proto message. For example, event_version > 0 means that map tasks report logs using the TaskExecutionMetadata ExternalResourceInfo fields for each subtask rather than the TaskLog in this message. |
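
The phase_version description implies that a consumer should record an event when either the phase changes or the version within the same phase increases. The sketch below illustrates one way a consumer might apply that rule. PhaseTracker, its method, and the task key are hypothetical names invented for this example and are not part of flyteidl; phases are shown as plain strings.

```python
from typing import Dict, Tuple


class PhaseTracker:
    """Hypothetical consumer-side filter; not part of flyteidl."""

    def __init__(self) -> None:
        # task key -> (phase, phase_version) of the last event recorded
        self._last: Dict[str, Tuple[str, int]] = {}

    def should_record(self, task_key: str, phase: str, phase_version: int) -> bool:
        """Return True for a new phase or a newer version of the current phase."""
        previous = self._last.get(task_key)
        if previous is not None and previous[0] == phase and phase_version <= previous[1]:
            return False  # same phase and no newer metadata: treat as a duplicate
        self._last[task_key] = (phase, phase_version)
        return True


tracker = PhaseTracker()
results = [
    tracker.should_record("task-0", "RUNNING", 0),    # first report of RUNNING
    tracker.should_record("task-0", "RUNNING", 0),    # repeated report
    tracker.should_record("task-0", "RUNNING", 1),    # same phase, new metadata
    tracker.should_record("task-0", "SUCCEEDED", 0),  # phase transition
]
```

Only the repeated report is dropped; the version bump within RUNNING is recorded even though the phase did not change.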

TaskExecutionMetadata#

Holds metadata around how a task was executed. As a task transitions across event phases during execution, some attributes, such as its generated name and generated external resources, may grow in size but do not necessarily change based on the phase transition that triggered the event update. Metadata is a container for these attributes across the task execution lifecycle.

TaskExecutionMetadata type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| generated_name | string | | Unique, generated name for this task execution used by the backend. |
| external_resources | ExternalResourceInfo | repeated | Additional data on external resources on other back-ends or platforms (e.g. Hive, Qubole, etc.) launched by this task execution. |
| resource_pool_info | ResourcePoolInfo | repeated | Includes additional data on concurrent resource management used during execution. This is a repeated field because a plugin can request multiple resource allocations during execution. |
| plugin_identifier | string | | The identifier of the plugin used to execute this task. |
| instance_class | TaskExecutionMetadata.InstanceClass | | |

TaskNodeMetadata#

TaskNodeMetadata type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| cache_status | CatalogCacheStatus | | Captures the status of caching for this execution. |
| catalog_key | CatalogMetadata | | This structure carries the catalog artifact information. |
| reservation_status | CatalogReservation.Status | | Captures the status of cache reservations for this execution. |
| checkpoint_uri | string | | The latest checkpoint location. |
| dynamic_workflow | DynamicWorkflowNodeMetadata | | In the case this task launched a dynamic workflow we capture its structure here. |

WorkflowExecutionEvent#

WorkflowExecutionEvent type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| execution_id | WorkflowExecutionIdentifier | | Workflow execution ID. |
| producer_id | string | | The ID of the originator (Propeller) of the event. |
| phase | WorkflowExecution.Phase | | |
| occurred_at | Timestamp | | This timestamp represents when the original event occurred; it is generated by the executor of the workflow. |
| output_uri | string | | URL to the output of the execution. It encodes all the information, including the cloud source provider, e.g., s3://… |
| error | ExecutionError | | Error information for the execution. |
| output_data | LiteralMap | | Raw output data produced by this workflow execution. |

WorkflowNodeMetadata#

For workflow nodes we need to send information about the workflow that is launched.

WorkflowNodeMetadata type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| execution_id | WorkflowExecutionIdentifier | | |

TaskExecutionMetadata.InstanceClass#

Includes the broad category of machine used for this specific task execution.

Enum TaskExecutionMetadata.InstanceClass values#

| Name | Number | Description |
| --- | --- | --- |
| DEFAULT | 0 | The default instance class configured for the flyte application platform. |
| INTERRUPTIBLE | 1 | The instance class configured for interruptible tasks. |

google/protobuf/timestamp.proto#

Timestamp#

A Timestamp represents a point in time independent of any time zone or local calendar, encoded as a count of seconds and fractions of seconds at nanosecond resolution. The count is relative to an epoch at UTC midnight on January 1, 1970, in the proleptic Gregorian calendar which extends the Gregorian calendar backwards to year one.

All minutes are 60 seconds long. Leap seconds are “smeared” so that no leap second table is needed for interpretation, using a [24-hour linear smear](https://developers.google.com/time/smear).

The range is from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z. By restricting to that range, we ensure that we can convert to and from [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) date strings.

# Examples

Example 1: Compute Timestamp from POSIX time().

    Timestamp timestamp;
    timestamp.set_seconds(time(NULL));
    timestamp.set_nanos(0);

Example 2: Compute Timestamp from POSIX gettimeofday().

    struct timeval tv;
    gettimeofday(&tv, NULL);

    Timestamp timestamp;
    timestamp.set_seconds(tv.tv_sec);
    timestamp.set_nanos(tv.tv_usec * 1000);

Example 3: Compute Timestamp from Win32 GetSystemTimeAsFileTime().

    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);
    UINT64 ticks = (((UINT64)ft.dwHighDateTime) << 32) | ft.dwLowDateTime;

    // A Windows tick is 100 nanoseconds. Windows epoch 1601-01-01T00:00:00Z
    // is 11644473600 seconds before Unix epoch 1970-01-01T00:00:00Z.
    Timestamp timestamp;
    timestamp.set_seconds((INT64) ((ticks / 10000000) - 11644473600LL));
    timestamp.set_nanos((INT32) ((ticks % 10000000) * 100));

Example 4: Compute Timestamp from Java System.currentTimeMillis().

    long millis = System.currentTimeMillis();

    Timestamp timestamp = Timestamp.newBuilder().setSeconds(millis / 1000)
        .setNanos((int) ((millis % 1000) * 1000000)).build();

Example 5: Compute Timestamp from Java Instant.now().

    Instant now = Instant.now();

    Timestamp timestamp =
        Timestamp.newBuilder().setSeconds(now.getEpochSecond())
            .setNanos(now.getNano()).build();

Example 6: Compute Timestamp from current time in Python.

    timestamp = Timestamp()
    timestamp.GetCurrentTime()

# JSON Mapping

In JSON format, the Timestamp type is encoded as a string in the [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format. That is, the format is “{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z” where {year} is always expressed using four digits while {month}, {day}, {hour}, {min}, and {sec} are zero-padded to two digits each. The fractional seconds, which can go up to 9 digits (i.e. up to 1 nanosecond resolution), are optional. The “Z” suffix indicates the timezone (“UTC”); the timezone is required. A proto3 JSON serializer should always use UTC (as indicated by “Z”) when printing the Timestamp type and a proto3 JSON parser should be able to accept both UTC and other timezones (as indicated by an offset).

For example, “2017-01-15T01:30:15.01Z” encodes 15.01 seconds past 01:30 UTC on January 15, 2017.

In JavaScript, one can convert a Date object to this format using the standard [toISOString()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString) method. In Python, a standard datetime.datetime object can be converted to this format using [strftime](https://docs.python.org/2/library/time.html#time.strftime) with the time format spec ‘%Y-%m-%dT%H:%M:%S.%fZ’. Likewise, in Java, one can use Joda Time’s [ISODateTimeFormat.dateTime()](http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTime%2D%2D) to obtain a formatter capable of generating timestamps in this format.
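
The JSON mapping described above can be sketched in plain Python without the protobuf runtime. The function name timestamp_to_json is hypothetical, and trimming trailing zeros from the fraction is an assumption chosen to reproduce the “15.01” example; real serializers may pad the fraction to a fixed number of digits.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_json(seconds: int, nanos: int) -> str:
    """Render (seconds, nanos) as an RFC 3339 string in UTC with a 'Z' suffix."""
    if not 0 <= nanos <= 999_999_999:
        raise ValueError("nanos must be from 0 to 999,999,999 inclusive")
    moment = _EPOCH + timedelta(seconds=seconds)
    # isoformat() zero-pads the year to four digits and the rest to two.
    text = moment.replace(tzinfo=None).isoformat(timespec="seconds")
    # Assumption: drop trailing zeros, so 10,000,000 ns renders as ".01".
    fraction = f"{nanos:09d}".rstrip("0")
    if fraction:
        text += "." + fraction
    return text + "Z"


print(timestamp_to_json(0, 0))  # 1970-01-01T00:00:00Z
```

The nanos value is formatted separately from the datetime because datetime only carries microsecond precision.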

Timestamp type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| seconds | int64 | | Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z. Must be from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59Z inclusive. |
| nanos | int32 | | Non-negative fractions of a second at nanosecond resolution. Negative second values with fractions must still have non-negative nanos values that count forward in time. Must be from 0 to 999,999,999 inclusive. |
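
The rule that negative second values still carry non-negative nanos is easiest to see with a point before the epoch. The sketch below assumes the instant is available as a signed count of nanoseconds since the Unix epoch; split_epoch_nanos is a hypothetical helper name. Python's floor division rounds toward negative infinity, which yields exactly the "count forward in time" behavior described.

```python
from typing import Tuple

NANOS_PER_SECOND = 1_000_000_000


def split_epoch_nanos(total_nanos: int) -> Tuple[int, int]:
    """Split signed nanoseconds since the epoch into (seconds, nanos), nanos >= 0."""
    # divmod floors the quotient, so the remainder is always in [0, 999_999_999].
    seconds, nanos = divmod(total_nanos, NANOS_PER_SECOND)
    return seconds, nanos


# 1.5 seconds before the epoch is second -2 plus 0.5 seconds forward.
print(split_epoch_nanos(-1_500_000_000))  # (-2, 500000000)
```

In languages where integer division truncates toward zero, the same split needs an explicit adjustment when the remainder is negative.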

google/protobuf/duration.proto#

Duration#

A Duration represents a signed, fixed-length span of time represented as a count of seconds and fractions of seconds at nanosecond resolution. It is independent of any calendar and concepts like “day” or “month”. It is related to Timestamp in that the difference between two Timestamp values is a Duration and it can be added or subtracted from a Timestamp. Range is approximately ±10,000 years.

# Examples

Example 1: Compute Duration from two Timestamps in pseudo code.

    Timestamp start = …;
    Timestamp end = …;
    Duration duration = …;

    duration.seconds = end.seconds - start.seconds;
    duration.nanos = end.nanos - start.nanos;

    if (duration.seconds < 0 && duration.nanos > 0) {
      duration.seconds += 1;
      duration.nanos -= 1000000000;
    } else if (duration.seconds > 0 && duration.nanos < 0) {
      duration.seconds -= 1;
      duration.nanos += 1000000000;
    }

Example 2: Compute Timestamp from Timestamp + Duration in pseudo code.

    Timestamp start = …;
    Duration duration = …;
    Timestamp end = …;

    end.seconds = start.seconds + duration.seconds;
    end.nanos = start.nanos + duration.nanos;

    if (end.nanos < 0) {
      end.seconds -= 1;
      end.nanos += 1000000000;
    } else if (end.nanos >= 1000000000) {
      end.seconds += 1;
      end.nanos -= 1000000000;
    }

Example 3: Compute Duration from datetime.timedelta in Python.

    td = datetime.timedelta(days=3, minutes=10)
    duration = Duration()
    duration.FromTimedelta(td)

# JSON Mapping

In JSON format, the Duration type is encoded as a string rather than an object, where the string ends in the suffix “s” (indicating seconds) and is preceded by the number of seconds, with nanoseconds expressed as fractional seconds. For example, 3 seconds with 0 nanoseconds should be encoded in JSON format as “3s”, while 3 seconds and 1 nanosecond should be expressed in JSON format as “3.000000001s”, and 3 seconds and 1 microsecond should be expressed in JSON format as “3.000001s”.
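
A minimal sketch of this encoding in plain Python follows. The name duration_to_json is hypothetical, and trimming trailing zeros from the fraction is an assumption that happens to match the three examples in the paragraph above; real serializers may pad the fraction to a fixed number of digits.

```python
def duration_to_json(seconds: int, nanos: int) -> str:
    """Encode (seconds, nanos) as a decimal number of seconds followed by 's'."""
    if abs(nanos) > 999_999_999:
        raise ValueError("nanos must be from -999,999,999 to +999,999,999")
    if (seconds > 0 and nanos < 0) or (seconds < 0 and nanos > 0):
        raise ValueError("seconds and nanos must have the same sign")
    sign = "-" if seconds < 0 or nanos < 0 else ""
    text = f"{sign}{abs(seconds)}"
    # Assumption: drop trailing zeros, so 1,000 ns renders as ".000001".
    fraction = f"{abs(nanos):09d}".rstrip("0")
    if fraction:
        text += "." + fraction
    return text + "s"


print(duration_to_json(3, 0))     # 3s
print(duration_to_json(3, 1))     # 3.000000001s
print(duration_to_json(3, 1000))  # 3.000001s
```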

Duration type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| seconds | int64 | | Signed seconds of the span of time. Must be from -315,576,000,000 to +315,576,000,000 inclusive. Note: these bounds are computed from: 60 sec/min * 60 min/hr * 24 hr/day * 365.25 days/year * 10000 years |
| nanos | int32 | | Signed fractions of a second at nanosecond resolution of the span of time. Durations less than one second are represented with a 0 seconds field and a positive or negative nanos field. For durations of one second or more, a non-zero value for the nanos field must be of the same sign as the seconds field. Must be from -999,999,999 to +999,999,999 inclusive. |
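
The sign and range rules for the two fields can be checked with a short sketch. normalize_duration is a hypothetical helper, not a protobuf API: it takes a possibly mixed-sign pair, such as the raw result of subtracting two Timestamps field by field, and returns a pair whose fields share a sign.

```python
from typing import Tuple

NANOS_PER_SECOND = 1_000_000_000
# 60 sec/min * 60 min/hr * 24 hr/day * 365.25 days/year * 10000 years
MAX_SECONDS = 315_576_000_000


def normalize_duration(seconds: int, nanos: int) -> Tuple[int, int]:
    """Return an equivalent (seconds, nanos) pair whose fields share one sign."""
    total = seconds * NANOS_PER_SECOND + nanos
    sign = -1 if total < 0 else 1
    # Split the magnitude, then reapply the sign to both parts.
    whole, fraction = divmod(abs(total), NANOS_PER_SECOND)
    if whole > MAX_SECONDS:
        raise ValueError("duration is outside the range of about 10,000 years")
    return sign * whole, sign * fraction


# end = (12 s, 100,000,000 ns), start = (10 s, 900,000,000 ns):
# the field-wise difference is (2, -800,000,000), which is 1.2 seconds.
print(normalize_duration(2, -800_000_000))  # (1, 200000000)
```

Working on the total nanosecond count avoids the two-branch adjustment and also handles nanos values that exceed one second in magnitude.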

google/protobuf/struct.proto#

ListValue#

ListValue is a wrapper around a repeated field of values.

The JSON representation for ListValue is JSON array.

ListValue type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| values | Value | repeated | Repeated field of dynamically typed values. |

Struct#

Struct represents a structured data value, consisting of fields which map to dynamically typed values. In some languages, Struct might be supported by a native representation. For example, in scripting languages like JS a struct is represented as an object. The details of that representation are described together with the proto support for the language.

The JSON representation for Struct is JSON object.

Struct type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| fields | Struct.FieldsEntry | repeated | Unordered map of dynamically typed values. |

Struct.FieldsEntry#

Struct.FieldsEntry type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string | | |
| value | Value | | |

Value#

Value represents a dynamically typed value which can be either null, a number, a string, a boolean, a recursive struct value, or a list of values. A producer of a Value is expected to set one of these variants. Absence of any variant indicates an error.

The JSON representation for Value is JSON value.

Value type fields#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| null_value | NullValue | | Represents a null value. |
| number_value | double | | Represents a double value. |
| string_value | string | | Represents a string value. |
| bool_value | bool | | Represents a boolean value. |
| struct_value | Struct | | Represents a structured value. |
| list_value | ListValue | | Represents a repeated Value. |
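
Since the JSON representation of Value is a plain JSON value, each decoded JSON value corresponds to exactly one of the variants listed above. The sketch below shows that correspondence for data parsed with Python's json module; value_kind is a hypothetical helper that only names the variant and does not build protobuf messages.

```python
import json
from typing import Any


def value_kind(obj: Any) -> str:
    """Name the Value variant that a decoded JSON value would populate."""
    if obj is None:
        return "null_value"
    # Check bool before numbers: in Python, bool is a subclass of int.
    if isinstance(obj, bool):
        return "bool_value"
    if isinstance(obj, (int, float)):
        return "number_value"  # stored as a double
    if isinstance(obj, str):
        return "string_value"
    if isinstance(obj, dict):
        return "struct_value"
    if isinstance(obj, list):
        return "list_value"
    raise TypeError(f"not a JSON value: {type(obj).__name__}")


decoded = json.loads('{"a": [1, true, null, "x", {}]}')
kinds = [value_kind(item) for item in decoded["a"]]
```

Because number_value is a double, large integers can lose precision when stored in a Value.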

NullValue#

NullValue is a singleton enumeration to represent the null value for the Value type union.

The JSON representation for NullValue is JSON null.

Enum NullValue values#

| Name | Number | Description |
| --- | --- | --- |
| NULL_VALUE | 0 | Null value. |