# Protocol Documentation

## flyteidl/event/event.proto

### DynamicWorkflowNodeMetadata
For dynamic workflow nodes we send information about the dynamic workflow definition that gets generated.
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | | | id represents the unique identifier of the workflow. |
| compiled_workflow | | | Represents the compiled representation of the embedded dynamic workflow. |
### ExternalResourceInfo
This message contains metadata about external resources produced or used by a specific task execution.
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| external_id | | | Identifier for an external resource created by this task execution, for example a Qubole query ID or Presto query IDs. |
| index | | | A unique index for the external resource with respect to all external resources for this task. Although the identifier may change between task reporting events or retries, this index remains the same, enabling aggregation of information from multiple reports. |
| retry_attempt | | | Retry attempt number for this external resource, e.g., 2 for the second attempt. |
| phase | | | Phase associated with the external resource. |
| cache_status | | | Captures the status of caching for this external resource execution. |
| logs | | repeated | Log information for the external resource execution. |
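Because `index` stays stable while `external_id` may change between reporting events or retries, a consumer can key its state by `index` and merge successive reports. A minimal sketch of that aggregation (the dict-based report shape and last-write-wins merge policy are assumptions, not part of the protocol):

```python
def aggregate_external_resources(reports):
    """Merge repeated ExternalResourceInfo reports, keyed by their stable index."""
    by_index = {}
    for report in reports:
        current = by_index.setdefault(report["index"], {})
        # Later reports overwrite earlier values for the same index
        # (e.g. a refreshed external_id or an advanced phase).
        current.update(report)
    return [by_index[i] for i in sorted(by_index)]
```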
### NodeExecutionEvent

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | | | Unique identifier for this node execution. |
| producer_id | | | The id of the originator (Propeller) of the event. |
| phase | | | |
| occurred_at | | | This timestamp represents when the original event occurred; it is generated by the executor of the node. |
| input_uri | | | |
| output_uri | | | URL to the output of the execution; it encodes all the information, including the cloud source provider, e.g., s3://… |
| error | | | Error information for the execution. |
| output_data | | | Raw output data produced by this node execution. |
| workflow_node_metadata | | | |
| task_node_metadata | | | |
| parent_task_metadata | | | [To be deprecated] Specifies which task (if any) launched this node. |
| parent_node_metadata | | | Specifies the parent node of the current node execution. Node executions at level zero will not have a parent node. |
| retry_group | | | Retry group to indicate grouping of nodes by retries. |
| spec_node_id | | | Identifier of the node in the original workflow/graph. This maps to the value of WorkflowTemplate.nodes[X].id. |
| node_name | | | Friendly, readable name for the node. |
| event_version | | | |
| is_parent | | | Whether this node launched a subworkflow. |
| is_dynamic | | | Whether this node yielded a dynamic workflow. |
| deck_uri | | | String location uniquely identifying where the deck HTML file is. NativeUrl specifies the URL in the format of the configured storage provider (e.g. s3://my-bucket/randomstring/suffix.tar). |
### ParentNodeExecutionMetadata

### ParentTaskExecutionMetadata

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | | | |
### ResourcePoolInfo
This message holds task execution metadata specific to resource allocation used to manage concurrent executions for a project namespace.
### TaskExecutionEvent
Plugin specific execution event information. For tasks like Python, Hive, Spark, DynamicJob.
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| task_id | | | ID of the task. In combination with the retry attempt, this uniquely identifies the task execution for a given parent node execution. |
| parent_node_execution_id | | | A task execution is always kicked off by a node execution; the event consumer will use the parent_id to relate the task to its parent node execution. |
| retry_attempt | | | Retry attempt number for this task, e.g., 2 for the second attempt. |
| phase | | | Phase associated with the event. |
| producer_id | | | id of the process that sent this event, mainly for trace debugging. |
| logs | | repeated | Log information for the task execution. |
| occurred_at | | | This timestamp represents when the original event occurred; it is generated by the executor of the task. |
| input_uri | | | URI of the input file; it encodes all the information, including the cloud source provider, e.g., s3://… |
| output_uri | | | URI to the output of the execution; it will be in a format that encodes all the information, including the cloud source provider, e.g., s3://… |
| error | | | Error information for the execution. |
| output_data | | | Raw output data produced by this task execution. |
| custom_info | | | Custom data that the task plugin sends back. This is extensible to allow various plugins in the system. |
| phase_version | | | Some phases, like RUNNING, can send multiple events with changed metadata (new logs, additional custom_info, etc.) that should be recorded regardless of the lack of a phase change. The version field should be incremented when metadata changes across the duration of an individual phase. |
| reason | | | An optional explanation for the phase transition. |
| task_type | | | A predefined yet extensible task type identifier. If the task definition is already registered in Flyte Admin this type will be identical, but not all task executions necessarily use pre-registered definitions, and this type is useful for rendering the task in the UI, filtering task executions, etc. |
| metadata | | | Metadata around how a task was executed. |
| event_version | | | The event version is used to indicate versioned changes in how data is reported using this proto message. For example, event_version > 0 means that map tasks report logs using the TaskExecutionMetadata ExternalResourceInfo fields for each subtask rather than the TaskLog in this message. |
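The phase_version semantics above imply a simple rule for event consumers: record an update when the phase changes, or when the same phase arrives with an incremented version (new logs, additional custom_info, and so on). A minimal sketch of that check, assuming events are summarized as (phase, phase_version) pairs (a representation chosen here for illustration, not part of the proto):

```python
def should_record(last, event):
    """Return True when a TaskExecutionEvent carries new information.

    `last` and `event` are (phase, phase_version) pairs; `last` is None
    when no prior event has been seen for this task execution.
    """
    if last is None:
        return True
    last_phase, last_version = last
    phase, version = event
    # A new phase, or the same phase with an incremented version,
    # means metadata changed and the event should be recorded.
    return phase != last_phase or version > last_version
```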
### TaskExecutionMetadata

Holds metadata around how a task was executed. As a task transitions across event phases during execution, some attributes, such as its generated name and generated external resources, may grow in size but do not necessarily change based on the phase transition that sparked the event update. Metadata is a container for these attributes across the task execution lifecycle.
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| generated_name | | | Unique, generated name for this task execution, used by the backend. |
| external_resources | | repeated | Additional data on external resources on other back-ends or platforms (e.g. Hive, Qubole) launched by this task execution. |
| resource_pool_info | | repeated | Includes additional data on concurrent resource management used during execution. This is a repeated field because a plugin can request multiple resource allocations during execution. |
| plugin_identifier | | | The identifier of the plugin used to execute this task. |
| instance_class | | | |
### TaskNodeMetadata

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| cache_status | | | Captures the status of caching for this execution. |
| catalog_key | | | This structure carries the catalog artifact information. |
| reservation_status | | | Captures the status of cache reservations for this execution. |
| checkpoint_uri | | | The latest checkpoint location. |
| dynamic_workflow | | | In case this task launched a dynamic workflow, we capture its structure here. |
### WorkflowExecutionEvent

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| execution_id | | | Workflow execution id. |
| producer_id | | | The id of the originator (Propeller) of the event. |
| phase | | | |
| occurred_at | | | This timestamp represents when the original event occurred; it is generated by the executor of the workflow. |
| output_uri | | | URL to the output of the execution; it encodes all the information, including the cloud source provider, e.g., s3://… |
| error | | | Error information for the execution. |
| output_data | | | Raw output data produced by this workflow execution. |
### WorkflowNodeMetadata

For workflow nodes, we need to send information about the workflow that's launched.

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| execution_id | | | |
### TaskExecutionMetadata.InstanceClass

Includes the broad category of machine used for this specific task execution.

| Name | Number | Description |
| --- | --- | --- |
| DEFAULT | 0 | The default instance class configured for the Flyte application platform. |
| INTERRUPTIBLE | 1 | The instance class configured for interruptible tasks. |
## google/protobuf/timestamp.proto

### Timestamp
A Timestamp represents a point in time independent of any time zone or local calendar, encoded as a count of seconds and fractions of seconds at nanosecond resolution. The count is relative to an epoch at UTC midnight on January 1, 1970, in the proleptic Gregorian calendar which extends the Gregorian calendar backwards to year one.
All minutes are 60 seconds long. Leap seconds are “smeared” so that no leap second table is needed for interpretation, using a [24-hour linear smear](https://developers.google.com/time/smear).
The range is from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z. By restricting to that range, we ensure that we can convert to and from [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) date strings.
#### Examples

Example 1: Compute Timestamp from POSIX `time()`.

```cpp
Timestamp timestamp;
timestamp.set_seconds(time(NULL));
timestamp.set_nanos(0);
```

Example 2: Compute Timestamp from POSIX `gettimeofday()`.

```cpp
struct timeval tv;
gettimeofday(&tv, NULL);

Timestamp timestamp;
timestamp.set_seconds(tv.tv_sec);
timestamp.set_nanos(tv.tv_usec * 1000);
```

Example 3: Compute Timestamp from Win32 `GetSystemTimeAsFileTime()`.

```cpp
FILETIME ft;
GetSystemTimeAsFileTime(&ft);
UINT64 ticks = (((UINT64)ft.dwHighDateTime) << 32) | ft.dwLowDateTime;

// A Windows tick is 100 nanoseconds. Windows epoch 1601-01-01T00:00:00Z
// is 11644473600 seconds before Unix epoch 1970-01-01T00:00:00Z.
Timestamp timestamp;
timestamp.set_seconds((INT64) ((ticks / 10000000) - 11644473600LL));
timestamp.set_nanos((INT32) ((ticks % 10000000) * 100));
```

Example 4: Compute Timestamp from Java `System.currentTimeMillis()`.

```java
long millis = System.currentTimeMillis();

Timestamp timestamp = Timestamp.newBuilder().setSeconds(millis / 1000)
    .setNanos((int) ((millis % 1000) * 1000000)).build();
```

Example 5: Compute Timestamp from Java `Instant.now()`.

```java
Instant now = Instant.now();

Timestamp timestamp =
    Timestamp.newBuilder().setSeconds(now.getEpochSecond())
        .setNanos(now.getNano()).build();
```

Example 6: Compute Timestamp from current time in Python.

```python
timestamp = Timestamp()
timestamp.GetCurrentTime()
```
#### JSON Mapping
In JSON format, the Timestamp type is encoded as a string in the [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format. That is, the format is “{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z” where {year} is always expressed using four digits while {month}, {day}, {hour}, {min}, and {sec} are zero-padded to two digits each. The fractional seconds, which can go up to 9 digits (i.e. up to 1 nanosecond resolution), are optional. The “Z” suffix indicates the timezone (“UTC”); the timezone is required. A proto3 JSON serializer should always use UTC (as indicated by “Z”) when printing the Timestamp type and a proto3 JSON parser should be able to accept both UTC and other timezones (as indicated by an offset).
For example, “2017-01-15T01:30:15.01Z” encodes 15.01 seconds past 01:30 UTC on January 15, 2017.
In JavaScript, one can convert a Date object to this format using the standard [toISOString()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString) method. In Python, a standard datetime.datetime object can be converted to this format using [strftime](https://docs.python.org/2/library/time.html#time.strftime) with the time format spec ‘%Y-%m-%dT%H:%M:%S.%fZ’. Likewise, in Java, one can use the Joda Time’s [ISODateTimeFormat.dateTime()]( http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTime%2D%2D ) to obtain a formatter capable of generating timestamps in this format.
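The RFC 3339 mapping described above can be sketched in plain Python, without the protobuf runtime: render the whole seconds in UTC, append up to nine fractional digits with trailing zeros trimmed, and suffix "Z". A minimal sketch for non-negative nanos (the function name is illustrative, not a library API):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def timestamp_to_rfc3339(seconds: int, nanos: int) -> str:
    """Encode a Timestamp-style (seconds, nanos) pair as an RFC 3339 string."""
    base = EPOCH + timedelta(seconds=seconds)
    text = base.strftime("%Y-%m-%dT%H:%M:%S")
    if nanos:
        # Fractional seconds: up to 9 digits, trailing zeros trimmed.
        text += ("." + f"{nanos:09d}").rstrip("0")
    # The "Z" (UTC) suffix is mandatory in the proto3 JSON serialization.
    return text + "Z"
```

For instance, 10,000,000 nanos renders as the ".01" fraction in the document's "2017-01-15T01:30:15.01Z" example.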
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| seconds | | | Represents seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z. Must be from 0001-01-01T00:00:00Z to 9999-12-31T23:59:59Z inclusive. |
| nanos | | | Non-negative fractions of a second at nanosecond resolution. Negative second values with fractions must still have non-negative nanos values that count forward in time. Must be from 0 to 999,999,999 inclusive. |
## google/protobuf/duration.proto

### Duration
A Duration represents a signed, fixed-length span of time, encoded as a count of seconds and fractions of seconds at nanosecond resolution. It is independent of any calendar and of concepts like "day" or "month". It is related to Timestamp in that the difference between two Timestamp values is a Duration, and a Duration can be added to or subtracted from a Timestamp. The range is approximately ±10,000 years.
#### Examples
Example 1: Compute Duration from two Timestamps in pseudo code.

```
Timestamp start = ...;
Timestamp end = ...;
Duration duration = ...;

duration.seconds = end.seconds - start.seconds;
duration.nanos = end.nanos - start.nanos;

if (duration.seconds < 0 && duration.nanos > 0) {
  duration.seconds += 1;
  duration.nanos -= 1000000000;
} else if (duration.seconds > 0 && duration.nanos < 0) {
  duration.seconds -= 1;
  duration.nanos += 1000000000;
}
```
Example 2: Compute Timestamp from Timestamp + Duration in pseudo code.

```
Timestamp start = ...;
Duration duration = ...;
Timestamp end = ...;

end.seconds = start.seconds + duration.seconds;
end.nanos = start.nanos + duration.nanos;

if (end.nanos < 0) {
  end.seconds -= 1;
  end.nanos += 1000000000;
} else if (end.nanos >= 1000000000) {
  end.seconds += 1;
  end.nanos -= 1000000000;
}
```
Example 3: Compute Duration from datetime.timedelta in Python.

```python
td = datetime.timedelta(days=3, minutes=10)
duration = Duration()
duration.FromTimedelta(td)
```
#### JSON Mapping
In JSON format, the Duration type is encoded as a string rather than an object, where the string ends in the suffix “s” (indicating seconds) and is preceded by the number of seconds, with nanoseconds expressed as fractional seconds. For example, 3 seconds with 0 nanoseconds should be encoded in JSON format as “3s”, while 3 seconds and 1 nanosecond should be expressed in JSON format as “3.000000001s”, and 3 seconds and 1 microsecond should be expressed in JSON format as “3.000001s”.
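The encoding rule above ("3s", "3.000000001s", "3.000001s") can be sketched in plain Python. This is a minimal illustration for the common case where seconds and nanos share the same sign, as the field descriptions below require (the function name is illustrative, not a library API):

```python
def duration_to_json(seconds: int, nanos: int) -> str:
    """Encode a Duration-style (seconds, nanos) pair as a proto3 JSON string."""
    if nanos == 0:
        # No fractional part: just the seconds count with the "s" suffix.
        return f"{seconds}s"
    # Render nanos as up to 9 fractional digits, trimming trailing zeros
    # ("3.000001s", not "3.000001000s").
    frac = f"{abs(nanos):09d}".rstrip("0")
    sign = "-" if (seconds < 0 or nanos < 0) else ""
    return f"{sign}{abs(seconds)}.{frac}s"
```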
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| seconds | | | Signed seconds of the span of time. Must be from -315,576,000,000 to +315,576,000,000 inclusive. Note: these bounds are computed from: 60 sec/min * 60 min/hr * 24 hr/day * 365.25 days/year * 10000 years |
| nanos | | | Signed fractions of a second at nanosecond resolution of the span of time. Durations less than one second are represented with a 0 seconds field and a positive or negative nanos field. For durations of one second or more, a non-zero value for the nanos field must be of the same sign as the seconds field. Must be from -999,999,999 to +999,999,999 inclusive. |
## google/protobuf/struct.proto

### ListValue
ListValue is a wrapper around a repeated field of values.
The JSON representation for ListValue is JSON array.
### Struct

Struct represents a structured data value, consisting of fields which map to dynamically typed values. In some languages, Struct might be supported by a native representation. For example, in scripting languages like JavaScript, a struct is represented as an object. The details of that representation are described together with the proto support for the language.
The JSON representation for Struct is JSON object.
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| fields | | repeated | Unordered map of dynamically typed values. |
### Struct.FieldsEntry

### Value
Value represents a dynamically typed value which can be either null, a number, a string, a boolean, a recursive struct value, or a list of values. A producer of value is expected to set one of these variants. Absence of any variant indicates an error.
The JSON representation for Value is JSON value.
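The union described above maps onto JSON values one variant at a time. A minimal sketch of that mapping, modeling a Value as a (variant name, payload) pair in plain Python (this representation is an assumption for illustration; the real message is a protobuf oneof):

```python
def value_to_json(variant):
    """Map a Value-style union (exactly one variant set) to a JSON-ready object."""
    kind, payload = variant
    if kind == "null_value":
        return None                       # JSON null
    if kind in ("number_value", "string_value", "bool_value"):
        return payload                    # JSON number / string / boolean
    if kind == "struct_value":
        # A Struct maps field names to nested Value variants.
        return {k: value_to_json(v) for k, v in payload.items()}
    if kind == "list_value":
        # A ListValue wraps a repeated field of Values.
        return [value_to_json(v) for v in payload]
    raise ValueError("a Value must set exactly one variant")
```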
| Field | Type | Label | Description |
| --- | --- | --- | --- |
| null_value | | | Represents a null value. |
| number_value | | | Represents a double value. |
| string_value | | | Represents a string value. |
| bool_value | | | Represents a boolean value. |
| struct_value | | | Represents a structured value. |
| list_value | | | Represents a repeated Value. |
### NullValue
NullValue is a singleton enumeration to represent the null value for the Value type union.
The JSON representation for NullValue is JSON null.
| Name | Number | Description |
| --- | --- | --- |
| NULL_VALUE | 0 | Null value. |