Visualizing Artifacts
Flyte Decks are one of the first-class constructs in Flyte, allowing you to generate static HTML reports associated with any of the artifacts materialized within your tasks.
You can think of Decks as stacks of HTML snippets that are logically grouped by tabs. By default, every task has three decks: an input, an output, and a default deck.
Flyte materializes Decks via Renderers, which are specific implementations of how to generate an HTML report from some Python object.
Enabling Flyte Decks
To enable Flyte Decks, set disable_deck=False in the @task decorator:
import pandas as pd
from flytekit import task, workflow


@task(disable_deck=False)
def iris_data() -> pd.DataFrame:
    ...
Specifying this flag indicates that Decks should be rendered whenever this task is invoked.
Rendering Task Inputs and Outputs
By default, Flyte will render the inputs and outputs of tasks with the built-in renderers in the corresponding input and output Decks, respectively. In the following task, we load the iris dataset using the plotly package.
import pandas as pd
import plotly.express as px
from typing import Optional
from flytekit import task, workflow


@task(disable_deck=False)
def iris_data(
    sample_frac: Optional[float] = None,
    random_state: Optional[int] = None,
) -> pd.DataFrame:
    data = px.data.iris()
    # Optionally downsample the dataset; random_state makes the sample reproducible.
    if sample_frac is not None:
        data = data.sample(frac=sample_frac, random_state=random_state)
    return data


@workflow
def wf(
    sample_frac: Optional[float] = None,
    random_state: Optional[int] = None,
):
    iris_data(sample_frac=sample_frac, random_state=random_state)
Then, invoking the workflow containing a deck-enabled task will render the following reports for the input and output data in an HTML file, which you can see in the logs:
wf(sample_frac=1.0, random_state=42)
{"asctime": "2023-09-22 03:27:07,571", "name": "flytekit", "levelname": "INFO", "message": "iris_data task creates flyte deck html to file:///tmp/flyte-puqqxhxv/sandbox/local_flytekit/908feceb2b918f48264f14ba66578fa7/deck.html"}
Note
To see where the HTML file is written when you run deck-enabled tasks locally, you need to set the FLYTE_SDK_LOGGING_LEVEL environment variable to 20. Doing so will emit logs that look like the output above, where the deck.html filepath can be found in the message key.
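As a minimal sketch of the note above, the variable can also be set from Python rather than from the shell. The value 20 corresponds to the standard library's logging.INFO level. Setting it before flytekit is imported is an assumption made here to be safe; this guide does not say when flytekit reads the variable.

```python
import logging
import os

# logging.INFO has the numeric value 20, the level mentioned in the note.
# Assumption: the variable is set before flytekit is imported, so that
# flytekit sees it when it configures its logger.
os.environ["FLYTE_SDK_LOGGING_LEVEL"] = str(logging.INFO)

configured_level = os.environ["FLYTE_SDK_LOGGING_LEVEL"]
```

Exporting the variable in the shell before launching Python should have the same effect.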
Rendering In-line Decks
You can render Decks inside the task function body by using the default deck, which you can access with the current_context() function. In the following example, we extend the iris_data task with:
A markdown snippet to provide more context about what the task does.
A boxplot of the sepal_length variable using BoxRenderer, which leverages the plotly package to auto-generate a set of plots and summary statistics from the dataframe.
import flytekit
from flytekitplugins.deck.renderer import MarkdownRenderer, BoxRenderer


@task(disable_deck=False)
def iris_data(
    sample_frac: Optional[float] = None,
    random_state: Optional[int] = None,
) -> pd.DataFrame:
    data = px.data.iris()
    if sample_frac is not None:
        data = data.sample(frac=sample_frac, random_state=random_state)

    md_text = (
        "# Iris Dataset\n"
        "This task loads the iris dataset using the `plotly` package."
    )
    # Append the rendered markdown to the task's default deck.
    flytekit.current_context().default_deck.append(MarkdownRenderer().to_html(md_text))
    # Create a separate deck named "box plot" holding the rendered boxplot.
    flytekit.Deck("box plot", BoxRenderer("sepal_length").to_html(data))
    return data
This will create a new tab in the Flyte Deck HTML view named default, which should contain the markdown text we specified.
Custom Renderers
What if we don’t want to show raw data values in the Flyte Deck? We can create a pandas dataframe renderer that summarizes the data instead of showing raw values by creating a custom renderer. A renderer is essentially a class with a to_html method.
class DataFrameSummaryRenderer:
    def to_html(self, df: pd.DataFrame) -> str:
        assert isinstance(df, pd.DataFrame)
        return df.describe().to_html()
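Because a renderer only needs a to_html method, the same pattern works for types other than dataframes. The sketch below is purely illustrative: MappingTableRenderer is a hypothetical class, not part of flytekit, and it uses only the standard library to turn a mapping into a small HTML table.

```python
from html import escape
from typing import Any, Mapping


class MappingTableRenderer:
    """Hypothetical renderer: turns a mapping into a two-column HTML table."""

    def to_html(self, data: Mapping[str, Any]) -> str:
        # Escape keys and values so raw strings cannot inject markup
        # into the generated report.
        rows = "".join(
            f"<tr><th>{escape(str(key))}</th><td>{escape(str(value))}</td></tr>"
            for key, value in data.items()
        )
        return f"<table>{rows}</table>"


html_report = MappingTableRenderer().to_html({"rows": 150, "label": "<iris>"})
```

The returned string is plain HTML, so it can be checked in isolation without running a task.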
Then we can use the Annotated type to override the default renderer of the pandas.DataFrame type:
# Annotated is available from typing in Python 3.9+; fall back to
# typing_extensions on older versions.
try:
    from typing import Annotated
except ImportError:
    from typing_extensions import Annotated


@task(disable_deck=False)
def iris_data(
    sample_frac: Optional[float] = None,
    random_state: Optional[int] = None,
) -> Annotated[pd.DataFrame, DataFrameSummaryRenderer()]:
    data = px.data.iris()
    if sample_frac is not None:
        data = data.sample(frac=sample_frac, random_state=random_state)

    md_text = (
        "# Iris Dataset\n"
        "This task loads the iris dataset using the `plotly` package."
    )
    flytekit.current_context().default_deck.append(MarkdownRenderer().to_html(md_text))
    flytekit.Deck("box plot", BoxRenderer("sepal_length").to_html(data))
    return data
Finally, we can run the workflow and embed the resulting HTML file by parsing out the filepath from the logs:
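One way to parse the filepath out of a log line is sketched below. It assumes the log record is a single JSON object whose message key contains a file:// URI ending in deck.html, as in the sample log shown earlier. The helper name extract_deck_path is hypothetical and not part of flytekit.

```python
import json
import re
from typing import Optional
from urllib.parse import urlparse


def extract_deck_path(log_line: str) -> Optional[str]:
    """Return the local deck.html path found in a JSON log line, or None."""
    try:
        record = json.loads(log_line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    # Look for a file:// URI that ends in deck.html inside the message.
    match = re.search(r"file://\S+deck\.html", str(record.get("message", "")))
    if match is None:
        return None
    # Strip the file:// scheme to get a plain filesystem path.
    return urlparse(match.group(0)).path


sample_log = (
    '{"asctime": "2023-09-22 03:27:07,571", "name": "flytekit", '
    '"levelname": "INFO", "message": "iris_data task creates flyte deck html to '
    'file:///tmp/flyte-puqqxhxv/sandbox/local_flytekit/'
    '908feceb2b918f48264f14ba66578fa7/deck.html"}'
)
deck_path = extract_deck_path(sample_log)
```

The resulting path can then be opened in a browser or passed to whatever HTML display utility your environment provides.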