Getting Started

Requirements

Make sure Docker is installed and the Docker daemon is running.
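As a quick sanity check before continuing, the requirement above can be verified from Python. This is a minimal sketch, not part of Flyte: the helper name docker_ready and its timeout parameter are hypothetical, and it assumes that `docker info` exits with a non-zero status when the client cannot reach the daemon.

```python
import shutil
import subprocess


def docker_ready(timeout_seconds: float = 10.0) -> bool:
    """Return True if the docker CLI is on PATH and the daemon answers `docker info`.

    Hypothetical helper for illustration; it is not part of flytekit or flytectl.
    """
    if shutil.which("docker") is None:
        return False  # Docker CLI is not installed or not on PATH
    try:
        # `docker info` needs to talk to the daemon, so a non-zero exit
        # code is treated here as "daemon not running or not reachable".
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker is ready" if docker_ready() else "Docker is not available")
```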

Installation

Install Flytekit, Flyte’s Python SDK:

pip install flytekit

Example: Computing Descriptive Statistics

Let’s create a simple Flyte Workflow that involves two steps:

  1. Generate a dataset of numbers drawn from a normal distribution.

  2. Compute the mean and standard deviation of the generated numbers.
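Before involving Flyte, the two steps can be sketched in plain Python using only the standard library. This is an illustration under stated assumptions rather than the code used in this guide: the function names and the seed parameter are hypothetical, and a list of floats stands in for the DataFrame used later.

```python
import random
import statistics
from typing import List, Tuple


def generate_normal_numbers(n: int, mean: float, sigma: float, seed: int = 0) -> List[float]:
    """Step 1: draw n numbers from a normal distribution.

    The seed parameter is an addition for reproducibility in this sketch.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean, sigma) for _ in range(n)]


def mean_and_std(numbers: List[float]) -> Tuple[float, float]:
    """Step 2: compute the mean and the sample standard deviation.

    statistics.stdev uses the n - 1 denominator, which is also the
    default of pandas' Series.std().
    """
    return statistics.mean(numbers), statistics.stdev(numbers)


if __name__ == "__main__":
    numbers = generate_normal_numbers(n=500, mean=42.0, sigma=2.0)
    print(mean_and_std(numbers))
```

Each function corresponds to one of the steps above, with the output of the first passed as the input of the second.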

Create a Workflow

Copy the following code to a file named example.py:

import typing
import pandas as pd
import numpy as np

from flytekit import task, workflow

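@task
def generate_normal_df(n: int, mean: float, sigma: float) -> pd.DataFrame:
    # Draw n samples from a normal distribution with the given mean and standard deviation.
    return pd.DataFrame({"numbers": np.random.normal(mean, sigma, size=n)})

@task
def compute_stats(df: pd.DataFrame) -> typing.Tuple[float, float]:
    # Return the mean and the sample standard deviation of the "numbers" column.
    return float(df["numbers"].mean()), float(df["numbers"].std())

@workflow
def wf(n: int = 200, mean: float = 0.0, sigma: float = 1.0) -> typing.Tuple[float, float]:
    # Chain the two tasks: the DataFrame produced by the first is the input of the second.
    return compute_stats(df=generate_normal_df(n=n, mean=mean, sigma=sigma))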

Running Flyte Workflows

You can run the workflow in example.py in a local Python environment or on a Flyte cluster.

Executing Workflows Locally

Run your workflow locally using pyflyte, the CLI that ships with flytekit.

pyflyte run example.py wf --n 500 --mean 42 --sigma 2
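The flags in the command above mirror the workflow's typed inputs (n, mean, sigma). The sketch below illustrates that idea with the standard library only. It is not pyflyte's implementation: parse_inputs is a hypothetical helper, and wf here is a stand-in that only copies the signature from example.py.

```python
import argparse
import inspect
from typing import Any, Callable, Dict, List


def wf(n: int = 200, mean: float = 0.0, sigma: float = 1.0) -> None:
    """Stand-in with the same signature as the workflow in example.py."""


def parse_inputs(fn: Callable[..., Any], argv: List[str]) -> Dict[str, Any]:
    """Build one --flag per parameter of fn and parse argv into typed values."""
    parser = argparse.ArgumentParser(prog=fn.__name__)
    for name, param in inspect.signature(fn).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        parser.add_argument(
            f"--{name}",
            type=param.annotation,  # int or float: converts the raw string
            required=not has_default,
            default=param.default if has_default else None,
        )
    return vars(parser.parse_args(argv))


if __name__ == "__main__":
    print(parse_inputs(wf, ["--n", "500", "--mean", "42", "--sigma", "2"]))
    # {'n': 500, 'mean': 42.0, 'sigma': 2.0}
```

Inputs that are not given on the command line fall back to the defaults declared in the signature.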

Creating a Demo Flyte Cluster

To start a local demo cluster, first install flytectl, the command-line interface for Flyte.

# Either install with Homebrew:
brew install flyteorg/homebrew-tap/flytectl

# or use the install script:
curl -sL https://ctl.flyte.org/install | sudo bash -s -- -b /usr/local/bin # You can change /usr/local/bin to any file system path
export PATH=$(pwd)/bin:$PATH # Only required if you used a path other than /usr/local/bin

Start a Flyte demonstration environment on your local machine:

flytectl demo start

Expected Output:

👨‍💻 Flyte is ready! Flyte UI is available at http://localhost:30080/console 🚀 🚀 🎉

Note

Make sure to export the KUBECONFIG and FLYTECTL_CONFIG environment variables in your shell, replacing <username> with your actual username.
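A small check like the following can confirm that both variables are visible to your current environment. It is a minimal sketch: missing_env_vars is a hypothetical helper, and it only tests that the variables are set and non-empty, not that they point to valid files.

```python
import os
from typing import List, Mapping

# The two variables named in the note above.
REQUIRED_VARS = ("KUBECONFIG", "FLYTECTL_CONFIG")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("KUBECONFIG and FLYTECTL_CONFIG are set")
```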

Executing Workflows on a Flyte Cluster

Run the same Workflow on the demo Flyte cluster by adding the --remote flag:

pyflyte run --remote example.py wf --n 500 --mean 42 --sigma 2

Expected Output: A URL to the Workflow Execution on your demo Flyte cluster:

Go to http://localhost:30080/console/projects/flytesnacks/domains/development/executions/<execution_name> to see execution in the console.

Here, <execution_name> is a unique identifier for your Workflow Execution.
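The URL above encodes the project, the domain, and the execution name in its path. If you want to pick these out in a script, a sketch along the following lines may help. It assumes the path layout shown above; parse_execution_url and ExecutionRef are hypothetical names, and the execution name abc123 is made up for illustration.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class ExecutionRef(NamedTuple):
    project: str
    domain: str
    name: str


def parse_execution_url(url: str) -> ExecutionRef:
    """Split a console execution URL into project, domain, and execution name."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected layout:
    # console / projects / <project> / domains / <domain> / executions / <name>
    layout_ok = (
        len(parts) == 7
        and parts[0] == "console"
        and parts[1] == "projects"
        and parts[3] == "domains"
        and parts[5] == "executions"
    )
    if not layout_ok:
        raise ValueError(f"Unexpected execution URL: {url}")
    return ExecutionRef(project=parts[2], domain=parts[4], name=parts[6])


if __name__ == "__main__":
    url = (
        "http://localhost:30080/console/projects/flytesnacks"
        "/domains/development/executions/abc123"  # abc123 is a made-up name
    )
    print(parse_execution_url(url))
    # ExecutionRef(project='flytesnacks', domain='development', name='abc123')
```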


Inspect the Results

Navigate to the URL produced as the result of running pyflyte run. This will take you to Flyte Console, the web UI used to manage Flyte entities such as tasks, workflows, and executions.

Screen recording of the Flyte console: https://github.com/flyteorg/static-resources/raw/main/flyte/getting_started/getting_started_console.gif

Note

There are a few features of the Flyte console worth noting in this video:

  • The default execution view shows the list of Tasks executing in sequential order.

  • The right-hand panel shows metadata about the Task Execution, including logs, inputs, outputs, and Task Metadata.

  • The Graph view shows the execution graph of the Workflow, providing visual information about the topology of the graph and the state of each node as the Workflow progresses.

  • On completion, you can inspect the outputs of each Task, and ultimately, the overarching Workflow.

Recap

🎉 Congratulations! In this getting started guide, you:

  1. 📜 Created a Flyte script, which computes descriptive statistics over some generated data.

  2. 🛥 Created a demo Flyte cluster on your local system.

  3. 👟 Ran a workflow locally and on a demo Flyte cluster.

What’s Next?

This guide demonstrated how you can quickly iterate on self-contained scripts using pyflyte run.

  • To learn more about Flyte’s features such as caching, conditionals, specifying resource requirements, and scheduling workflows, take a look at the User Guide.

  • To learn more about how to organize, package, and register workflows for larger projects, see the guide for Building Large Apps.