Meet Flyte

The workflow automation platform for complex, mission-critical data and ML processes at scale

Flyte is an open-source, container-native, structured programming and distributed processing platform. It enables highly concurrent, scalable and maintainable workflows for machine learning and data processing.

Created at Lyft in collaboration with Spotify, Freenome and many others, Flyte provides first class support for Python, Java, and Scala, and is built directly on Kubernetes for all the benefits containerization provides: portability, scalability, and reliability.

The core unit of execution in Flyte is the task, which you can easily write with the Flytekit SDK:

@task
def greet(name: str) -> str:
    return f"Welcome, {name}!"

You can compose one or more tasks to create a workflow:

@task
def add_question(greeting: str) -> str:
    return f"{greeting} How are you?"

@workflow
def welcome(name: str) -> str:
    greeting = greet(name=name)
    return add_question(greeting=greeting)

welcome(name="Traveler")
# Output: "Welcome, Traveler! How are you?"
@AutoService(SdkRunnableTask.class)
public class GreetTask extends SdkRunnableTask<GreetTask.Input, GreetTask.Output> {
  public GreetTask() {
    super(JacksonSdkType.of(Input.class), JacksonSdkType.of(Output.class));
  }

  public static SdkTransform of(SdkBindingData name) {
    return new GreetTask().withInput("name", name);
  }

  @AutoValue
  public abstract static class Input {
    public abstract String name();
  }

  @AutoValue
  public abstract static class Output {
    public abstract String greeting();

    public static Output create(String greeting) {
      return new AutoValue_GreetTask_Output(greeting);
    }
  }

  @Override
  public Output run(Input input) {
    return Output.create(String.format("Welcome, %s!", input.name()));
  }
}

You can compose one or more tasks to create a workflow:

@AutoService(SdkRunnableTask.class)
public class AddQuestionTask extends SdkRunnableTask<AddQuestionTask.Input, AddQuestionTask.Output> {
  public AddQuestionTask() {
    super(JacksonSdkType.of(Input.class), JacksonSdkType.of(Output.class));
  }

  public static SdkTransform of(SdkBindingData greeting) {
    return new AddQuestionTask().withInput("greeting", greeting);
  }

  @AutoValue
  public abstract static class Input {
    public abstract String greeting();
  }

  @AutoValue
  public abstract static class Output {
    public abstract String greeting();

    public static Output create(String greeting) {
      return new AutoValue_AddQuestionTask_Output(greeting);
    }
  }

  @Override
  public Output run(Input input) {
    return Output.create(String.format("%s How are you?", input.greeting()));
  }
}
@AutoService(SdkWorkflow.class)
public class WelcomeWorkflow extends SdkWorkflow {

  @Override
  public void expand(SdkWorkflowBuilder builder) {
    // defines the input of the workflow
    SdkBindingData name = builder.inputOfString("name", "The name for the welcome message");

    // uses the workflow input as the task input of the GreetTask
    SdkBindingData greeting = builder.apply("greet", GreetTask.of(name)).getOutput("greeting");

    // uses the output of the GreetTask as the task input of the AddQuestionTask
    SdkBindingData greetingWithQuestion =
        builder.apply("add-question", AddQuestionTask.of(greeting)).getOutput("greeting");

    // uses the task output of the AddQuestionTask as the output of the workflow
    builder.output("greeting", greetingWithQuestion, "Welcome message");
  }
}

Link to the example code: WelcomeWorkflow.java

case class GreetTaskInput(name: String)
case class GreetTaskOutput(greeting: String)

class GreetTask
    extends SdkRunnableTask(
      SdkScalaType[GreetTaskInput],
      SdkScalaType[GreetTaskOutput]
    ) {

  override def run(input: GreetTaskInput): GreetTaskOutput = GreetTaskOutput(s"Welcome, ${input.name}!")
}

object GreetTask {
  def apply(name: SdkBindingData): SdkTransform =
    new GreetTask().withInput("name", name)
}

You can compose one or more tasks to create a workflow:

case class AddQuestionTaskInput(greeting: String)
case class AddQuestionTaskOutput(greeting: String)

class AddQuestionTask
    extends SdkRunnableTask(
      SdkScalaType[AddQuestionTaskInput],
      SdkScalaType[AddQuestionTaskOutput]
    ) {

  override def run(input: AddQuestionTaskInput): AddQuestionTaskOutput = AddQuestionTaskOutput(s"${input.greeting} How are you?")
}

object AddQuestionTask {
  def apply(greeting: SdkBindingData): SdkTransform =
    new AddQuestionTask().withInput("greeting", greeting)
}
class WelcomeWorkflow extends SdkWorkflow {

  def expand(builder: SdkWorkflowBuilder): Unit = {
    // defines the input of the workflow
    val name = builder.inputOfString("name", "The name for the welcome message")

    // uses the workflow input as the task input of the GreetTask
    val greeting = builder.apply("greet", GreetTask(name)).getOutput("greeting")

    // uses the output of the GreetTask as the task input of the AddQuestionTask
    val greetingWithQuestion = builder.apply("add-question", AddQuestionTask(greeting)).getOutput("greeting")

    // uses the task output of the AddQuestionTask as the output of the workflow
    builder.output("greeting", greetingWithQuestion, "Welcome message")
  }
}

Link to the example code: WelcomeWorkflow.scala

Why Flyte?

Flyte’s main purpose is to increase the development velocity for data processing and machine learning, enabling large-scale compute execution without the operational overhead. Teams can therefore focus on the business goals and not the infrastructure.

Core Features

  • Container Native

  • Extensible Backend & SDK’s

  • Ergonomic SDK’s in Python, Java & Scala

  • Versioned & Auditable - all actions are recorded

  • Matches your workflow - Start with one task, convert to a pipeline, attach multiple schedules or trigger using a programmatic API or on-demand

  • Battle-tested - millions of pipelines executed per month

  • Vertically-Integrated Compute - serverless experience

  • Deep understanding of data-lineage & provenance

  • Operation Visibility - cost, performance, etc.

  • Cross-Cloud Portable Pipelines

Who’s Using Flyte?

At Lyft, Flyte has served production model training and data processing for over four years, becoming the de-facto platform for the Pricing, Locations, ETA, Mapping teams, Growth, Autonomous and other teams.

For the most current list of Flyte’s deployments, please click here.

Next Steps

Whether you want to write Flyte workflows, deploy the Flyte platform to your k8 cluster, or extend and contribute to the architecture and design of Flyte, we have what you need.