Integrations#
Flyte is designed to be highly extensible and can be customized in multiple ways.
Note
Want to contribute an example? Check out the Example Contribution Guide.
Flytekit Plugins#
Flytekit plugins are lightweight extensions that can be implemented purely in Python and unit tested locally, and that extend Flytekit’s functionality. A plugin can wrap almost any kind of work; for comparison, think of plugins as similar to Airflow Operators.
Execute SQL queries as tasks. |
|
Validate data with |
|
Execute Jupyter Notebooks with |
|
Validate pandas dataframes with |
|
Scale pandas workflows with |
|
Version your SQL database with |
|
Run and test your |
|
Convert ML models to ONNX models seamlessly. |
|
Run analytical queries using DuckDB. |
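The comparison with Airflow Operators can be made concrete with a small sketch. The block below is not Flytekit code and does not use any Flytekit API. It is a plain-Python illustration, built on the standard-library `sqlite3` module, of what "implemented purely in Python and unit tested locally" can look like for a SQL-style task. The names `SQLiteQueryTask`, `make_demo_db`, and `adults_over` are hypothetical and exist only for this example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class SQLiteQueryTask:
    """Hypothetical plugin-style task: a named, parameterised SQL query.

    This is NOT a Flytekit class. It only mirrors the idea of packaging
    reusable behaviour behind a task object, much like an Airflow Operator.
    """

    name: str
    query: str  # SQL text with named placeholders such as :min_age

    def execute(self, conn: sqlite3.Connection, **params: Any) -> List[Tuple]:
        # Run the stored query with the caller's parameters and return all rows.
        return conn.execute(self.query, params).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # In-memory database, so the task can be tested without external services.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("ada", 36), ("grace", 45), ("linus", 21)],
    )
    return conn


# A reusable task definition, declared once and executed with different inputs.
adults_over = SQLiteQueryTask(
    name="adults_over",
    query="SELECT name FROM users WHERE age > :min_age ORDER BY name",
)
```

Because the task object holds only its definition and receives the connection at call time, a local unit test can hand it an in-memory database, for example `adults_over.execute(make_demo_db(), min_age=30)`.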
Native Backend Plugins#
Native backend plugins can be executed without any external service dependencies, because the compute is orchestrated by Flyte itself within its provisioned Kubernetes clusters.
Execute K8s pods for arbitrary workloads. |
|
Run Dask jobs on a K8s Cluster. |
|
Run Spark jobs on a K8s Cluster. |
|
Run distributed PyTorch training jobs using |
|
Run distributed TensorFlow training jobs using |
|
Run distributed deep learning training jobs using Horovod and MPI. |
|
Run Ray jobs on a K8s Cluster. |
Flyte agents#
Flyte agents are long-running, stateless services that receive execution requests via gRPC and initiate jobs with appropriate external or internal services. Each agent service is a Kubernetes deployment that receives gRPC requests from FlytePropeller when users trigger a particular type of task. (For example, the BigQuery agent handles BigQuery tasks.) The agent service then initiates a job with the appropriate service. If you don’t see the agent you need below, see “Developing agents” to learn how to develop a new agent.
Run Airflow jobs in your workflows with the Airflow agent. |
|
Run BigQuery jobs in your workflows with the BigQuery agent. |
|
Run ChatGPT jobs in your workflows with the ChatGPT agent. |
|
Run Databricks jobs in your workflows with the Databricks agent. |
|
Execute tasks using the MemVerge Memory Machine Cloud agent. |
|
Run sensor jobs in your workflows with the sensor agent. |
|
Run Snowflake jobs in your workflows with the Snowflake agent. |
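The request flow described for agents (a dispatcher routes a task by type to a stateless agent, which starts a job in another service) can be sketched in plain Python. This is a conceptual model only: the gRPC transport is replaced by ordinary method calls, and `JobHandle`, `FakeExternalService`, `Agent`, and `AgentRouter` are hypothetical names rather than Flyte or Flytekit APIs.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class JobHandle:
    """Opaque reference to a job in an external service (hypothetical)."""

    task_type: str
    job_id: str


class FakeExternalService:
    """Stand-in for a service such as BigQuery; it owns all job state."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self, payload: str) -> str:
        # Register a new job and report it as running.
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = "RUNNING"
        return job_id

    def finish(self, job_id: str) -> None:
        # Simulate the external job completing on its own schedule.
        self._jobs[job_id] = "SUCCEEDED"

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]


class Agent:
    """Stateless agent: stores no per-job data, only a service reference."""

    def __init__(self, task_type: str, service: FakeExternalService) -> None:
        self.task_type = task_type
        self._service = service

    def create(self, payload: str) -> JobHandle:
        # Start the job externally and hand the caller everything needed
        # to look it up later, so the agent itself has nothing to remember.
        return JobHandle(self.task_type, self._service.submit(payload))

    def get(self, handle: JobHandle) -> str:
        return self._service.status(handle.job_id)


class AgentRouter:
    """Plays the dispatcher role: picks the agent that matches the task type."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}

    def register(self, agent: Agent) -> None:
        self._agents[agent.task_type] = agent

    def create(self, task_type: str, payload: str) -> JobHandle:
        return self._agents[task_type].create(payload)

    def get(self, handle: JobHandle) -> str:
        return self._agents[handle.task_type].get(handle)
```

In this sketch the caller keeps the `JobHandle` and the external service keeps the job status, which is what allows the agent in the middle to stay stateless.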
External Service Backend Plugins#
As the name suggests, external service backend plugins rely on an external service, such as Hive, to handle the workload defined in the Flyte task that uses the plugin.
Train models with built-in algorithms or define your own custom algorithms. |
|
Train PyTorch models using SageMaker, with support for distributed training. |
|
Execute queries using AWS Athena. |
|
Run tasks and workflows on AWS Batch. |
|
Execute tasks using Flyte Interactive to debug. |
|
Run Hive jobs in your workflows. |
SDKs for Writing Tasks and Workflows#
The community would love to help you build a new SDK. The currently available SDKs are:
The Python SDK for Flyte. |
|
The Java/Scala SDK for Flyte. |
Flyte Operators#
Flyte can be integrated with other orchestrators to help you leverage Flyte’s constructs natively within other orchestration tools.
Trigger Flyte executions from Airflow. |