Configure Kubernetes Plugins

Tags: Kubernetes, Integration, Spark, AWS, GCP, Advanced

This guide shows how to configure the Flyte plugins that provision resources on Kubernetes. The steps depend on the deployment method you used to install Flyte.

Install the Kubernetes operator

Select the integration you need and follow the steps to install the corresponding Kubernetes operator:

  1. Install the stable release of the Kubeflow training-operator:

kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone?ref=v1.7.0"
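Before moving on, it can be useful to confirm that the operator was installed. The following is a minimal sketch, assuming the standalone overlay deploys into the kubeflow namespace and registers the pytorchjobs.kubeflow.org CRD. The helper name check_training_operator is illustrative and is not part of Flyte or Kubeflow.

```shell
# Hypothetical helper: confirm that the training-operator is installed.
# Assumes the standalone overlay's namespace is "kubeflow" unless overridden.
check_training_operator() {
  ns="${1:-kubeflow}"
  # The PyTorchJob CRD should exist once the manifests are applied.
  kubectl get crd pytorchjobs.kubeflow.org || return 1
  # List the operator pods; they should reach the Running state.
  kubectl get pods -n "$ns"
}
```

Call it as check_training_operator, or pass a different namespace as the first argument if your installation does not use kubeflow.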

Optional: Using a gang scheduler

Under resource constraints, the worker pods of a distributed training job may be scheduled at different times. To avoid this, you can opt for a gang scheduler, which ensures that all worker pods are scheduled simultaneously and reduces the likelihood of job failures caused by timeout errors.

To enable gang scheduling for the training-operator:

a. Select a second scheduler: either Kubernetes scheduler plugins with co-scheduling or Apache YuniKorn.

b. Configure a Flyte PodTemplate to use the gang scheduler for your Tasks:

K8s scheduler plugins with co-scheduling

template:
  spec:
    schedulerName: "scheduler-plugins-scheduler"

Apache YuniKorn

template:
  metadata:
    annotations:
      yunikorn.apache.org/task-group-name: ""
      yunikorn.apache.org/task-groups: ""
      yunikorn.apache.org/schedulingPolicyParameters: ""

See Configuring task pods with K8s PodTemplates for more information about Pod templates in Flyte. You can set the scheduler name in the Pod template passed to the @task decorator. However, to prevent the two schedulers from competing for resources, we recommend setting the scheduler name in the Pod template in the flyte namespace, which is applied to all tasks. Non-distributed training tasks can be scheduled by the gang scheduler as well.
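As an illustration of the namespace-wide approach, the sketch below writes a PodTemplate manifest for the co-scheduling case. The template name flyte-template, the file name, and the no-op image are placeholders chosen for this example. The container named default follows the convention described in the PodTemplates guide, where Kubernetes requires at least one container in the template and Flyte uses it as a base for task containers.

```shell
# Write a PodTemplate manifest that routes task pods to the gang scheduler.
# "flyte-template", the file name, and the no-op image are placeholders.
cat > gang-scheduler-podtemplate.yaml <<'EOF'
apiVersion: v1
kind: PodTemplate
metadata:
  name: flyte-template
  namespace: flyte
template:
  spec:
    schedulerName: "scheduler-plugins-scheduler"
    containers:
      # Kubernetes requires at least one container in a PodTemplate.
      - name: default
        image: docker.io/rwgrim/docker-noop
EOF
# Apply it afterwards with:
#   kubectl apply -f gang-scheduler-podtemplate.yaml
```

For Flyte to use this template as the default, the plugins.k8s.default-pod-template-name setting generally needs to match the template name. Check the PodTemplates guide for where that setting lives in your deployment.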

Specify plugin configuration

Create a file named values-override.yaml and add the following config to it:

configuration:
  inline:
    tasks:
      task-plugins:
        enabled-plugins:
          - container
          - sidecar
          - k8s-array
          - pytorch
        default-for-task-types:
          - container: container
          - container_array: k8s-array
          - pytorch: pytorch

Upgrade the deployment

helm upgrade <RELEASE_NAME> flyteorg/flyte-binary -n <YOUR_NAMESPACE> --values values-override.yaml

Replace <RELEASE_NAME> with the name of your release (e.g., flyte-backend) and <YOUR_NAMESPACE> with the name of your namespace (e.g., flyte).

Wait for the upgrade to complete. You can check the status of the deployment pods by running the following command, using the same namespace as in the upgrade step:

kubectl get pods -n <YOUR_NAMESPACE>
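If you would rather block until the pods are ready than poll manually, kubectl wait can do that. The helper below is a sketch only: the function name, the default namespace flyte, and the 300s timeout are assumptions made for this example.

```shell
# Hypothetical helper: block until every pod in the namespace reports Ready.
# Defaults (namespace "flyte", timeout "300s") are illustrative.
wait_for_flyte_pods() {
  ns="${1:-flyte}"
  wait_timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all -n "$ns" --timeout="$wait_timeout"
}
```

For example, wait_for_flyte_pods flyte 300s. Note that pods which have already run to completion never report Ready, so this may time out if the namespace contains finished Job pods.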

Once all the components are up and running, go to the examples section to learn more about how to use Flyte backend plugins.