AWS Batch Setup#
This setup document applies to both map tasks and single (non-map) tasks running on AWS Batch.
For single (non-map) task use, please take note of the additional line required when updating the Propeller config.
AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS.
Flyte abstracts away the complexity of integrating AWS Batch into users’ workflows. It takes care of packaging inputs, reading outputs, scheduling map tasks, and leveraging AWS Batch Job Queues to distribute the load and coordinate priorities.
Set-up AWS Batch#
Follow the guide Running batch jobs at scale for less.
At the end of this step, the AWS Account should have a configured compute environment and one or more AWS Batch Job Queues.
Modify Users’ AWS IAM Role Trust Policy Document#
Follow the guide AWS Batch Execution IAM role.
When running workflows in Flyte, users have the option to specify a Kubernetes service account and/or an IAM Role to run as. For AWS Batch, an IAM Role must be specified. For each of these IAM Roles, modify the trust policy to allow ECS to assume the role.
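As a sketch, the trust policy on each such role would include a statement like the one below. The exact policy in your account may carry additional statements (for example, the OIDC trust for a Kubernetes service account), so treat this as an illustrative fragment rather than a complete document:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```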
Modify System’s AWS IAM Role policies#
Follow the guide: Granting a user permissions to pass a role to an AWS service.
The recommended way of assigning permissions to Flyte components is OIDC, which involves assigning an IAM Role to every Kubernetes service account used. Find the IAM Role assigned to flytepropeller’s Kubernetes service account, then modify its policy document to allow that role to pass the users’ roles to AWS Batch.
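A minimal sketch of such a policy statement is shown below; the account ID and role name are placeholders, and you may prefer to list each user role explicitly rather than using a wildcard:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/my-batch-execution-role"
    }
  ]
}
```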
Update FlyteAdmin Config#
FlyteAdmin needs to be made aware of all the AWS Batch Job Queues and how the system should distribute the load onto them. The simplest setup looks something like this:
```yaml
flyteadmin:
  roleNameKey: "eks.amazonaws.com/role-arn"
queues:
  # A list of items, one per AWS Batch Job Queue.
  executionQueues:
    # The name of the job queue from AWS Batch
    - dynamic: "tutorial"
      # A list of tags/attributes that can be used to match workflows to this queue.
      attributes:
        - default
  # A list of configs to match project and/or domain and/or workflows to job queues using tags.
  workflowConfigs:
    # An empty rule to match any workflow to the queue tagged as "default"
    - tags:
        - default
```
If you are using Helm, this block can be added under the configMaps.adminServer section of the values file.
The more complex matching config below defines three queues with separate attributes and matching logic based on project, domain, and workflow name.
```yaml
queues:
  executionQueues:
    - dynamic: "gpu_dynamic"
      attributes:
        - gpu
    - dynamic: "critical"
      attributes:
        - critical
    - dynamic: "default"
      attributes:
        - default
  workflowConfigs:
    - project: "my_queue_1"
      domain: "production"
      workflowName: "my_workflow_1"
      tags:
        - critical
    - project: "production"
      workflowName: "my_workflow_2"
      tags:
        - gpu
    - project: "my_queue_3"
      domain: "production"
      workflowName: "my_workflow_3"
      tags:
        - critical
    - tags:
        - default
```
These settings can also be altered dynamically through flytectl (or the FlyteAdmin API). Read about the core concept here, then visit the flytectl docs for a guide on how to update these configs dynamically.
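For example, flytectl’s update execution-queue-attribute command accepts an attribute file. A minimal sketch of such a file is below; the project and domain names are placeholders:

```yaml
project: flytesnacks
domain: development
tags:
  - critical
```

This would then be applied with flytectl update execution-queue-attribute --attrFile era.yaml.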
Update Flyte Propeller’s Config#
The AWS Array plugin requires some configuration to correctly communicate with the AWS Batch service.
This configuration lives within flytepropeller’s configMap. Modify the config to set the following keys:
```yaml
plugins:
  aws:
    batch:
      # Must match that set in flyteAdmin's configMap flyteadmin.roleNameKey
      roleAnnotationKey: eks.amazonaws.com/role-arn
      # Must match the desired region to launch these tasks.
      region: us-east-2
tasks:
  task-plugins:
    enabled-plugins:
      # Enable aws_array task plugin.
      - aws_array
    default-for-task-types:
      # Set it as the default handler for array/map tasks.
      container_array: aws_array
      # Make sure to add this line to enable single (non-map) AWS Batch tasks
      aws-batch: aws_array
```
Launch an Execution on AWS Batch#
Follow this guide to write a workflow with a Map Task.
Serialize and register the workflow/task to a Flyte backend, then launch an execution either on Flyte Console or with flytectl.
On Flyte Console:

1. Navigate to Flyte Console’s UI (e.g. sandbox) and find the workflow.
2. Click on Launch to open up the launch form.
3. Select IAM Role and enter the full AWS ARN of an IAM Role configured according to the above guide.
4. Submit the form.
With flytectl, retrieve an execution spec in the form of a YAML file:
```shell
flytectl --config ~/.flyte/flytectl.yaml get launchplan \
    -p <project> -d <domain> <workflow full name> \
    --version <version> --execFile ./map_wf.yaml
```
Fill in the iamRole field (and optionally kubeServiceAcct if required in the deployment), then launch an execution:
```shell
flytectl --config ~/.flyte/flytectl.yaml create execution \
    -p <project> -d <domain> \
    --execFile ./map_wf.yaml
```
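The security-context portion of the filled-in execFile might look like the sketch below. The field names and role ARN are assumptions based on flytectl’s generated execution spec; check the generated map_wf.yaml for the exact keys:

```yaml
# Only the security-context fields are sketched here; the generated file
# also contains the workflow name, version, target project/domain, and inputs.
iamRoleARN: arn:aws:iam::123456789012:role/my-batch-execution-role  # placeholder ARN
kubeServiceAcct: ""
```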
As soon as the task starts executing, a link to the AWS Array Job appears in the log links section of Flyte Console. As individual jobs are scheduled, links to their individual CloudWatch log streams also appear in the UI.
A screenshot of Flyte Console displaying log links for a successful array job.
A screenshot of Flyte Console displaying log links for a failed array job.