flytekitplugins.kfmpi.MPIJob

class flytekitplugins.kfmpi.MPIJob(slots, num_launcher_replicas=1, num_workers=1)

Configuration for an executable MPI Job. Use this to run distributed training on Kubernetes with MPI.

Parameters
  • num_workers (int) – Number of worker replicas spawned in the cluster for this job (in addition to 1 master)

  • num_launcher_replicas (int) – Number of launcher replicas to use

  • slots (int) – Number of slots per worker, as written to the MPI hostfile

Return type

None
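
The sketch below illustrates how the three parameters relate to each other. It does not import the plugin: `MPIJobSketch`, `hostfile_lines`, `total_ranks`, and the `worker-N` hostnames are hypothetical names introduced only for illustration, and the hostfile layout assumes the common Open MPI `hostname slots=N` convention rather than anything this page confirms about the plugin's internals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MPIJobSketch:
    """Hypothetical stand-in mirroring the documented fields; not the plugin class."""

    slots: int                      # required, no default (as in the signature above)
    num_launcher_replicas: int = 1  # documented default
    num_workers: int = 1            # documented default


def hostfile_lines(job: MPIJobSketch, prefix: str = "worker") -> List[str]:
    """Render one Open MPI style hostfile entry per worker (hostnames are made up)."""
    return [f"{prefix}-{i} slots={job.slots}" for i in range(job.num_workers)]


def total_ranks(job: MPIJobSketch) -> int:
    """Assumed upper bound on MPI processes: workers times slots per worker."""
    return job.num_workers * job.slots


job = MPIJobSketch(slots=2, num_workers=3)
lines = hostfile_lines(job)
ranks = total_ranks(job)
print("\n".join(lines))
print("total ranks:", ranks)
```

In flytekit itself, a configuration object like this is typically passed to the `@task` decorator through its `task_config` argument; check the plugin version you have installed for the exact signature.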

Attributes

num_launcher_replicas: int = 1
num_workers: int = 1
slots: int