Kubeflow PyTorch

This plugin uses the Kubeflow PyTorch Operator and provides a simplified interface for running distributed training with various PyTorch backends.

Installation

To install the flytekit distributed PyTorch plugin, run the following:

pip install flytekitplugins-kfpytorch
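
To confirm that the install succeeded, the following minimal sketch looks up the distribution with the standard library's importlib.metadata. The helper name installed_version is hypothetical and is not part of flytekit or the plugin.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "flytekitplugins-kfpytorch") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it only queries package metadata
    and does not import the plugin itself.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("flytekitplugins-kfpytorch is not installed")
    else:
        print(f"flytekitplugins-kfpytorch {version} is installed")
```

Because the check reads metadata only, it works even in an environment where importing the plugin would fail for lack of other dependencies.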

How to build your Dockerfile for PyTorch on K8s

Note

If you train on CPU, a special Dockerfile is NOT REQUIRED. If GPUs or TPUs are required, the Dockerfile differs only in the driver setup. The Dockerfile below is set up for GPU-accelerated training using CUDA. The checked-in version of the Dockerfile uses python:3.8-slim-buster for faster CI, but you can use the Dockerfile pasted below, which uses a CUDA base image. Additionally, requirements.in uses the CPU version of PyTorch. Remove the +cpu suffix from torch and torchvision in requirements.in, then regenerate the requirements with the following command:

make -C integrations/kubernetes/kfpytorch requirements
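
Before running the make target above, the +cpu suffix can be removed by hand or with a small script. The sketch below is one possible approach, assuming the pins in requirements.in have the form torch==<version>+cpu. The helper name strip_cpu_suffix and the sample version numbers are illustrative and are not taken from the repository.

```python
import re

# Matches a "+cpu" local version label on a torch or torchvision pin,
# e.g. "torch==1.7.0+cpu". Pins for any other package are left untouched.
_CPU_PIN = re.compile(
    r"^(\s*(?:torch|torchvision)\s*==\s*[^\s+;#]+)\+cpu\b",
    re.IGNORECASE,
)


def strip_cpu_suffix(requirements_text: str) -> str:
    """Return the requirements text with "+cpu" removed from torch/torchvision pins."""
    lines = requirements_text.splitlines(keepends=True)
    return "".join(_CPU_PIN.sub(r"\1", line) for line in lines)


if __name__ == "__main__":
    # Illustrative input only; the real requirements.in may pin other versions.
    sample = "torch==1.7.0+cpu\ntorchvision==0.8.1+cpu\nnumpy==1.19.4\n"
    print(strip_cpu_suffix(sample), end="")
```

The function works on text rather than on a file path, so the result can be reviewed before it is written back to requirements.in.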

FROM pytorch/pytorch:1.7.0-cuda11.0-cudnn8-runtime
LABEL org.opencontainers.image.source=https://github.com/flyteorg/flytesnacks

WORKDIR /root
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV PYTHONPATH=/root

# Install basics
RUN apt-get update && apt-get install -y make build-essential libssl-dev curl

# Install the AWS cli separately to prevent issues with boto being written over
RUN pip install awscli

ENV VENV=/opt/venv
# Virtual environment
RUN python3 -m venv ${VENV}
ENV PATH="${VENV}/bin:$PATH"

# Install Python dependencies
COPY kfpytorch/requirements.txt /root
RUN pip install -r /root/requirements.txt

# Copy the makefile targets to expose on the container. This makes it easier to register.
COPY in_container.mk /root/Makefile
COPY kfpytorch/sandbox.config /root

# Copy the actual code
COPY kfpytorch/ /root/kfpytorch/

# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
ARG tag
ENV FLYTE_INTERNAL_IMAGE=$tag
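
The last two instructions expose the build argument tag inside the container as the FLYTE_INTERNAL_IMAGE environment variable. As a rough sketch, code running in the container could read that variable and pull out a tag as shown below. This assumes the build script passes a full image reference such as repo/name:tag. The helper image_tag is hypothetical, and how Flyte itself interprets the value is not shown here.

```python
import os
from typing import Optional


def image_tag(image_ref: Optional[str]) -> Optional[str]:
    """Return the tag part of an image reference such as "repo/name:tag".

    Hypothetical helper for illustration. Returns None when the reference
    is missing or carries no tag.
    """
    if not image_ref:
        return None
    # Only inspect the text after the last "/" so that a registry port
    # ("host:5000/name") is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    # Drop a digest suffix ("name@sha256:...") if one is present.
    last_segment = last_segment.split("@", 1)[0]
    _name, sep, tag = last_segment.partition(":")
    return tag if sep and tag else None


if __name__ == "__main__":
    # FLYTE_INTERNAL_IMAGE is set by the Dockerfile above from the "tag" build arg.
    print(image_tag(os.environ.get("FLYTE_INTERNAL_IMAGE")))
```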
