flytekitplugins.spark.Spark

class flytekitplugins.spark.Spark(spark_conf=None, hadoop_conf=None, executor_path=None, applications_path=None)

Use this to configure a SparkContext for your task. Tasks marked with this configuration automatically run natively on Kubernetes (K8s) as a distributed Spark execution.

Parameters:

spark_conf (Dict[str, str] | None) – Dictionary of Spark configuration properties. The keys should match the property names Spark expects.
hadoop_conf (Dict[str, str] | None) – Dictionary of Hadoop configuration properties. The keys should match a typical Hadoop configuration for Spark.
executor_path (str | None) – Path to the Python binary to use for PySpark in the driver and executors.
applications_path (str | None) – Path to the main application file (a bundled JAR, Python, or R file) to execute.
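
Because spark_conf is typed as Dict[str, str], numeric settings such as the executor count must be passed as strings. The following is a minimal sketch of one way to build such a dictionary and wrap it in a Spark config object. The helper names make_spark_conf and make_task_config and the chosen values are hypothetical and only for illustration; the property names are standard Spark settings. The plugin import is done lazily inside the function, on the assumption that the flytekitplugins-spark package is installed wherever make_task_config is actually called.

```python
from typing import Dict, Optional


def make_spark_conf(
    executor_instances: int, executor_memory: str, driver_memory: str
) -> Dict[str, str]:
    """Hypothetical helper: build a spark_conf dict whose values are all strings."""
    return {
        # Numeric values are converted to str to satisfy Dict[str, str].
        "spark.executor.instances": str(executor_instances),
        "spark.executor.memory": executor_memory,
        "spark.driver.memory": driver_memory,
    }


def make_task_config(
    spark_conf: Dict[str, str], hadoop_conf: Optional[Dict[str, str]] = None
):
    """Hypothetical helper: wrap the dictionaries in the plugin's Spark config."""
    # Imported lazily so make_spark_conf can be used without the plugin installed.
    from flytekitplugins.spark import Spark

    return Spark(spark_conf=spark_conf, hadoop_conf=hadoop_conf)


# Illustrative values only; size these for your own cluster.
conf = make_spark_conf(
    executor_instances=2, executor_memory="1000M", driver_memory="1000M"
)
```

The resulting object would typically be passed as task_config to flytekit's @task decorator, for example @task(task_config=make_task_config(conf)). Inside such a task, the Spark session is available as flytekit.current_context().spark_session.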

Attributes

applications_path: str | None = None

executor_path: str | None = None

hadoop_conf: Dict[str, str] | None = None

spark_conf: Dict[str, str] | None = None