flytekitplugins.spark.Spark#

class flytekitplugins.spark.Spark(spark_conf=None, hadoop_conf=None, executor_path=None, applications_path=None)[source]#

Use this to configure a SparkContext for your task. Tasks marked with this will automatically execute natively on Kubernetes as a distributed Spark job.

Parameters
  • spark_conf (Optional[Dict[str, str]]) – Dictionary of Spark configuration. The keys and values should match what Spark expects.

  • hadoop_conf (Optional[Dict[str, str]]) – Dictionary of Hadoop configuration. The keys and values should match a typical Hadoop configuration for Spark.

  • executor_path (Optional[str]) – Python binary executable to use for PySpark in driver and executor.

  • applications_path (Optional[str]) – Path to a bundled JAR, Python, or R file of the application to execute.

Return type

None
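The sketch below shows how the four fields compose. To keep it runnable without flytekit installed, it uses a hypothetical stand-in dataclass mirroring the fields documented above; the configuration values shown are illustrative, not required.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical stand-in mirroring flytekitplugins.spark.Spark's fields,
# so this example runs without flytekit installed.
@dataclass
class Spark:
    spark_conf: Optional[Dict[str, str]] = None
    hadoop_conf: Optional[Dict[str, str]] = None
    executor_path: Optional[str] = None
    applications_path: Optional[str] = None

# Spark settings are passed through as-is, so any valid Spark property works.
cfg = Spark(
    spark_conf={
        "spark.driver.memory": "1000M",
        "spark.executor.instances": "2",
    },
    executor_path="/usr/bin/python3",
)
print(cfg.spark_conf["spark.executor.instances"])  # → 2
```

In flytekit, an instance of this class is typically passed as the `task_config` argument of the `@task` decorator, e.g. `@task(task_config=Spark(spark_conf={...}))`, and the platform uses it to launch the driver and executors on the cluster.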

Methods

Attributes

applications_path: Optional[str] = None
executor_path: Optional[str] = None
hadoop_conf: Optional[Dict[str, str]] = None
spark_conf: Optional[Dict[str, str]] = None