Databricks Plugin Setup
This guide gives an overview of how to set up Databricks in your Flyte deployment.
Add Flyte chart repo to Helm
helm repo add flyteorg https://flyteorg.github.io/flyte
Set up the cluster
Start the sandbox cluster
flytectl sandbox start
Generate Flytectl sandbox config
flytectl config init
Upload an entrypoint.py file to DBFS or S3. The Spark driver node runs this file to override the default command in the Databricks job.
Create a file named values-override.yaml and add the following config to it:
configmap:
  enabled_plugins:
    # -- Tasks specific configuration [structure](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#GetConfig)
    tasks:
      # -- Plugins configuration, [structure](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#TaskPluginConfig)
      task-plugins:
        # -- [Enabled Plugins](https://pkg.go.dev/github.com/flyteorg/flyteplugins/go/tasks/config#Config). Enable sagemaker*, athena if you install the backend
        # plugins
        enabled-plugins:
          - container
          - sidecar
          - k8s-array
          - databricks
        default-for-task-types:
          container: container
          sidecar: sidecar
          container_array: k8s-array
          spark: databricks
databricks:
  enabled: True
  plugin_config:
    plugins:
      databricks:
        entrypointFile: dbfs:///FileStore/tables/entrypoint.py
        databricksInstance: dbc-a53b7a3c-614c
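Because a mis-indented or incomplete override file is easy to miss, a quick text check before upgrading may help. The following is a minimal sketch, not part of Flyte or Helm: check_override is a hypothetical helper name, and it only matches lines of text for the keys used in this guide rather than parsing YAML.

```shell
# check_override is a hypothetical helper, not a Flyte or Helm command.
# It matches lines of text only; it does not parse or validate YAML.
check_override() {
  file="$1"
  tab="$(printf '\t')"

  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 1
  fi

  # Tabs are not valid YAML indentation, so flag any tab character.
  if grep -q "$tab" "$file"; then
    echo "tab character found in $file" >&2
    return 1
  fi

  # A "- databricks" list entry, as used under enabled-plugins.
  grep -Eq '^[[:space:]]+-[[:space:]]+databricks[[:space:]]*$' "$file" || {
    echo "no '- databricks' list entry found" >&2
    return 1
  }

  # The spark task type mapped to the databricks plugin.
  grep -Eq '^[[:space:]]+spark:[[:space:]]+databricks[[:space:]]*$' "$file" || {
    echo "no 'spark: databricks' mapping found" >&2
    return 1
  }

  # Both plugin settings must be present with a non-empty value.
  for key in entrypointFile databricksInstance; do
    grep -Eq "^[[:space:]]+${key}:[[:space:]]+[^[:space:]]+" "$file" || {
      echo "missing or empty setting: $key" >&2
      return 1
    }
  done
}
```

Running check_override values-override.yaml returns a non-zero status and prints the first problem it finds. A passing check does not guarantee that the nesting is correct, so it complements rather than replaces reviewing the file.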
Create a Databricks account and follow the Databricks docs to create an access token.
Create an instance profile for the Spark cluster. It allows the Spark job to access your data in the S3 bucket.
Add the Databricks access token to FlytePropeller.
Note
Refer to the Databricks access token documentation to understand how to set up the access token.
kubectl edit secret -n flyte flyte-secret-auth
The configuration will look as follows:
apiVersion: v1
data:
  FLYTE_DATABRICKS_API_TOKEN: <ACCESS_TOKEN>
  client_secret: Zm9vYmFy
kind: Secret
metadata:
  annotations:
    meta.helm.sh/release-name: flyte
    meta.helm.sh/release-namespace: flyte
...
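Replace <ACCESS_TOKEN> with your access token, base64-encoded. Kubernetes expects every value under the data field of a Secret to be base64-encoded, as the existing client_secret value is.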
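Values under data in a Kubernetes Secret are base64-encoded; the client_secret value above, Zm9vYmFy, decodes to foobar. The following is a minimal sketch of producing the encoded form of a token. The token string is a made-up placeholder, and the variable names are chosen only for illustration.

```shell
# Placeholder token for illustration only; substitute your real access token.
DATABRICKS_TOKEN='dapi-example-token'

# printf avoids the trailing newline that echo would add to the encoded value.
# tr strips any line wrapping that base64 may insert for long inputs.
ENCODED_TOKEN="$(printf '%s' "$DATABRICKS_TOKEN" | base64 | tr -d '\n')"

# Paste this value in place of <ACCESS_TOKEN>.
echo "$ENCODED_TOKEN"

# Round-trip check: decoding should print the original token.
printf '%s' "$ENCODED_TOKEN" | base64 --decode
echo
```

Encoding with printf rather than echo matters here: a stray newline inside the encoded value would become part of the token that FlytePropeller reads.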
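Upgrade the Flyte Helm release. The release name, flyte, matches the meta.helm.sh/release-name annotation shown in the secret.
helm upgrade flyte flyteorg/flyte-core -n flyte -f https://raw.githubusercontent.com/flyteorg/flyte/master/charts/flyte-core/values-sandbox.yaml -f values-override.yaml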