Databricks
Databricks is a unified data analytics platform that provides a collaborative environment for data scientists, engineers, and business analysts. It is built on top of Apache Spark and provides a fully managed, scalable, and secure cloud-based platform for big data analytics.
Setup Guide for Databricks in Weld
Weld integrates with Databricks directly through Unity Catalog. To establish the connection, Weld uses a machine-to-machine (M2M) OAuth application backed by a Databricks service principal.
Step 1: Configure Databricks service principal

- In your Databricks deployment, go to Settings -> Identity and access -> Service principals -> Manage.
- Click Add service principal and select Add new. Choose Databricks managed, enter a name (e.g. weld-service-principal), and click Add.
- Once the service principal has been added, open its page. Click Secrets and generate a new secret. Note down the Client ID and Secret; you will need them during setup in Weld.
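If you want to check the Client ID and Secret before entering them in Weld, the sketch below builds an OAuth client-credentials request for them. It is an illustration only, not a description of how Weld connects: it assumes the workspace-level token endpoint `/oidc/v1/token` and the `all-apis` scope described in the Databricks OAuth M2M documentation, and the host and credential values are placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(workspace_host, client_id, client_secret):
    """Build (but do not send) an OAuth client-credentials token request."""
    # The service principal's Client ID and Secret go into a Basic auth header.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    auth_header = "Basic " + base64.b64encode(credentials).decode("ascii")

    # Form-encoded body for the client credentials grant.
    # Assumption: the "all-apis" scope, as used in the Databricks docs.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")

    return urllib.request.Request(
        # Assumption: workspace-level token endpoint.
        url=f"https://{workspace_host}/oidc/v1/token",
        data=body,
        headers={
            "Authorization": auth_header,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_access_token(request):
    """Send the request and return the access token (needs network access)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]


if __name__ == "__main__":
    # Placeholder values; replace them with the ones noted down in Step 1.
    request = build_token_request(
        "example.cloud.databricks.com", "my-client-id", "my-secret"
    )
    print(request.get_method(), request.full_url)
```

Calling `fetch_access_token(request)` with real values should return a token if the service principal and secret are valid; an HTTP error at this point usually means the Client ID or Secret was copied incorrectly.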
Step 2: Set up the SQL Warehouse

- In the Databricks console, go to SQL -> Create -> SQL Warehouse.
- We recommend starting with the 2X-Small size and scaling up as your workload increases.
- Set the timeout to 5 minutes and choose the warehouse type you want.
- Go to the connection details and note down the Server hostname and HTTP path. You will need those when configuring the connection in Weld.
- Go to Permissions, add your weld-service-principal, and set its permission to Can use.
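The Server hostname and HTTP path are easy to mangle when copying (a stray `https://` prefix, a missing leading slash). The helper below is a minimal sketch for tidying the two values before pasting them into Weld. The names `WarehouseConnection` and `normalize_connection` are hypothetical, and the check that the HTTP path starts with `/sql/` is an assumption based on the usual `/sql/1.0/warehouses/<id>` form.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WarehouseConnection:
    """The two connection details noted down in Step 2."""
    server_hostname: str
    http_path: str


def normalize_connection(server_hostname, http_path):
    """Clean up copied connection details and reject obviously wrong values."""
    host = server_hostname.strip()
    # Accept a pasted URL such as "https://host/" and keep only the hostname.
    if "://" in host:
        host = urlparse(host).hostname or ""
    host = host.rstrip("/")
    if not host or "/" in host:
        raise ValueError(f"not a bare hostname: {server_hostname!r}")

    path = http_path.strip()
    if not path.startswith("/"):
        path = "/" + path
    # Assumption: SQL warehouse HTTP paths start with "/sql/".
    if not path.startswith("/sql/"):
        raise ValueError(f"unexpected HTTP path: {http_path!r}")

    return WarehouseConnection(server_hostname=host, http_path=path)


if __name__ == "__main__":
    # Placeholder values for illustration.
    connection = normalize_connection(
        "https://example.cloud.databricks.com/", "sql/1.0/warehouses/abc123"
    )
    print(connection)
```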
Step 3: Set up the Unity Catalog

- In the Databricks console, go to Catalog -> + -> Create a catalog.
- Enter the name of the catalog (weld) and choose a storage location.
- Go to your newly created catalog and set permissions for the weld-service-principal:
  - Prerequisite: USE SCHEMA
  - Create: CREATE VOLUME
  - Edit: WRITE VOLUME
  - Read: EXECUTE, READ VOLUME, SELECT
- Create a new schema.
- Go to the newly created schema and create a new volume.
- Copy the URL shown under the Description. It will look something like this:
  /Volumes/weld_databricks/weld_catalog/weld_volume
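The permissions from this step can also be granted with a SQL statement instead of the UI, and the copied volume URL can be sanity-checked before use. The sketch below generates such a `GRANT` statement and splits a volume path into its parts. It rests on two assumptions: that the service principal is referenced in SQL by an identifier such as its application ID (the `principal` argument is a placeholder), and that volume paths follow the `/Volumes/<catalog>/<schema>/<volume>` layout. The function names are hypothetical.

```python
from typing import NamedTuple

# Privileges listed in Step 3 for the Weld service principal.
CATALOG_PRIVILEGES = (
    "USE SCHEMA",
    "CREATE VOLUME",
    "WRITE VOLUME",
    "EXECUTE",
    "READ VOLUME",
    "SELECT",
)


class VolumePath(NamedTuple):
    catalog: str
    schema: str
    volume: str


def parse_volume_path(path):
    """Split '/Volumes/<catalog>/<schema>/<volume>' into its three names."""
    parts = path.strip().strip("/").split("/")
    if len(parts) != 4 or parts[0] != "Volumes" or not all(parts[1:]):
        raise ValueError(f"not a volume path: {path!r}")
    return VolumePath(*parts[1:])


def grant_statement(catalog, principal):
    """Build a GRANT statement for the Step 3 privileges on a catalog."""
    privileges = ", ".join(CATALOG_PRIVILEGES)
    # Assumption: the principal is quoted with backticks, e.g. an application ID.
    return f"GRANT {privileges} ON CATALOG `{catalog}` TO `{principal}`"


if __name__ == "__main__":
    print(parse_volume_path("/Volumes/weld_databricks/weld_catalog/weld_volume"))
    # "my-application-id" is a placeholder for the service principal identifier.
    print(grant_statement("weld", "my-application-id"))
```

The generated statement can be run in the Databricks SQL editor by a user who is allowed to manage permissions on the catalog.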
You now have all the settings you need to set up Weld with Databricks and start syncing your data.