DatabricksConnector

With DatabricksConnector, you can integrate metadata from Databricks Unity Catalog directly into dataspot. Data assets, their structures and the data lineage of the Databricks platform are transferred – even at column level.

DatabricksConnector is an ETL application for businesses and is provided as a standalone Java package.

DatabricksConnector connects directly to the Databricks Unity Catalog and

  • extracts metadata and lineage information from the workspace,
  • applies filters and rules,
  • transforms the raw data into assets for the destination application,
  • uploads metadata using the upload API.
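
The four steps above can be pictured as a small pipeline. The following is only a minimal sketch in Java, not the connector's actual implementation or API: the class `PipelineSketch`, the records `RawColumn` and `Asset`, the sample data and the `tmp_.*` exclusion pattern are all hypothetical, and the upload step is replaced by collecting the results in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Illustrative sketch only: every name here is hypothetical and
// is NOT part of the DatabricksConnector API.
final class PipelineSketch {

    /** Raw column metadata as it might arrive from the source catalog (assumed shape). */
    record RawColumn(String catalog, String schema, String table, String column, String dataType) {}

    /** Transformed asset in the shape a destination application might expect (assumed shape). */
    record Asset(String qualifiedName, String type) {}

    // Step 1: extract. A real connector would query the catalog; this returns fixed sample data.
    static List<RawColumn> extract() {
        return List.of(
            new RawColumn("main", "sales", "orders", "order_id", "BIGINT"),
            new RawColumn("main", "sales", "orders", "amount", "DECIMAL(10,2)"),
            new RawColumn("main", "tmp_scratch", "debug", "payload", "STRING"));
    }

    // Step 2: filter. Builds a rule that drops every column whose schema name matches the regex.
    static Predicate<RawColumn> excludeSchemas(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return column -> !pattern.matcher(column.schema()).matches();
    }

    // Step 3: transform. Maps a raw record to an asset with a fully qualified name.
    static Asset transform(RawColumn c) {
        String name = String.join(".", c.catalog(), c.schema(), c.table(), c.column());
        return new Asset(name, c.dataType());
    }

    // Step 4: upload. Stand-in for an upload API call: the assets are only collected in a list.
    static List<Asset> run(Predicate<RawColumn> filter) {
        List<Asset> uploaded = new ArrayList<>();
        extract().stream()
                 .filter(filter)
                 .map(PipelineSketch::transform)
                 .forEach(uploaded::add);
        return uploaded;
    }

    public static void main(String[] args) {
        run(excludeSchemas("tmp_.*")).forEach(System.out::println);
    }
}
```

Keeping the filter as a `Predicate` means several rules can be combined with `and` / `or` before the transform step runs.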

[Figure: Screenshot of the DatabricksConnector in the dataspot. software]

Functions

  • Dump and restore functions
  • Embedded working database for temporary storage of entities during processing
  • Advanced metadata and pattern filters
  • Automatic planning and execution of services
  • Standardised authentication (OAuth 2.0, basic authentication, tokens, key files, etc.)
  • Intelligent, agent-based matching mode
  • Workflow support

Architecture

The software package contains all necessary components: executable files, resources, configurations, third-party libraries, an embedded database system and a web server. No external components are required to get started.

Plug & play for your metadata processing.

Performance & Scalability

DatabricksConnector requires little memory: it processes metadata without ever loading the entire data set into memory, so the volume it can handle is not bounded by available RAM. It uses an embedded database with:

  • a landing repository for raw data
  • a staging repository for transformed entities prior to final upload

And all this without the need to manage external databases.
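
To illustrate why a landing and a staging repository keep memory use low, here is a minimal sketch in Java. It is an assumption-laden stand-in, not the connector's implementation: two temporary files play the role of the embedded database, the class `StagingSketch` and its methods are hypothetical, and the "transformation" is a placeholder upper-casing. The point is only that each record is written and read one at a time, so memory use does not grow with the number of records.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.Locale;
import java.util.stream.IntStream;

// Illustrative sketch only: temp files stand in for the embedded database,
// and all names are hypothetical, not part of the DatabricksConnector API.
final class StagingSketch {

    // Landing: append raw records to disk as they arrive, one record at a time.
    static long land(Iterator<String> source, Path landing) throws IOException {
        long count = 0;
        try (BufferedWriter out = Files.newBufferedWriter(landing)) {
            while (source.hasNext()) {
                out.write(source.next());
                out.newLine();
                count++;
            }
        }
        return count;
    }

    // Staging: stream landed records back, transform each one, and write it to the staging store.
    static long stage(Path landing, Path staging) throws IOException {
        long count = 0;
        try (BufferedReader in = Files.newBufferedReader(landing);
             BufferedWriter out = Files.newBufferedWriter(staging)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line.toUpperCase(Locale.ROOT)); // placeholder transformation
                out.newLine();
                count++;
            }
        }
        return count;
    }

    // Runs both phases for n generated records and returns how many records were staged.
    static long roundTrip(int n) {
        try {
            Path landing = Files.createTempFile("landing", ".txt");
            Path staging = Files.createTempFile("staging", ".txt");
            try {
                // The source is a lazy iterator, so the records are never all held in memory.
                Iterator<String> source = IntStream.range(0, n)
                        .mapToObj(i -> "main.sales.table_" + i)
                        .iterator();
                land(source, landing);
                return stage(landing, staging);
            } finally {
                Files.deleteIfExists(landing);
                Files.deleteIfExists(staging);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("staged records: " + roundTrip(100_000));
    }
}
```

Separating the two stores also means a failed transformation can be retried from the landed raw data without extracting from the source again.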

Fast, scalable metadata integration with minimal resource requirements – completely without external dependencies.
