
Databricks and Apache Iceberg

Databricks is built on Delta Lake and stores data in the Delta table format. Databricks does not support writing to Iceberg catalogs. However, it does support reading from external Iceberg catalogs and creating tables, stored as Delta tables, that Iceberg clients can read.

When a dbt model is configured with the UniForm table properties, Databricks generates Iceberg-compatible metadata alongside the Delta metadata. This allows external Iceberg compute engines to read tables from Unity Catalog.

Example SQL:

{{ config(
    tblproperties={
      'delta.enableIcebergCompatV2': 'true',
      'delta.universalFormat.enabledFormats': 'iceberg'
    }
) }}
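Once the model is built, you can confirm that UniForm generated Iceberg metadata by inspecting the table properties in Databricks SQL. A minimal sketch; the three-part table name is a placeholder for your UniForm-enabled table:

```sql
-- 'my_catalog.my_schema.my_model' is a placeholder for your table.
SHOW TBLPROPERTIES my_catalog.my_schema.my_model;

-- In the output, look for:
--   delta.universalFormat.enabledFormats = iceberg
```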

To set up Databricks for reading and querying external tables, configure Lakehouse Federation and establish the catalog as a foreign catalog. This will be configured outside of dbt, and once completed, it will be another database you can query.
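Once the foreign catalog exists, its tables can be referenced like any other database in a dbt model. A minimal sketch, assuming a foreign catalog named `my_iceberg_catalog` with a schema `analytics` and a table `orders` (all three names are placeholders for your own):

```sql
-- iceberg_source_model.sql (hypothetical model name)
-- Reads from a table in the federated (foreign) Iceberg catalog.
select *
from my_iceberg_catalog.analytics.orders
```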

We do not currently support the Private Preview features of Databricks-managed Iceberg tables.

dbt Catalog Integration Configurations for Databricks

The following table outlines the configuration fields required to set up a catalog integration for Iceberg-compatible tables in Databricks.

| Field | Description | Required | Accepted values |
|-------|-------------|----------|-----------------|
| `name` | Name of the catalog in Databricks | yes | e.g., `my_unity_catalog` |
| `catalog_type` | Type of catalog | yes | `unity`, `hive_metastore` |
| `external_volume` | Storage location of your data | optional | See Databricks documentation |
| `table_format` | Table format your dbt models will be materialized as. Defaults to `delta` unless overridden in your Databricks account. | optional | `default`, `iceberg` |
| `adapter_properties` | Additional platform-specific properties | optional | See below for accepted values |

Here are the optional fields for `adapter_properties`:

| Field | Description | Required | Accepted values |
|-------|-------------|----------|-----------------|
| `file_format` | File format for your dbt models. Defaults to `delta` unless overridden in your Databricks account. | optional | `delta` (default), `parquet`, `hudi` |

Example:

adapter_properties:
  file_format: parquet

Configure catalog integration for managed Iceberg tables

  1. Create a catalogs.yml at the top level of your dbt project (at the same level as dbt_project.yml)

    An example using Unity Catalog as the catalog:

catalogs:
  - name: unity_catalog
    active_write_integration: unity_catalog_integration
    write_integrations:
      - name: unity_catalog_integration
        table_format: iceberg
        catalog_type: unity

  2. Apply the catalog configuration at either the model, folder, or project level.

    An example of iceberg_model.sql:

{{
  config(
    materialized='table',
    catalog='unity_catalog'
  )
}}

select * from {{ ref('jaffle_shop_customers') }}

  3. Execute the model with dbt run -s iceberg_model.