Discover data with Catalog
With Catalog, you can view your project's resources (such as models, tests, and metrics), their lineage, and model consumption to gain a better understanding of its latest production state.
Use Catalog to navigate and manage your projects within dbt to help you and other data developers, analysts, and consumers discover and leverage your dbt resources. Catalog integrates with the Studio IDE, dbt Insights, Orchestrator, and Canvas to help you develop or view your dbt resources.
Prerequisites
- You have a dbt account on the Starter, Enterprise, or Enterprise+ plan.
- You have set up a production or staging deployment environment for each project you want to explore.
- You have at least one successful job run in the deployment environment. Note that CI jobs do not update Catalog.
- You are on the Catalog page. To do this, select Explore from the navigation in dbt.
Generate metadata
Catalog uses the metadata provided by the Discovery API to display the details about the state of your dbt project. The metadata that's available depends on the deployment environment you've designated as production or staging in your dbt project.
Catalog also allows you to ingest external metadata from Snowflake, giving you visibility into tables, views, and other resources that aren't defined in dbt.
dbt metadata
If you're using a hybrid project setup and uploading artifacts from dbt Core, make sure to follow the setup instructions to connect your project in dbt. This enables Catalog to access and display your metadata correctly.
- To ensure all metadata is available in Catalog, run dbt build and dbt docs generate as part of your job in your production or staging environment. Running those two commands ensures all relevant metadata (like lineage, test results, documentation, and more) is available in Catalog.
- Catalog automatically retrieves the metadata updates after each job run in the production or staging deployment environment so it always has the latest results for your project. This includes deploy and merge jobs.
- Note that CI jobs don't update Catalog. This is because they don't reflect the production state and don't provide the necessary metadata updates.
- To view a resource and its metadata, you must define the resource in your project and run a job in the production or staging environment.
- The resulting metadata depends on the commands executed by the jobs.
Note that Catalog automatically deletes stale metadata after 3 months if no jobs were run to refresh it. To avoid this, make sure you schedule jobs to run more frequently than 3 months with the necessary commands.
To view in Catalog | You must successfully run |
---|---|
All metadata | dbt build, dbt docs generate, and dbt source freshness together as part of the same job in the environment |
Model lineage, details, or results | dbt run or dbt build on a given model within a job in the environment |
Columns and statistics for models, sources, and snapshots | dbt docs generate within a job in the environment |
Test results | dbt test or dbt build within a job in the environment |
Source freshness results | dbt source freshness within a job in the environment |
Snapshot details | dbt snapshot or dbt build within a job in the environment |
Seed details | dbt seed or dbt build within a job in the environment |
Richer and more timely metadata will become available as dbt evolves.
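As a rough illustration of the table above, the following Python sketch checks which kinds of metadata a job's command list would leave unpopulated. The dictionary keys and the function name are hypothetical labels chosen for this example, and commands are matched as exact strings, so flags such as a model selection are not handled.

```python
# Hypothetical labels for the rows of the table above, mapped to the dbt
# commands that can populate each kind of metadata.
REQUIRED_COMMANDS = {
    "model": {"dbt run", "dbt build"},
    "columns_and_statistics": {"dbt docs generate"},
    "tests": {"dbt test", "dbt build"},
    "source_freshness": {"dbt source freshness"},
    "snapshots": {"dbt snapshot", "dbt build"},
    "seeds": {"dbt seed", "dbt build"},
}


def missing_metadata(job_commands):
    """Return the metadata kinds that none of the job's commands would populate."""
    ran = set(job_commands)
    return sorted(
        kind
        for kind, options in REQUIRED_COMMANDS.items()
        if not (ran & options)  # no overlap means nothing in the job covers this kind
    )


# A job that only runs dbt build still lacks column statistics and source freshness.
print(missing_metadata(["dbt build"]))
```

Under these assumptions, a job that runs dbt build, dbt docs generate, and dbt source freshness together leaves nothing missing, which matches the "All metadata" row.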
If your organization works in both dbt Core and Cloud, you can unify these workflows by automatically uploading dbt Core artifacts into dbt Cloud and viewing them in Catalog for a more connected dbt experience. To learn more, visit hybrid projects.
External metadata ingestion (preview)
With external metadata ingestion, Catalog connects directly to your data warehouse, giving you visibility into tables, views, and other resources that aren't defined in dbt.
We create dbt metadata and pull external metadata.
Catalog overview
Catalog lets you widen your search to dbt resources (models, seeds, snapshots, sources, exposures, and more) across your entire account. This broadens the results returned and gives you greater insight into all the assets across your dbt projects.
To enable global navigation:
- Have a developer license with Owner permissions.
- Navigate to your account settings in your dbt account and check the box to Enable dbt Catalog's Global Navigation.
Navigate the Catalog overview page to access your project's resources and metadata. The page includes the following sections:
- Search bar — Search for resources in your project by keyword. You can also use filters to refine your search results.
- Sidebar — Use the left sidebar to access model performance and project recommendations in the Project details section. Browse your project's resources, file tree, and database in the lower section of the sidebar.
- Find your project recommendations within your project's landing page.
- Lineage graph — Explore your project's or account's lineage graph to visualize the relationships between resources.
- Latest updates — View the latest changes or issues related to your project's resources, including the most recent job runs, changed properties, lineage, and issues.
- Marts and public models — View the marts and public models in your project. You can also navigate to all public models in your account through this view.
- Model query history — Use model query history to track consumption queries on your models for deeper insights.
- Visualize downstream exposures — Set up and visualize downstream exposures to automatically expose relevant data models from Tableau to enhance visibility.
- Data health signals — View the data health signals for each resource to understand its health and performance.
Catalog permissions
When using global navigation and searching across your projects, the following permissions apply.
- Your project access permissions determine which dbt projects appear in the left-hand menu of the global navigation.
- Catalog searches use soft access controls: you'll see all matching resources in search results, with clear indicators for items you don't have access to.
- For external metadata, the global platform credential controls which resources metadata users can discover. See External metadata ingestion for more details.
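The soft access control behavior described above can be pictured with a small Python sketch. This is only an illustration of the idea, not how Catalog implements it; the SearchHit class, the search function, and the sample project names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    resource: str
    project: str
    accessible: bool  # False means the hit is shown with a "no access" indicator


def search(resources, keyword, user_projects):
    """Return every matching resource, flagging the ones the user can't open.

    resources: iterable of (resource_name, project_name) tuples.
    user_projects: set of project names the user has access to.
    """
    hits = []
    for name, project in resources:
        if keyword.lower() in name.lower():
            # Soft access control: keep the hit either way, only mark access.
            hits.append(SearchHit(name, project, project in user_projects))
    return hits


resources = [
    ("customers", "jaffle_shop"),
    ("customer_orders", "finance"),
    ("payments", "finance"),
]
for hit in search(resources, "customer", {"jaffle_shop"}):
    print(hit)
```

The point of the sketch is that matches are never dropped based on permissions; they are only annotated.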
If you enjoy video courses, check out our dbt Catalog on-demand course and learn how to best explore your dbt project(s)!
Explore your project's lineage graph
Catalog provides a visualization of your project's DAG that you can interact with. To access the project's full lineage graph, select Overview in the left sidebar and click the Explore Lineage button on the main (center) section of the page.
If you don't see the project lineage graph immediately, click Render Lineage. It can take some time for the graph to render depending on the size of your project and your computer's available memory. The graph of a very large project might not render at all; in that case, use selectors to display a subset of nodes instead.
The nodes in the lineage graph represent the project's resources and the edges represent the relationships between the nodes. Nodes are color-coded and include iconography according to their resource type.
By default, Catalog shows the project's applied state lineage. That is, it shows models that have been successfully built and are available to query, not just the models defined in the project.
To explore the lineage graphs of tests and macros, view their resource details pages. By default, Catalog excludes these resources from the full lineage graph unless a search query returns them as results.
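The default exclusion of tests and macros can be sketched as a simple filter. The following Python snippet is an illustration under assumed names (EXCLUDED_BY_DEFAULT, visible_nodes, and the sample nodes are hypothetical), not Catalog's actual logic.

```python
# Resource types left out of the full lineage graph unless a search returns them.
EXCLUDED_BY_DEFAULT = {"test", "macro"}


def visible_nodes(nodes, search_results=()):
    """Return the node names to draw in the full lineage graph.

    nodes: dict mapping node name to resource type.
    search_results: names returned by a search query, which are always shown.
    """
    matched = set(search_results)
    return [
        name
        for name, resource_type in nodes.items()
        if resource_type not in EXCLUDED_BY_DEFAULT or name in matched
    ]


nodes = {
    "stg_orders": "model",
    "orders": "model",
    "not_null_orders_id": "test",
    "cents_to_dollars": "macro",
}
print(visible_nodes(nodes))                          # models only
print(visible_nodes(nodes, ["not_null_orders_id"]))  # the matched test is included
```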
Example of full lineage graph
Example of exploring a model in the project's lineage graph:
Lenses
The Lenses feature is available from your project's lineage graph (lower right corner). Lenses are like map layers for your DAG. Lenses make it easier to understand your project's contextual metadata at scale, especially to distinguish a particular model or a subset of models.
When you apply a lens, tags become visible on the nodes in the lineage graph, indicating the layer value along with coloration based on that value. If you're significantly zoomed out, only the tags and their colors are visible in the graph.
Lenses are helpful to analyze a subset of the DAG if you're zoomed in, or to find models/issues from a larger vantage point.
Example of lenses
Example of applying the Materialization type lens with the lineage graph zoomed out. In this view, each model name has a color according to the materialization type legend at the bottom, which specifies the materialization type. This color-coding helps to quickly identify the materialization types of different models.
Example of applying the Tests Status lens, where each model name displays the tests status according to the legend at the bottom, which specifies the test status.
Keyword search
With Catalog, global navigation provides a search experience allowing you to find dbt resources across all your projects, as well as non-dbt resources in Snowflake.
You can locate resources in your project by performing a keyword search in the search bar. All resource names, column names, resource descriptions, warehouse relations, and code matching your search criteria will be displayed as a list on the main (center) section of the page. When searching for an exact column name, the results show all relational nodes containing that column in their schemas. If there's a match, a notice in the search result indicates the resource contains the specified column. Also, you can apply filters to further refine your search results.
Example of keyword search
Example of results from searching on the keyword customers and applying the filters models, description, and code. Data health signals are visible to the right of the model name in the search results.
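To make the matching rules for keyword search more concrete, here is a minimal Python sketch. It assumes each resource is a plain dictionary with name, description, code, and columns keys (a hypothetical shape chosen for this example) and leaves out warehouse relations and filters for brevity. Text fields match on substrings, while columns match only on the exact column name.

```python
def keyword_search(resources, keyword):
    """Return matching resources, noting whether a column matched exactly."""
    needle = keyword.lower()
    results = []
    for resource in resources:
        # Substring match against name, description, and code.
        text_match = any(
            needle in resource.get(field, "").lower()
            for field in ("name", "description", "code")
        )
        # Exact (case-insensitive) match against column names.
        column_match = needle in [c.lower() for c in resource.get("columns", [])]
        if text_match or column_match:
            results.append({"name": resource["name"], "contains_column": column_match})
    return results


resources = [
    {"name": "customers", "description": "One row per customer",
     "code": "select * from stg_customers", "columns": ["customer_id", "first_name"]},
    {"name": "orders", "description": "Orders",
     "code": "select * from stg_orders", "columns": ["order_id", "customer_id"]},
    {"name": "payments", "description": "", "code": "select 1", "columns": ["payment_id"]},
]
print(keyword_search(resources, "customer_id"))
```

The contains_column flag plays the role of the notice in the search results that indicates a resource contains the specified column.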
Browse with the sidebar
From the sidebar, you can browse your project's resources, its file tree, and the database.
- Resources tab — All resources in the project organized by type. Select any resource type in the list and all those resources in the project will display as a table in the main section of the page. For a description of the different resource types (like models, metrics, and so on), refer to About dbt projects.
- Data health signals are visible to the right of the resource name under the Health column.
- File Tree tab — All resources in the project organized by the file in which they are defined. This mirrors the file tree in your dbt project repository.
- Database tab — All resources in the project organized by the database and schema in which they are built. This mirrors your data platform's structure that represents the applied state of your project.
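The three tabs can be thought of as three groupings of the same resource list. The Python sketch below illustrates that idea only; the resource dictionaries, their keys, and the group_by helper are hypothetical and do not reflect Catalog's internal data model.

```python
import posixpath
from collections import defaultdict


def group_by(resources, key):
    """Group resource names by the value that key(resource) returns."""
    groups = defaultdict(list)
    for resource in resources:
        groups[key(resource)].append(resource["name"])
    return dict(groups)


resources = [
    {"name": "stg_orders", "type": "model", "path": "models/staging/stg_orders.sql",
     "database": "analytics", "schema": "staging"},
    {"name": "orders", "type": "model", "path": "models/marts/orders.sql",
     "database": "analytics", "schema": "marts"},
    {"name": "raw_orders", "type": "seed", "path": "seeds/raw_orders.csv",
     "database": "analytics", "schema": "staging"},
]

by_type = group_by(resources, lambda r: r["type"])                          # Resources tab
by_folder = group_by(resources, lambda r: posixpath.dirname(r["path"]))     # File Tree tab
by_relation = group_by(resources, lambda r: (r["database"], r["schema"]))   # Database tab
print(by_type, by_folder, by_relation, sep="\n")
```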
Integrated tool access
Users with a developer license or an analyst seat can open a resource directly from the Catalog in the Studio IDE to view its model files, in Insights to query it, or in Canvas for visual editing.
View model versions
If models in the project are versioned, you can see which version of the model is being applied (prerelease, latest, or old) in the title of the model's details page and in the model list from the sidebar.
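One way to read the three labels is relative to the model's designated latest version: versions above it are prereleases and versions below it are old. The Python sketch below shows that comparison, assuming numeric version identifiers; the version_status function is a hypothetical helper, not part of dbt.

```python
def version_status(version, latest_version):
    """Label a model version relative to the designated latest version.

    Assumes versions are numbers that can be compared directly.
    """
    if version > latest_version:
        return "prerelease"
    if version == latest_version:
        return "latest"
    return "old"


# With version 2 designated as latest:
for v in (1, 2, 3):
    print(v, version_status(v, latest_version=2))
```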
View resource details
You can view the definition and latest run results of any resource in your project. To find a resource and view its details, you can interact with the lineage graph, use search, or browse the Catalog.
The details (metadata) available to you depend on the resource's type, its definition, and the commands that run within jobs in the production environment.
In the upper right corner of the resource details page, you can:
- Click the Open in Studio IDE icon to examine the resource using the Studio IDE.
- Click the Share icon to copy the page's link to your clipboard.
Example of model details
Staging environment
Catalog supports views for staging deployment environments, in addition to the production environment. This gives you a unique view into your pre-production data workflows, with the same tools available in production, while providing an extra layer of scrutiny.
You can explore the metadata from your production or staging environment to inform your data development lifecycle. Set a single environment per dbt project as "production" or "staging" and ensure the proper metadata has been generated; you can then view it in Catalog. Refer to Generate metadata for more details.