Databricks_tags are set every run #1288

@SpaltmanDionne

Describe the feature

Currently, all databricks_tags are applied on every dbt run, which adds significantly to the run time. In our environment, setting a single tag takes around 300 milliseconds. We want to tag every column in our dbt project, and setting tags for that many columns does not scale well: the production workload takes longer, and it slows down developers who want to test their code.
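To give a feel for how this adds up, here is a rough back-of-the-envelope estimate. Only the 300 ms per tag comes from our measurements; the project size (200 models with 25 columns each, one tag per column) is a made-up illustration, and the estimate assumes tags are set one after another.

```python
def estimated_tagging_seconds(num_tags: int, seconds_per_tag: float = 0.3) -> float:
    """Estimate the total time spent setting tags one at a time."""
    return num_tags * seconds_per_tag


# Hypothetical project size, for illustration only.
models = 200
columns_per_model = 25

total_seconds = estimated_tagging_seconds(models * columns_per_model)
print(f"{total_seconds / 60:.0f} minutes spent on tagging per run")
```

Under these assumptions the tagging alone would take longer than many full dbt runs, and the cost is paid again on every run even when no tag has changed.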

We would like to know whether it is possible to apply only new or changed databricks_tags instead of all of them. As can be seen in the adapter, all tags are set every time a table is created. The apply_tags macro has no option to skip tags that have already been set, even though existing tags are retained on the table.

We do see that the existing tags are already being fetched, but we do not see the result being used yet. Is functionality to avoid resetting all tags on every run on the maintainers' roadmap for this adapter?
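The behaviour we have in mind could look roughly like the sketch below. This is not the adapter's code: the function name `tags_to_apply` and the plain-dict representation of tags are our own assumptions, used only to show the comparison between the tags that were fetched and the tags configured in the project.

```python
def tags_to_apply(current: dict[str, str], desired: dict[str, str]) -> dict[str, str]:
    """Return only the tags that are new or whose value has changed.

    `current` stands for the tags already present on the relation
    (for example, the ones the adapter already fetches); `desired`
    stands for the tags configured in the dbt project.
    Both names are hypothetical and used for illustration only.
    """
    return {
        key: value
        for key, value in desired.items()
        if current.get(key) != value
    }


existing = {"pii": "true", "owner": "team_a"}
configured = {"pii": "true", "owner": "team_b", "domain": "sales"}

# Only "owner" (changed) and "domain" (new) need a SET TAGS statement;
# "pii" is unchanged and can be skipped.
print(tags_to_apply(existing, configured))
```

With a comparison like this, a run in which no tags changed would issue no tag statements at all, so the cost would scale with the number of changes rather than with the number of columns.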

System information

The output of dbt --version:

Core:
  - installed: 1.10.13
  - latest:    1.11.2  - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.11.0 - Update available!
  - spark:      1.9.3  - Update available!

  At least one plugin is out of date with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

The operating system you're using:

macOS

The output of python --version:

Python 3.12.12

The dbt-databricks version we're using:

dbt-databricks==1.10.4

Labels: enhancement (New feature or request)