Describe the feature
Currently, all databricks-tags are applied on every dbt run, which adds significantly to the run time. In our case, setting a single tag takes around 300 milliseconds. We want to tag every column in our dbt project, and setting tags for that many columns does not scale well: the production workload takes longer, and it slows down developers who want to test their code.
We would like to know whether it is possible to apply only new or changed databricks-tags instead of all of them. As can be seen in the adapter, all tags are set every time a table is created. The apply_tags macro has no option to skip tags that have already been set (even though the tags persist between runs).
We do see that the tags are already being fetched, but we do not see the fetched tags being used yet. Is functionality that avoids resetting all tags on every run on the maintainers' roadmap for this adapter?
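To illustrate the behaviour we have in mind, here is a minimal sketch in Python of the comparison step. It assumes the already-fetched tags and the configured tags are both available as plain name-to-value dictionaries. The function name `tags_to_apply` and the example tag names are hypothetical and are not part of the adapter.

```python
from typing import Dict


def tags_to_apply(existing: Dict[str, str], desired: Dict[str, str]) -> Dict[str, str]:
    """Return only the desired tags that are new or whose value has changed.

    Hypothetical helper for illustration; not an existing dbt-databricks function.
    """
    return {
        name: value
        for name, value in desired.items()
        if name not in existing or existing[name] != value
    }


# Example: tags fetched from the table versus tags configured in the dbt project.
existing_tags = {"pii": "true", "owner": "finance"}
desired_tags = {"pii": "true", "owner": "sales", "domain": "billing"}

changed = tags_to_apply(existing_tags, desired_tags)
print(changed)  # {'owner': 'sales', 'domain': 'billing'}
```

With a check like this, the unchanged `pii` tag would be skipped, so only two of the three tags would need a statement. At roughly 300 milliseconds per tag, a run in which no tags changed would spend no time on tagging.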
System information
The output of dbt --version:
Core:
- installed: 1.10.13
- latest: 1.11.2 - Update available!
Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- databricks: 1.11.0 - Update available!
- spark: 1.9.3 - Update available!
At least one plugin is out of date with dbt-core.
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
The operating system you're using:
MacOS
The output of python --version:
Python 3.12.12
The dbt-databricks version we're using:
dbt-databricks==1.10.4