Commit 18376fe
feat(ol-dbt-cli): add SQL/YAML linting and column-level impact analysis CLI
Adds a new `ol-dbt` CLI package at `src/ol_dbt_cli/` with two commands:
## `ol-dbt validate`
Structural consistency checks for dbt models:
1. **YAML/SQL sync** — warns when columns declared in a model's YAML schema
are absent from its compiled SELECT output, or vice-versa.
2. **Missing upstream refs** — flags `ref()` or `source()` calls that
resolve to models/sources not present in the project.
3. **Upstream column existence** — validates that columns referenced from
an upstream model actually exist in that model's output.
4. **Duplicate column aliases** — detects accidental repeated alias names
within a single model.
5. **SELECT \*** — INFO-level notice for unresolved star expansions;
`--warn-select-star` promotes these to warnings.
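
Checks 1 and 4 above can be sketched with plain set and counter operations. This is a minimal illustration, not the code in `src/ol_dbt_cli/`: the function name `check_columns`, the `(level, message)` tuple shape, the case-insensitive comparison, and the `ERROR` level for duplicate aliases are all assumptions.

```python
from collections import Counter


def check_columns(yaml_columns, sql_columns):
    """Compare YAML-declared columns with SELECT output columns for one model.

    Returns a list of (level, message) tuples. Hypothetical helper: the names,
    levels, and message wording are illustrative assumptions.
    """
    findings = []
    yaml_set = {c.lower() for c in yaml_columns}
    sql_set = {c.lower() for c in sql_columns}

    # Check 1: YAML/SQL sync in both directions.
    for name in sorted(yaml_set - sql_set):
        findings.append(("WARN", f"'{name}' is declared in YAML but not selected"))
    for name in sorted(sql_set - yaml_set):
        findings.append(("WARN", f"'{name}' is selected but not declared in YAML"))

    # Check 4: the same alias appearing more than once in one SELECT.
    counts = Counter(c.lower() for c in sql_columns)
    for name, n in sorted(counts.items()):
        if n > 1:
            findings.append(("ERROR", f"alias '{name}' appears {n} times"))
    return findings
```

Calling `check_columns(["id", "email"], ["id", "name", "name"])` yields one finding for `email`, one for `name` being undeclared, and one for the repeated `name` alias.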
Options:
- `--model/-m` (repeatable): filter to one or more model names, comma-separated
lists, or directory paths under `models/`
- `--errors-only`: suppress INFO/WARN output, show only errors
- `--warn-select-star`: treat unresolved SELECT * as warnings
- `--compiled-dir`: point at `target/compiled/` for Jinja-heavy models
- `--output json`: machine-readable output
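
To show how a repeatable, comma-aware `--model` option could behave, here is a sketch using the standard library's `argparse`. The commit does not say which CLI framework `ol-dbt` uses, so `build_parser`, `expand_model_filters`, and the `text` default for `--output` are assumptions made for illustration only.

```python
import argparse


def build_parser():
    """Hypothetical argparse mirror of the `ol-dbt validate` options."""
    parser = argparse.ArgumentParser(prog="ol-dbt validate")
    # action="append" collects every -m/--model occurrence into a list.
    parser.add_argument("--model", "-m", action="append", default=None)
    parser.add_argument("--errors-only", action="store_true")
    parser.add_argument("--warn-select-star", action="store_true")
    parser.add_argument("--compiled-dir", default=None)
    # "text" as the default format is an assumption; only "json" is documented.
    parser.add_argument("--output", choices=["text", "json"], default="text")
    return parser


def expand_model_filters(values):
    """Flatten repeated and comma-separated --model values, keeping order."""
    names = []
    for value in values or []:
        for part in value.split(","):
            part = part.strip()
            if part and part not in names:
                names.append(part)
    return names
```

With this sketch, `-m a,b --model models/staging` expands to `["a", "b", "models/staging"]`; deciding whether an entry is a model name or a directory path would be a separate step.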
## `ol-dbt impact`
Column-level downstream impact analysis driven by git diffs:
- Reads `git diff [--cached] [base]` to find changed/removed SQL column aliases
- Traverses the manifest or parsed SQL to find all downstream models that
reference each changed column
- Reports broken references, risky usages, and models needing review
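
The downstream traversal can be pictured as a breadth-first walk over the model graph that follows a column through renames. The sketch below assumes two hypothetical inputs that the real tool would derive from the manifest or parsed SQL: `children` (model to direct downstream models) and `column_refs` (which upstream columns each downstream model uses, and which of its own output columns they feed).

```python
from collections import deque


def find_impacted(model, column, children, column_refs):
    """Return (downstream_model, upstream_model, upstream_column) references
    reachable from a changed column. All names here are illustrative.

    children:    {model: [direct downstream models]}
    column_refs: {(downstream, upstream): {upstream_col: [output cols it feeds]}}
    """
    impacted = []
    seen = {(model, column)}
    queue = deque([(model, column)])
    while queue:
        upstream, col = queue.popleft()
        for child in children.get(upstream, []):
            outputs = column_refs.get((child, upstream), {}).get(col)
            if outputs is None:
                continue  # this child does not reference the column
            impacted.append((child, upstream, col))
            # Follow the column under whatever name the child exposes it.
            for out_col in outputs:
                if (child, out_col) not in seen:
                    seen.add((child, out_col))
                    queue.append((child, out_col))
    return impacted
```

Classifying each hit as a broken reference, a risky usage, or a model needing review would sit on top of this walk and depends on whether the column was removed or only changed.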
## Jinja pre-processing
Regex-based Jinja stripping lets sqlglot parse the raw `.sql` files
without a running dbt environment:
- `{{ ref() }}` / `{{ source() }}` → stable SQL identifiers with reverse maps
- `{{ var() }}` and bare `{{ variable }}` → unquoted placeholders (avoids
the `''__jinja__''` doubled-quote bug when templates surround with SQL quotes)
- Block-level macro calls alone on a line → SQL comment
- Broken column expressions where a macro splits a SQL column boundary
(e.g. `{{ array_join('partial SQL', "path") }}`) → collapsed to
`__jinja__ as alias` via `_BROKEN_COL_RE` cleanup pass
Result: **588/588** dbt models in this project parse successfully.
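
A reduced sketch of the first two substitutions, using only the standard `re` module. The patterns, the `__ref__` / `__source__` identifier scheme, and the `strip_jinja` name are assumptions; the commit's actual regexes (including `_BROKEN_COL_RE`) are not reproduced here, and two-argument `ref()` calls or names containing dots are not handled.

```python
import re

# Hypothetical patterns; single-argument ref() and two-argument source() only.
_REF_RE = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
_SOURCE_RE = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)
_EXPR_RE = re.compile(r"\{\{.*?\}\}")


def strip_jinja(sql):
    """Replace Jinja expressions with parseable SQL and return a reverse map."""
    reverse = {}

    def _ref(match):
        ident = f"__ref__{match.group(1)}"
        reverse[ident] = ("ref", match.group(1))
        return ident

    def _source(match):
        ident = f"__source__{match.group(1)}__{match.group(2)}"
        reverse[ident] = ("source", match.group(1), match.group(2))
        return ident

    sql = _REF_RE.sub(_ref, sql)
    sql = _SOURCE_RE.sub(_source, sql)
    # Remaining expressions become an unquoted placeholder, so a template
    # already wrapped in SQL quotes stays a single valid string literal.
    sql = _EXPR_RE.sub("__jinja__", sql)
    return sql, reverse
```

The reverse map is what lets a later step translate a placeholder such as `__ref__stg_users` back to the model it stands for.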
## YAML registry improvements
- Parses both `models:` and `sources:` blocks for SELECT * resolution
- Rescues model definitions accidentally nested inside another model's
`columns:` list (YAML authoring error in `_stg_mitlearn_models.yml`)
- Filters out nested-model entries from the parent model's column list
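
The rescue step might look like the sketch below, which works on the plain dicts a YAML parser would return for a `models:` block. The heuristic that an entry carrying its own `columns` key is a misplaced model, and the name `rescue_nested_models`, are assumptions; the commit does not state how nested models are detected.

```python
def rescue_nested_models(models):
    """Promote model definitions nested inside another model's `columns:` list.

    Assumed heuristic: a columns entry that has its own `columns` key is
    treated as a misplaced model rather than a column.
    """
    rescued = []
    for model in models:
        columns = []
        nested = []
        for entry in model.get("columns") or []:
            if isinstance(entry, dict) and "columns" in entry:
                nested.append(entry)
            else:
                columns.append(entry)
        # The parent keeps only its real columns.
        rescued.append({**model, "columns": columns})
        # Nested models become top-level entries (recursively, in case of
        # deeper nesting).
        rescued.extend(rescue_nested_models(nested))
    return rescued
```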
## Tests
85 pytest tests covering all commands, lib modules, and edge cases.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent a1a24f7
18 files changed
Lines changed: 3331 additions & 1 deletion